Swift passing protocol variable to generic function - swift

Can someone explain why passing a protocol-typed variable to a generic function is an error in Swift?
protocol P {}
func f<T: P>(_: T) {}
func g(x: P) { f(x) } // Error
This, however, is not an error:
protocol P {}
func f(_: P) {}
func g(x: P) { f(x) }
I was just wondering what difference in the code generated by the compiler makes it reject the first example, while the generated code in the second case is good to go. Both seem to give the behavior I would expect.

It's because currently non-@objc protocols don't conform to themselves. Therefore P cannot satisfy the generic placeholder T : P, as P is not a type that conforms to itself.
However, in this particular example, where P doesn't have any static requirements, there's no fundamental limitation preventing P from conforming to itself (I explain this in more detail in the above linked Q&A). It's merely an implementation limitation.
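On Swift 5.7 and later the picture is slightly different: SE-0352 (implicitly opened existentials) lets the compiler open the existential at the call site and bind T to the underlying concrete type. P still does not conform to itself, but the call from the question type-checks. A minimal sketch, assuming a Swift 5.7+ toolchain; the label requirement and the Concrete type are made up for illustration:

```swift
protocol P {
    var label: String { get }
}

// Hypothetical conforming type, used only for this illustration.
struct Concrete: P {
    let label = "concrete"
}

// Generic function: T must be a concrete type that conforms to P.
func f<T: P>(_ value: T) -> String {
    return value.label
}

// `x` is an existential. On Swift 5.7+ the compiler opens it at the call
// site and binds T to the type stored inside the box.
func g(x: any P) -> String {
    return f(x)
}

let result = g(x: Concrete())
```

On older compilers the call inside g is rejected, as described above.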
What is the difference of the code generated by the compiler which makes it to reject the first example but in second case the generated code is good to go
Protocol-typed values (existentials) are implemented in a slightly different manner to generic-typed values constrained to a protocol.
A protocol-typed value P consists of:
An inline value buffer for the stored conforming value (currently 3 words in length, but is subject to change until ABI stability). If the value to store is more than 3 words in length, it's put into a heap allocated box, and a reference to this box is stored in the buffer.
A pointer to the conforming type's metadata.
A pointer to the protocol witness table for the conformance of the value to P, which lists the implementations to call for each of the protocol requirements.
On the flip side, a generic-typed value T where T : P consists of only the value buffer. The type metadata and witness table(s) are instead passed as implicit arguments to the generic function, and any member accesses or memory manipulations for values of type T can be done by consulting these arguments. Why? Because Swift's generics system ensures that two values of type T must be of the same type, so they must share the same conformance to the protocol constraint(s).
However this guarantee breaks down if we allow protocols to conform to themselves. Now, if T is a protocol type P, two values of type T could potentially have different underlying concrete types and therefore different conformances to P (so different protocol witness tables). We'd need to consult protocol witness tables on a per-value (rather than per-type) basis – just like we do with existentials.
So what we'd want is for generic-typed values to have the same layout as an existential of the protocol constraints. However this would make things pretty inefficient for the vast majority of the cases when the generic placeholder is not being satisfied by a protocol type, as values of type T would be carrying about redundant information.
The reason why @objc protocols are allowed to conform to themselves when they don't have static requirements is that they have a much simpler layout than non-@objc existentials: they just consist of a reference to the class instance, and protocol requirements are dispatched via objc_msgSend. This layout is shared with that of a value typed as a placeholder T constrained to the protocol, which is why it's supported.
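The per-type versus per-value distinction can be seen from the calling side. This is only a sketch with made-up types A and B and a made-up name() requirement: the generic function receives one conformance for the whole array, while each element of the existential array carries its own.

```swift
protocol P {
    func name() -> String
}

// Hypothetical conforming types, for illustration only.
struct A: P { func name() -> String { return "A" } }
struct B: P { func name() -> String { return "B" } }

// Generic: every element has the same type T, so one witness table,
// passed implicitly alongside T, serves all of them.
func namesGeneric<T: P>(_ values: [T]) -> [String] {
    return values.map { $0.name() }
}

// Existential: each element carries its own metadata and witness table,
// so the elements may have different concrete types.
func namesExistential(_ values: [any P]) -> [String] {
    return values.map { $0.name() }
}

let homogeneous = namesGeneric([A(), A()])
let heterogeneous = namesExistential([A(), B()])
```

Passing a mixed array such as [A(), B()] to namesGeneric is exactly the case the generic layout is not designed for, because there is no single T that both elements share.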

Related

Is there a penalty for changing a generic function's argument based on `MyProtocol` to using the existential `any MyProtocol` or `some MyProtocol`?

Normally in the discussion of things like some, they are referring to return types. This question is specifically about using any or some in argument lists.
As an example, in the Swift documentation for String, you have this initializer...
init<T>(_ value: T, radix: Int = 10, uppercase: Bool = false)
where T : BinaryInteger
In Swift 5.6, they introduced the any keyword to let us work with existential types more easily. With that change, I understand you theoretically can rewrite the above like so...
init(_ value: any BinaryInteger, radix: Int = 10, uppercase: Bool = false)
Of course there's also this version based on the some keyword, which also works...
init(_ value: some BinaryInteger, radix: Int = 10, uppercase: Bool = false)
My question is: which one makes the most sense? Is there a downside to using the existential type over the generic like that? What about the some version? I originally thought the generic version was best because the compiler can determine at compile time what's being passed to it, but then again, the same seems true of the existential any and even the some version, as neither compiles if you don't pass a BinaryInteger. I'm also not quite sure how to write a test to check this.
When passing a value of a protocol type P to a method in Swift, there are 4 possible spellings you can currently use:
1. func f<T: P>(_ value: T) ("generic")
2. func f(_ value: some P) ("opaque parameter")
3. func f(_ value: P) ("'bare' existential")
4. func f(_ value: any P) ("'explicit' existential")
Of these spellings, (1) and (2) are synonyms, and currently, (3) and (4) are synonyms. Between using (1) and (2), there is no difference, and between (3) and (4) there is currently no difference; but between (1)/(2) and (3)/(4), there is a difference.
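Side by side, the four spellings look like this. A small sketch assuming Swift 5.7 or later (for some in parameter position); the function names and the struct S are placeholders, not anything from the standard library:

```swift
protocol P {}
struct S: P {}

// (1) generic
func generic<T: P>(_ value: T) -> String { return "\(type(of: value))" }

// (2) opaque parameter, equivalent to (1)
func opaque(_ value: some P) -> String { return "\(type(of: value))" }

// (3) 'bare' existential
func bareExistential(_ value: P) -> String { return "\(type(of: value))" }

// (4) 'explicit' existential, currently equivalent to (3)
func explicitExistential(_ value: any P) -> String { return "\(type(of: value))" }

let names = [
    generic(S()),
    opaque(S()),
    bareExistential(S()),
    explicitExistential(S()),
]
```

All four can be called with a concrete S, and type(of:) recovers the dynamic type in each case; the differences are in what the function body knows statically and in how the argument is passed.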
f<T: P>(_: T) is the traditional way of taking a parameter of a concrete type T which is guaranteed to conform to protocol P. This way of taking a parameter:
Gives you access to the concrete type T both at compile time and at runtime, so you can perform operations on T itself
Has no overhead, as the type T is known at compile time, and the compiler knows the size and layout of the value and can set up the stack/registers appropriately; it can pass whatever parameter was given to it directly to the method
Can only be called when the type of the argument is statically known (at compile time); but as such, can be called with protocol types with Self- or associatedtype requirements
Introduced in SE-0341 (Opaque Parameter Declarations), the version of a method which takes some Protocol is exactly equivalent to the generic version spelled with angle brackets. I'll avoid repeating the content of the proposal, but the Introduction section spells out the desire for this syntax as a way to simplify the complexity of spelling generic parameters
f(_: P) is the traditional way of taking a parameter of an existential type which is guaranteed to conform to protocol P. This way of taking a parameter:
Does not give access to the concrete underlying type of the parameter at compile time; though this is accessible dynamically at runtime via type(of:)
Has runtime overhead both to pass an argument to the method, and when accessing the value inside of the method: because the type of the argument to the method might not be known statically (while the compiler still needs to know how to set up the stack and registers in order to call the method), the parameter must be boxed up in an "existential box", which has the interface of P and can dynamically pass methods along to the underlying concrete value. This both has a cost of allocating an additional "box" at runtime to hold the actual value inside of a consistently-sized and -laid-out container, as well as the cost of indirecting as method calls on the box must dynamically dispatch to the underlying type
Can be called whether or not the type of the argument is known statically; as such, before Swift 5.7 (SE-0309) it could not be used with protocol types with Self- or associatedtype requirements
Introduced in SE-0335 (Existential Any), the any keyword preceding a protocol type is a marker that helps indicate that the type is being used as an existential. It is exactly equivalent right now to using the bare name of the protocol (i.e. any P == P), but there has been some discussion about eventually using the bare name of the protocol to mean some P instead
So to address your specific example:
init(_ value: some BinaryInteger, radix: Int = 10, uppercase: Bool = false)
is exactly equivalent to the original
init<T>(_ value: T, radix: Int = 10, uppercase: Bool = false)
where T : BinaryInteger
while
init(_ value: any BinaryInteger, radix: Int = 10, uppercase: Bool = false)
is not. And, because BinaryInteger has associatedtype requirements, in Swift 5.6 you also cannot use the any version, as the existential type would not provide access to the underlying associated types. (If you try, you'll get the classic error: protocol 'BinaryInteger' can only be used as a generic constraint because it has Self or associated type requirements.) Swift 5.7 lifted this restriction with SE-0309, although the overhead and limitations of existentials described above still apply.
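To see the equivalence of the first two spellings in practice, here is a small sketch that wraps the String initializer from the question. The wrapper names hexGeneric and hexOpaque are invented for this example, and the some version assumes Swift 5.7 or later:

```swift
// Generic spelling with a where clause, mirroring the documented initializer.
func hexGeneric<T>(_ value: T) -> String where T: BinaryInteger {
    return String(value, radix: 16)
}

// Opaque-parameter spelling; the compiler treats it like the generic one.
func hexOpaque(_ value: some BinaryInteger) -> String {
    return String(value, radix: 16)
}

let fromInt = hexGeneric(255)
let fromUInt8 = hexOpaque(UInt8(255))
```

Both wrappers accept any concrete BinaryInteger type and forward it unchanged to the generic initializer.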
In general, generics are preferable to existential types when possible because of the lack of overhead and greater flexibility; but, they require knowing the static type of their parameter, which is not always possible.
Existentials are more accepting of input, but are significantly more limited in functionality and come at a cost.
Between preferring <T: P> and some P, or P and any P — the choice is currently subjective, but:
As part of SE-0335, it is currently planned that Swift 6 will require the use of the any keyword to denote existential protocol usage; the counterpart to using this is some, so if you want to start future-proofing your code and staying consistent, it might be a good idea to start migrating to any and some
Between opaque parameter syntax and generic syntax, the choice is up to you, but opaque parameters cannot currently cover all of the use cases that generics can, especially with more complicated constraints. Whether sticking with generics everywhere or preferring opaque parameters and using generics only when necessary is a code style choice that you'll need to make
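One case where the named generic parameter is still needed is when two parameters must share a type. A sketch with invented function names (areEqual, describePair), assuming Swift 5.7 or later:

```swift
// Both arguments must have the same type T; this needs a named parameter.
func areEqual<T: Equatable>(_ lhs: T, _ rhs: T) -> Bool {
    return lhs == rhs
}

// Each `some` introduces its own anonymous type parameter, so the two
// arguments may be of unrelated types.
func describePair(_ first: some CustomStringConvertible,
                  _ second: some CustomStringConvertible) -> String {
    return "\(first.description), \(second.description)"
}

let equalInts = areEqual(1, 1)
let mixedPair = describePair(1, "x")
```

Writing areEqual with two some Equatable parameters would not compile, because nothing would tie the two anonymous types together for ==.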

Relation between Existential Container and struct instance which conform protocol

I'm trying to understand how Swift finds a protocol method's implementation.
I know that Swift uses an existential container, a fixed-size piece of storage on the stack, to describe an instance of a struct in memory, and that it involves a Value Witness Table (VWT) and a Protocol Witness Table (PWT).
VWTs know how to manage the real value in an instance of a struct (its lifecycle), and PWTs know the implementation of the protocol's methods.
But I want to know the relation between an instance of a struct and the "existential container".
Does an instance of struct have a pointer which refers to an existential container?
How does an instance of a struct know its existential container?
Preface: I don't know how much background knowledge you have, so I might over-explain to make sure my answer is clear.
Also, I'm doing this to the best of my ability, from memory. I might mix up some details, but hopefully this answer can at least point you towards further reading.
See also:
https://stackoverflow.com/a/41490551/3141234
https://github.com/apple/swift/blob/main/docs/SIL.rst#id201
In Swift, protocols can be used "as a type", or as a generic constraint. The latter case looks like so:
protocol SomeProtocol {}
struct SomeConformerSmall: SomeProtocol {
// No ivars
}
struct SomeConformerBig: SomeProtocol {
let a, b, c, d, e, f, g: Int // Lots of ivars
}
func fooUsingGenerics<T: SomeProtocol>(_: T) {}
let smallObject = SomeConformerSmall()
let bigObject = SomeConformerBig()
fooUsingGenerics(smallObject)
fooUsingGenerics(bigObject)
The protocol is used as a constraint for type-checking at compile time, but nothing particularly special happens at runtime (for the most part). Most of the time, the compiler will produce monomorphized variants of the fooUsingGenerics function, as if you had defined fooUsingGenerics(_: SomeConformerSmall) or fooUsingGenerics(_: SomeConformerBig) to begin with.
When a protocol is "used like a type", it would look like this:
func fooUsingProtcolExistential(_: SomeProtocol) {}
fooUsingProtcolExistential(smallObject)
fooUsingProtcolExistential(bigObject)
As you see, this function can be called using both smallObject and bigObject. The problem is that these two objects have different sizes. This is a problem: how will the compiler know how much stack space is necessary to allocate for the arguments of this function, if the arguments can be different sizes? It must do something to help fooUsingProtcolExistential accommodate that.
Existential containers are the solution. When you pass a value where a protocol type is expected, the Swift compiler will generate code that automagically boxes that value into an "existential container" for you. As currently defined, an existential container for a single (non-class-bound) protocol is 5 words in size:
The first three words are inline storage for the value.
The next word is a pointer to the type metadata of the stored value.
The last word is a pointer to the Protocol Witness Table (more on this later).
When the value being stored fits within 3 words (e.g. SomeConformerSmall), the value is packed directly inline into that 3-word buffer. If the value is more than 3 words in size (e.g. SomeConformerBig), an ARC-managed box is allocated on the heap, and the value is copied into there. A pointer to this box is then stored in the first word of the inline buffer (the other 2 words are unused, IIRC).
This introduces a new issue: suppose that fooUsingProtcolExistential wanted to forward along its parameter to another function. How should it pass the EC? fooUsingProtcolExistential doesn't know whether the EC contains a value inline (in which case, passing the EC just entails copying its 5 words of memory), or heap-allocated (in which case, passing the EC also requires an ARC retain on that heap-allocated buffer).
To remedy this, the type metadata referenced by the EC contains a pointer to a Value Witness Table (VWT). Each VWT defines a standard set of function pointers that define how values of that type can be allocated, copied, destroyed, etc. Whenever a protocol existential needs to be manipulated in some way, the VWT defines exactly how to do so.
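Rather than relying on memory for the container size, it can be measured. The sketch below reuses the type names from this answer and uses MemoryLayout to compare the concrete types with the existential; it assumes a current toolchain, since the container layout is an implementation detail rather than a language guarantee:

```swift
protocol SomeProtocol {}

struct SomeConformerSmall: SomeProtocol {
    // No stored properties
}

struct SomeConformerBig: SomeProtocol {
    let a = 0, b = 0, c = 0, d = 0, e = 0, f = 0, g = 0  // seven Ints
}

// Size of one machine word (pointer-sized).
let word = MemoryLayout<UnsafeRawPointer>.size

// The concrete types have very different sizes...
let smallSize = MemoryLayout<SomeConformerSmall>.size
let bigSize = MemoryLayout<SomeConformerBig>.size

// ...but the existential container has one fixed size, whatever it holds:
// a 3-word inline buffer, a metadata pointer and a witness table pointer.
let containerSize = MemoryLayout<any SomeProtocol>.size

print("small: \(smallSize), big: \(bigSize), container: \(containerSize)")
```

The fixed container size is what lets fooUsingProtcolExistential reserve stack space for its argument without knowing which conformer it will receive.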
So now we have a constant-size container (which solves our heterogeneously-sized parameter passing problem), and a way to move the container around. What can we actually do with it?
Well at a minimum, values of this protocol type must at least define the required members (initializers, properties (stored or computed), functions and subscripts) that the protocol defines.
But each conforming type might implement these members in a different way. E.g. some struct might satisfy a method requirement by defining the method directly, but another class might satisfy it by inheriting the method from a superclass. Some might implement a property as a stored property, others as a computed property, etc.
Handling these incompatibilities is the primary purpose of the Protocol Witness Table. There's one of these tables per protocol conformance (e.g. one for SomeConformerSmall and one for SomeConformerBig). They contain a set of function pointers which point to the implementations of the protocol's requirements. While the pointed-to functions might be in different places, the PWT's layout is consistent for the protocol it conforms to. As a result, fooUsingProtcolExistential is able to look at the PWT of an EC, and use it to find the implementation of a protocol method, and call it.
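As a rough mental model only, the lookup can be imitated by hand in plain Swift. Everything below (GreeterWitnessTable, HandRolledContainer, callGreet) is a made-up analogue, not the runtime's real data structures: one table of function pointers per conformance, stored next to the boxed value.

```swift
protocol Greeter {
    func greet() -> String
}

struct English: Greeter { func greet() -> String { return "hello" } }
struct French: Greeter { func greet() -> String { return "bonjour" } }

// Hypothetical stand-in for a protocol witness table: one entry per
// protocol requirement, with the same shape for every conformer.
struct GreeterWitnessTable {
    let greet: (Any) -> String
}

// Hypothetical stand-in for an existential container: a boxed value
// plus the table describing how that value satisfies the protocol.
struct HandRolledContainer {
    let payload: Any
    let witnesses: GreeterWitnessTable
}

// One table per conformance.
let englishTable = GreeterWitnessTable(greet: { ($0 as! English).greet() })
let frenchTable = GreeterWitnessTable(greet: { ($0 as! French).greet() })

// The callee never names a concrete type; it only goes through the table.
func callGreet(_ container: HandRolledContainer) -> String {
    return container.witnesses.greet(container.payload)
}

let greetings = [
    HandRolledContainer(payload: English(), witnesses: englishTable),
    HandRolledContainer(payload: French(), witnesses: frenchTable),
].map(callGreet)
```

The compiler builds and passes the real tables automatically; the point of the sketch is only that the callee finds the implementation through the table that travels with the value.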
So in short:
An EC contains a value (inline or indirect), a pointer to the value's type metadata, and a pointer to a PWT
The type metadata points to a VWT
My understanding:
A struct doesn't know where the existential container, value witness table, or protocol witness table is; the compiler knows. Wherever they are needed, the compiler passes them along.

Differences generic protocol type parameter vs direct protocol type

This is my playground code:
protocol A {
init(someInt: Int)
}
func direct(a: A) {
// Doesn't work
let _ = A.init(someInt: 1)
}
func indirect<T: A>(a: T) {
// Works
let _ = T.init(someInt: 1)
}
struct B: A {
init(someInt: Int) {
}
}
let a: A = B(someInt: 0)
// Works
direct(a: a)
// Doesn't work
indirect(a: a)
It gives a compile time error when calling method indirect with argument a. So I understand <T: A> means some type that conforms to A. The type of my variable a is A and protocols do not conform to themselves, so OK, I understand the compile time error.
The same applies to the compile time error inside method direct. I understand it: a concrete conforming type needs to be inserted.
A compile time error also arises when trying to access a static property in direct.
I am wondering. Are there more differences in the 2 methods that are defined? I understand that I can call initializers and static properties from indirect and I can insert type A directly in direct and respectively, I can not do what the other can do. But is there something I missed?
The key confusion is that Swift has two concepts that are spelled the same, and so are often ambiguous. One of them is struct T: A {}, which means "T conforms to the protocol A," and the other is var a: A, which means "the type of variable a is the existential of A."
Conforming to a protocol does not change a type. T is still T. It just happens to conform to some rules.
An "existential" is a compiler-generated box that wraps up a protocol. It's necessary because types that conform to a protocol could be different sizes and different memory layouts. The existential is a box that gives anything that conforms to a protocol a consistent layout in memory. Existentials and protocols are related, but not the same thing.
Because an existential is a run-time box that might hold any type, there is some indirection involved, and that can introduce a performance impact and prevents certain optimizations.
Another common confusion is understanding what a type parameter means. In a function definition:
func f<T>(param: T) { ... }
This defines a family of functions f<T>() which are created at compile time based on what you pass as the type parameter. For example, when you call this function this way:
f(param: 1)
a new function is created at compile time called f<Int>(). That is a completely different function than f<String>(), or f<[Double]>(). Each one is its own function, and in principle is a complete copy of all the code in f(). (In practice, the optimizer is pretty smart and may eliminate some of that copying. And there are some other subtleties related to things that cross module boundaries. But this is a pretty decent way to think about what is going on.)
Since specialized versions of generic functions are created for each type that is passed, they can in theory be more optimized, since each version of the function will handle exactly one type. The trade-off is that they can add code-bloat. Do not assume "generics are faster than protocols." There are reasons that generics may be faster than protocols, but you have to actually look at the code generation and profile to know in any particular case.
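The "family of functions" view can be made visible by having the function report which T it was instantiated with. A tiny sketch; the return value is added here purely for illustration:

```swift
// Conceptually one function per T; the body can refer to T itself.
func f<T>(param: T) -> String {
    return "f<\(T.self)>"
}

let forInt = f(param: 1)         // T is inferred as Int
let forString = f(param: "one")  // T is inferred as String
```

Whether the compiler actually emits separate specialized copies is an optimization decision, as noted above; the type parameter is available to the body either way.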
So, walking through your examples:
func direct(a: A) {
// Doesn't work
let _ = A.init(someInt: 1)
}
A protocol (A) is just a set of rules that types must conform to. You can't construct "some unknown thing that conforms to those rules." How many bytes of memory would be allocated? What implementations would it provide to the rules?
func indirect<T: A>(a: T) {
// Works
let _ = T.init(someInt: 1)
}
In order to call this function, you must pass a type parameter, T, and that type must conform to A. When you call it with a specific type, the compiler will create a new copy of indirect that is specifically designed to work with the T you pass. Since we know that T has a proper init, we know the compiler will be able to write this code when it comes time to do so. But indirect is just a pattern for writing functions. It's not a function itself; not until you give it a T to work with.
let a: A = B(someInt: 0)
// Works
direct(a: a)
a is an existential wrapper around B. direct() expects an existential wrapper, so you can pass it.
// Doesn't work
indirect(a: a)
a is an existential wrapper around B. Existential wrappers do not conform to protocols. They require things that conform to protocols in order to create them (that's why they're called "existentials;" the fact that you created one proves that such a value actually exists). But they don't, themselves, conform to protocols. If they did, then you could do things like what you've tried to do in direct() and say "make a new instance of an existential wrapper without knowing exactly what's inside it." And there's no way to do that. Existential wrappers don't have their own method implementations.
There are cases where an existential could conform to its own protocol. As long as there are no init or static requirements, there actually isn't a problem in principle. But Swift can't currently handle that. Because it can't work for init/static, Swift currently forbids it in all cases.
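For completeness, here is a sketch of how each style can reach the initializer. The generic version uses T directly; the existential version cannot write A.init, but it can ask the boxed value for its dynamic type with type(of:) and call the requirement on that metatype. The function names makeGeneric and makeDynamic and the stored property on B are additions for this example:

```swift
protocol A {
    init(someInt: Int)
}

struct B: A {
    let someInt: Int
    init(someInt: Int) { self.someInt = someInt }
}

// Generic: T is a concrete type, so T.init is available statically.
func makeGeneric<T: A>(like _: T) -> T {
    return T.init(someInt: 1)
}

// Existential: the concrete type is only known at run time, so the
// initializer is reached through the value's dynamic metatype.
func makeDynamic(like a: any A) -> any A {
    return type(of: a).init(someInt: 1)
}

let fromGeneric = makeGeneric(like: B(someInt: 0))
let fromDynamic = makeDynamic(like: B(someInt: 0))
```

Note the difference in results: makeGeneric returns a B, while makeDynamic returns another existential that has to be cast before B's members can be used.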

What's the difference between using a generic where condition and specifying argument type? [duplicate]

What advantages are there to using generics with a where clause over specifying a protocol for an argument, as in the following function signatures?
func encode<T>(_ value: T) throws -> Data where T : Encodable {...}
func encode(value: Encodable) throws -> Data {...}
The first is a generic method that requires a concrete type that conforms to Encodable. That means for each call to encode with a different type, a completely new copy of the function may be created, optimized just for that concrete type. In some cases the compiler may remove some of these copies, but in principle encode<Int>() is a completely different function than encode<String>(). It's a (generic) system for creating functions at compile time.
In contrast, the second is a non-generic function that accepts a parameter of the "Encodable existential" type. An existential is a compiler-generated box that wraps some other type. In principle this means that the value will be copied into the box at run time before being passed, possibly requiring a heap allocation if it's too large for the box (again, it may not be because the compiler is very smart and can sometimes see that it's unnecessary).
This ambiguity between the name of the protocol and the name of the existential is addressed by the any keyword, introduced in Swift 5.6 (SE-0335). With it, the latter function can be spelled (note "any"):
func encode(value: any Encodable) throws -> Data {...}
The former might be faster. It might also take more space for all the copies of the function. (But see above about the compiler. Do not assume you know which of these will be faster in an actual, optimized build.)
The former provides a real, concrete type. That means it can be used for things that require a real, concrete type, such as calling a static method, or init. This means it can be used when the protocol has an associated type.
The latter is boxed into an existential, meaning it can be stored into heterogeneous collections. The former can only be put into collections of its particular concrete type.
So they're pretty different things, and each has its purpose.
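A short sketch of both forms in use, assuming Foundation's JSONEncoder and Swift 5.7 or later (the existential version relies on the compiler opening the box when it forwards the value to the generic encode(_:)). The function names are invented for this example:

```swift
import Foundation

// Generic: T is a concrete Encodable type, known at each call site.
func encodeGeneric<T>(_ value: T) throws -> Data where T: Encodable {
    return try JSONEncoder().encode(value)
}

// Existential: accepts any boxed Encodable value.
func encodeExistential(_ value: any Encodable) throws -> Data {
    return try JSONEncoder().encode(value)
}

let numbers: [Int] = [1, 2]
let words: [String] = ["a", "b"]

// Only the existential form can describe a heterogeneous collection.
let mixed: [any Encodable] = [numbers, words]
let json = mixed
    .compactMap { try? encodeExistential($0) }
    .map { String(decoding: $0, as: UTF8.self) }

let direct = (try? encodeGeneric(numbers)).map { String(decoding: $0, as: UTF8.self) }
```

The array mixed has no single concrete element type, so it could not be passed element-by-element to a generic function on compilers that lack existential opening.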
You can use multiple type constraints.
func encode<T>(encodable: T) -> Data where T: Encodable, T: Decodable {
...
}

In Swift, from a technical standpoint, why does the compiler care if a protocol can only be used as a generic constraint?

In Swift, from a technical standpoint, why does the compiler care if a protocol can only be used as a generic constraint?
Say I have:
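protocol Fooable {
associatedtype Bar: Equatable
func foo(bar: Bar)
}
extension Fooable {
// Default implementation: Bar is Equatable, so == is available here.
func foo(bar: Bar) {
_ = bar == bar
}
}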
Why can't I later declare a func that takes a Fooable object as an argument?
It seems to me that the compiler should only care that a given Fooable can be sent a message "foo" with an argument "bar" that is an Equatable and therefore responds to the message, "==".
I understand Swift is statically typed, but why should Swift even really care about type in this context, since the only thing that matters is whether or not a given message can be validly sent to an object?
I am trying to understand the why behind this, since I suspect there must be a good reason.
In your example above, if you wrote a function that takes a Fooable parameter, e.g.
func doSomething(with fooable:Fooable) {
fooable.foo(bar: ???) // what type is allowed to be passed here? It's not _any_ Equatable, it's the associated type Bar, which here would be...???
}
What type could be passed into fooable.foo(bar:)? It can't be any Equatable, it must be the specific associated type Bar.
Ultimately, it boils down to the problem that protocols that reference "Self" or have an associated type end up having different interfaces based on which concrete implementation has conformed (i.e. the specific type for Self, or the specific associated type). So these protocols can be considered as missing information about types and signatures needed to address them directly, but still serve as templates for conforming types and so can be used as generic constraints.
For example, the compiler would accept the function written like this:
func doSomething<T: Fooable>(with fooable:T, bar: T.Bar) {
fooable.foo(bar: bar)
}
In this scenario we aren't trying to address the Fooable protocol as a protocol. Instead, we are accepting any concrete type, T, that is itself constrained to conform to Fooable. But the compiler will know the exact concrete type of T each time you call the function, so it therefore will know the exact associated type Bar, and will know exactly what type may be passed as a parameter to fooable.foo(bar:)
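A complete version of that idea might look like the following. The two conforming types and the Bool return value are assumptions added so there is something to observe; the original protocol's foo returns nothing:

```swift
protocol Fooable {
    associatedtype Bar: Equatable
    func foo(bar: Bar) -> Bool
}

// Hypothetical conformer whose Bar is Int.
struct IntMatcher: Fooable {
    let expected: Int
    func foo(bar: Int) -> Bool { return bar == expected }
}

// Hypothetical conformer whose Bar is String.
struct StringMatcher: Fooable {
    let expected: String
    func foo(bar: String) -> Bool { return bar == expected }
}

// T is fixed at each call site, so T.Bar is a known type there.
func doSomething<T: Fooable>(with fooable: T, bar: T.Bar) -> Bool {
    return fooable.foo(bar: bar)
}

let intResult = doSomething(with: IntMatcher(expected: 3), bar: 3)
let stringResult = doSomething(with: StringMatcher(expected: "a"), bar: "b")
```

Calling doSomething(with: IntMatcher(expected: 3), bar: "b") is rejected at compile time, because T.Bar is Int for that call.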
More Details
It may help to think of "generic protocols" — i.e. protocols that have an associated type, including possibly the type Self — as something a little different than normal protocols. Normal protocols define messaging requirements, as you say, and can be used to abstract away a specific implementation and address any conforming type as the protocol itself.
Generic protocols are better understood as part of the generics system in Swift rather than as normal protocols. You can't cast to a generic protocol like Equatable (no if let equatable = something as? Equatable) because as part of the generic system, Equatable must be specialized and understood at compile time. More on this below.
What you do get from generic protocols that is the same as normal protocols is the concept of a contract that conforming types must adhere to. By saying associatedtype Bar: Equatable you are getting a contract that the type Bar will provide a way for you to call func ==(left: Bar, right: Bar) -> Bool. It requires conforming types to provide a certain interface.
The difference between generic protocols and normal protocols is that you can cast to and message normal protocols as the protocol type (not a concrete type), but you must always address conformers to generic protocols by their concrete type (just like all generics). This means normal protocols are a runtime feature (for dynamic casting) as well as a compile time feature (for type checking). But generic protocols are only a compile time feature (no dynamic casting).
Why can't we say var a:Equatable? Well, let's dig in a little. Equatable means that one instance of a specific type can be compared for equality with another instance of the same type. I.e. func ==(left:A, right:A) -> Bool. If Equatable were a normal protocol, you would say something more like: func ==(left:Equatable, right:Equatable) -> Bool. But if you think about that, it doesn't make sense. String is Equatable with other Strings, Int is Equatable with other Ints, but that doesn't in any way mean Strings are Equatable with Ints. If the Equatable protocol just required the implementation of func ==(left:Equatable, right:Equatable) -> Bool for your type, how could you possibly write that function to compare your type to every other possible Equatable type now and in the future?
Since that's not possible, Equatable requires only that you implement == for two instances of Self type. So if Foo: Equatable, then you must only define == for two instances of Foo.
Now let's look at the problem with var a:Equatable. This seems to make sense at first, but in fact, it doesn't:
var a: Equatable = "A String"
var b: Equatable = 100
let equal = a == b
Since both a and b are Equatable, we should be able to compare them for equality, right? But in fact, a's equality implementation is limited to comparing a String to a String and b's equality implementation is limited to comparing an Int to an Int. So it's best to think of generic protocols more like other generics to realize that Equatable<String> is not the same protocol as Equatable<Int> even though they are both supposedly just "Equatable".
As for why you can have a dictionary of type [AnyHashable: Any], but not [Hashable: Any], this should now be clearer. The Hashable protocol inherits from Equatable, so it is a "generic protocol". That means for any Hashable type, there must be a func ==(left: Self, right: Self) -> Bool. Dictionaries use both the hashValue and equality comparisons to store and retrieve keys. But how can a dictionary compare a String key and an Int key for equality, even if they both conform to Hashable / Equatable? It can't. Therefore, you need to wrap your keys in a special "type eraser" called AnyHashable. How type erasers work is too detailed for the scope of this question, but suffice it to say that a type eraser like AnyHashable gets instantiated with some type T: Hashable, then forwards requests for a hashValue to its wrapped type, and implements ==(left: AnyHashable, right: AnyHashable) -> Bool in a way that also uses the wrapped type's equality implementation. I think this gist should give a great illustration of how you can implement an "AnyEquatable" type eraser.
https://gist.github.com/JadenGeller/f0d05a4699ddd477a2c1
Moving onward, because AnyHashable is a single concrete type (not a generic type like the Hashable protocol is), you can use it to define a dictionary. Because every single instance of AnyHashable can wrap a different Hashable type (String, Int, whatever), and can also produce a hashValue and be checked for equality with any other AnyHashable instance, it's exactly what a dictionary needs for its keys.
So, in a sense, type erasers like AnyHashable are a sort of implementation trick that turns a generic protocol into something like a normal protocol. By erasing / throwing away the generic associated type information, but keeping the required methods, you can effectively abstract the specific conformance of Hashable into general type "AnyHashable" that can wrap anything Hashable, but be used in non-generic circumstances.
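A brief sketch of AnyHashable doing that job with the standard library types. The keys and values are arbitrary examples:

```swift
// Keys of different concrete types are wrapped in AnyHashable implicitly.
let mixedKeys: [AnyHashable: Any] = ["name": 1, 2: "two"]

// Lookups wrap the key the same way before hashing and comparing.
let byString = mixedKeys["name"] as? Int
let byInt = mixedKeys[2] as? String

// Two wrappers around different underlying types compare as unequal
// instead of failing to compile.
let stringVersusInt = AnyHashable("2") == AnyHashable(2)
```

Because AnyHashable is a single concrete type with its own == and hashing, the dictionary never has to compare a String with an Int directly.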
This may all come together if you review that gist for creating an implementation of "AnyEquatable": https://gist.github.com/JadenGeller/f0d05a4699ddd477a2c1 and then go back and see how you can now turn this impossible / non-compiling code from earlier:
var a: Equatable = "A String"
var b: Equatable = 100
let equal = a == b
Into this conceptually similar, but actually valid code:
var a: AnyEquatable = AnyEquatable("A String")
var b: AnyEquatable = AnyEquatable(100)
let equal = a == b
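For readers who don't want to follow the link, here is one minimal way such a wrapper could be written. This is a sketch of the general type-erasure technique and not necessarily how the linked gist implements it. (Newer compilers, Swift 5.7 and later, accept any Equatable as a type, but == still cannot be applied to two such values, so a wrapper like this remains useful.)

```swift
// A hypothetical, minimal type eraser for Equatable values.
struct AnyEquatable: Equatable {
    private let value: Any
    private let isEqual: (Any) -> Bool

    init<T: Equatable>(_ value: T) {
        self.value = value
        // Captures T: equal only if the other value is also a T and
        // compares equal using T's own ==.
        self.isEqual = { other in (other as? T) == value }
    }

    static func == (lhs: AnyEquatable, rhs: AnyEquatable) -> Bool {
        return lhs.isEqual(rhs.value)
    }
}

let a = AnyEquatable("A String")
let b = AnyEquatable(100)
let c = AnyEquatable(100)

let differentTypes = a == b
let sameTypeSameValue = b == c
```

Values of different underlying types simply compare as unequal, which is the behavior the earlier non-compiling example was reaching for.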