How does closure store references to constants or values

How does closure store references to constants or values - swift

The description comes from the swift office document
Closures can capture and store references to any constants and variables from the context in which they are defined. This is known as closing over those constants and variables.
I don't fully understand the store references part. Does it means It creates some sort of "Pointer" to a variable? If the value changed the "de-referenced pointer" also changed to the new value. I think Swift has no pointer concept. I borrowed it to express my understandings.
Would be very nice that someone can provide a simple code lines to explain how the closure stores reference to the constants/variables.

Does it means It creates some sort of "Pointer" to a variable? If the
value changed the "de-referenced pointer" also changed to the new
value.
Yes.
I think Swift has no pointer concept.
It most certainly does, in the (implicit) form of reference types (classes), and the (explicit) form of UnsafePointer
Here's an example of a variable being captured in a closure. This all happens to be single threaded, but you could have this closure be dispatched by grand central dispatch. x is captured so that it exists for the entire life of the closure, even if the closure exists after x's declaring scope (f()) exits.
func f() {
var x = 0 //captured variable
_ = {
x = 5 //captures x from parent scope, and modifies it
}()
print(x) //prints 5, as x was modified by the closure
}
f()
I wrote another answer that explains the relationships between functions, closures, and other terms. It's worth reading, IMO.

Related

Does passing struct into a function also using copy-on-write for optimisation, or actual copy does happen immediately? [duplicate]

In Swift, when you pass a value type, say an Array to a function. A copy of the array is made for the function to use.
However the documentation at https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/ClassesAndStructures.html#//apple_ref/doc/uid/TP40014097-CH13-XID_134 also says:
The description above refers to the “copying” of strings, arrays, and
dictionaries. The behavior you see in your code will always be as if a
copy took place. However, Swift only performs an actual copy behind
the scenes when it is absolutely necessary to do so. Swift manages all
value copying to ensure optimal performance, and you should not avoid
assignment to try to preempt this optimization.
So does it mean that the copying actually only takes placed when the passed value type is modified?
Is there a way to demonstrate that this is actually the underlying behavior?
Why this is important? If I create a large immutable array and want to pass it in from function to function, I certainly do not want to keep making copies of it. Should I just use NSArrray in this case or would the Swift Array work fine as long as I do not try to manipulate the passed in Array?
Now as long as I do not explicitly make the variables in the function editable by using var or inout, then the function can not modify the array anyway. So does it still make a copy? Granted that another thread can modify the original array elsewhere (only if it is mutable), making a copy at the moment the function is called necessary (but only if the array passed in is mutable). So if the original array is immutable and the function is not using var or inout, there is no point in Swift creating a copy. Right? So what does Apple mean by the phrase above?

TL;DR:
So does it mean that the copying actually only takes placed when the passed value type is modified?
Yes!
Is there a way to demonstrate that this is actually the underlying behavior?
See the first example in the section on the copy-on-write optimization.
Should I just use NSArrray in this case or would the Swift Array work fine
as long as I do not try to manipulate the passed in Array?
If you pass your array as inout, then you'll have a pass-by-reference semantics,
hence obviously avoiding unnecessary copies.
If you pass your array as a normal parameter,
then the copy-on-write optimization will kick in and you shouldn't notice any performance drop
while still benefiting from more type safety that what you'd get with a NSArray.
Now as long as I do not explicitly make the variables in the function editable
by using var or inout, then the function can not modify the array anyway.
So does it still make a copy?
You will get a "copy", in the abstract sense.
In reality, the underlying storage will be shared, thanks to the copy-on-write mechanism,
hence avoiding unnecessary copies.
If the original array is immutable and the function is not using var or inout,
there is no point in Swift creating a copy. Right?
Exactly, hence the copy-on-write mechanism.
So what does Apple mean by the phrase above?
Essentially, Apple means that you shouldn't worry about the "cost" of copying value types,
as Swift optimizes it for you behind the scene.
Instead, you should just think about the semantics of value types,
which is that get a copy as soon as you assign or use them as parameters.
What's actually generated by Swift's compiler is the Swift's compiler business.
Value types semantics
Swift does indeed treat arrays as value types (as opposed to reference types),
along with structures, enumerations and most other built-in types
(i.e. those that are part of the standard library and not Foundation).
At the memory level, these types are actually immutable plain old data objects (POD),
which enables interesting optimizations.
Indeed, they are typically allocated on the stack rather than the heap [1],
(https://en.wikipedia.org/wiki/Stack-based_memory_allocation).
This allows the CPU to very efficiently manage them,
and to automatically deallocate their memory as soon as the function exits [2],
without the need for any garbage collection strategy.
Values are copied whenever assigned or passed as a function.
This semantics has various advantages,
such as avoiding the creation of unintended aliases,
but also as making it easier for the compiler to guarantee the lifetime of values
stored in a another object or captured by a closure.
We can think about how hard it can be to manage good old C pointers to understand why.
One may think it's an ill-conceived strategy,
as it involves copying every single time a variable is assigned or a function is called.
But as counterintuitive it may be,
copying small types is usually quite cheap if not cheaper than passing a reference.
After all, a pointer is usually the same size as an integer...
Concerns are however legitimate for large collections (i.e. arrays, sets and dictionaries),
and very large structures to a lesser extent [3].
But the compiler has has a trick to handle these, namely copy-on-write (see later).
What about mutating
Structures can define mutating methods,
which are allowed to mutate the fields of the structure.
This doesn't contradict the fact that value types are nothing more than immutable PODs,
as in fact calling a mutating method is merely a huge syntactic sugar
for reassigning a variable to a brand new value that's identical to the previous ones,
except for the fields that were mutated.
The following example illustrates this semantical equivalence:
struct S {
var foo: Int
var bar: Int
mutating func modify() {
foo = bar
}
}
var s1 = S(foo: 0, bar: 10)
s1.modify()
// The two lines above do the same as the two lines below:
var s2 = S(foo: 0, bar: 10)
s2 = S(foo: s2.bar, bar: s2.bar)
Reference types semantics
Unlike value types, reference types are essentially pointers to the heap at the memory level.
Their semantics is closer to what we would get in reference-based languages,
such as Java, Python or Javascript.
This means they do not get copied when assigned or passed to a function, their address is.
Because the CPU is no longer able to manage the memory of these objects automatically,
Swift uses a reference counter to handle garbage collection behind the scenes
(https://en.wikipedia.org/wiki/Reference_counting).
Such semantics has the obvious advantage to avoid copies,
as everything is assigned or passed by reference.
The drawback is the danger of unintended aliases,
as in almost any other reference-based language.
What about inout
An inout parameter is nothing more than a read-write pointer to the expected type.
In the case of value types, it means the function won't get a copy of the value,
but a pointer to such values,
so mutations inside the function will affect the value parameter (hence the inout keyword).
In other terms, this gives value types parameters a reference semantics in the context of the function:
func f(x: inout [Int]) {
x.append(12)
}
var a = [0]
f(x: &a)
// Prints '[0, 12]'
print(a)
In the case of reference types, it will make the reference itself mutable,
pretty much as if the passed argument was a the address of the address of the object:
func f(x: inout NSArray) {
x = [12]
}
var a: NSArray = [0]
f(x: &a)
// Prints '(12)'
print(a)
Copy-on-write
Copy-on-write (https://en.wikipedia.org/wiki/Copy-on-write) is an optimization technique that
can avoid unnecessary copies of mutable variables,
which is implemented on all Swift's built-in collections (i.e. array, sets and dictionaries).
When you assign an array (or pass it to a function),
Swift doesn't make a copy of the said array and actually uses a reference instead.
The copy will take place as soon as the your second array is mutated.
This behavior can be demonstrated with the following snippet (Swift 4.1):
let array1 = [1, 2, 3]
var array2 = array1
// Will print the same address twice.
array1.withUnsafeBytes { print($0.baseAddress!) }
array2.withUnsafeBytes { print($0.baseAddress!) }
array2[0] = 1
// Will print a different address.
array2.withUnsafeBytes { print($0.baseAddress!) }
Indeed, array2 doesn't get a copy of array1 immediately,
as shown by the fact it points to the same address.
Instead, the copy is triggered by the mutation of array2.
This optimization also happens deeper in the structure,
meaning that if for instance your collection is made of other collections,
the latter will also benefit from the copy-on-write mechanism,
as demonstrated by the following snippet (Swift 4.1):
var array1 = [[1, 2], [3, 4]]
var array2 = array1
// Will print the same address twice.
array1[1].withUnsafeBytes { print($0.baseAddress!) }
array2[1].withUnsafeBytes { print($0.baseAddress!) }
array2[0] = []
// Will print the same address as before.
array2[1].withUnsafeBytes { print($0.baseAddress!) }
Replicating copy-on-write
It is in fact rather easy to implement the copy-on-write mechanism in Swift,
as some of the its reference counter API is exposed to the user.
The trick consists of wrapping a reference (e.g. a class instance) within a structure,
and to check whether that reference is uniquely referenced before mutating it.
When that's the case, the wrapped value can be safely mutated,
otherwise it should be copied:
final class Wrapped<T> {
init(value: T) { self.value = value }
var value: T
}
struct CopyOnWrite<T> {
init(value: T) { self.wrapped = Wrapped(value: value) }
var wrapped: Wrapped<T>
var value: T {
get { return wrapped.value }
set {
if isKnownUniquelyReferenced(&wrapped) {
wrapped.value = newValue
} else {
wrapped = Wrapped(value: newValue)
}
}
}
}
var a = CopyOnWrite(value: SomeLargeObject())
// This line doesn't copy anything.
var b = a
However, there is an import caveat here!
Reading the documentation for isKnownUniquelyReferenced we get this warning:
If the instance passed as object is being accessed by multiple threads simultaneously,
this function may still return true.
Therefore, you must only call this function from mutating methods
with appropriate thread synchronization.
This means the implementation presented above isn't thread safe,
as we may encounter situations where it'd wrongly assumes the wrapped object can be safely mutated,
while in fact such mutation would break invariant in another thread.
Yet this doesn't mean Swift's copy-on-write is inherently flawed in multithreaded programs.
The key is to understand what "accessed by multiple threads simultaneously" really means.
In our example, this would happen if the same instance of CopyOnWrite was shared across multiple threads,
for instance as part of a shared global variable.
The wrapped object would then have a thread safe copy-on-write semantics,
but the instance holding it would be subject to data race.
The reason is that Swift must establish unique ownership
to properly evaluate isKnownUniquelyReferenced [4],
which it can't do if the owner of the instance is itself shared across multiple threads.
Value types and multithreading
It is Swift's intention to alleviate the burden of the programmer
when dealing with multithreaded environments, as stated on Apple's blog
(https://developer.apple.com/swift/blog/?id=10):
One of the primary reasons to choose value types over reference types
is the ability to more easily reason about your code.
If you always get a unique, copied instance,
you can trust that no other part of your app is changing the data under the covers.
This is especially helpful in multi-threaded environments
where a different thread could alter your data out from under you.
This can create nasty bugs that are extremely hard to debug.
Ultimately, the copy-on-write mechanism is a resource management optimization that,
like any other optimization technique,
one shouldn't think about when writing code [5].
Instead, one should think in more abstract terms
and consider values to be effectively copied when assigned or passed as arguments.
[1]
This holds only for values used as local variables.
Values used as fields of a reference type (e.g. a class) are also stored in the heap.
[2]
One could get confirmation of that by checking the LLVM byte code that's produced
when dealing with value types rather than reference types,
but the Swift compiler being very eager to perform constant propagation,
building a minimal example is a bit tricky.
[3]
Swift doesn't allow structures to reference themselves,
as the compiler would be unable to compute the size of such type statically.
Therefore, it is not very realistic to think of a structure that is so large
that copying it would become a legitimate concern.
[4]
This is, by the way, the reason why isKnownUniquelyReferenced accepts an inout parameter,
as it's currently Swift's way to establish ownership.
[5]
Although passing copies of value-type instances should be safe,
there's a open issue that suggests some problems with the current implementation
(https://bugs.swift.org/browse/SR-6543).

I don't know if that's the same for every value type in Swift, but for Arrays I'm pretty sure it's a copy-on-write, so it doesn't copy it unless you modify it, and as you said if you pass it around as a constant you don't run that risk anyway.
p.s. In Swift 1.2 there are new APIs you can use to implement copy-on-write on your own value-types too

Cannot retrieve associated object [duplicate]

The code below can be run in a Swift Playground:
import UIKit
func aaa(_ key: UnsafeRawPointer!, _ value: Any! = nil) {
print(key)
}
func bbb(_ key: UnsafeRawPointer!) {
print(key)
}
class A {
var key = "aaa"
}
let a = A()
aaa(&a.key)
bbb(&a.key)
Here's the result printed on my mac:
0x00007fff5dce9248
0x00007fff5dce9220
Why the results of two prints differs? What's more interesting, when I change the function signature of bbb to make it the same with aaa, the result of two prints are the same. And if I use a global var instead of a.key in these two function calls, the result of two prints are the same. Does anyone knows why this strange behavior happens?

Why the results of two prints differs?
Because for each function call, Swift is creating a temporary variable initialised to the value returned by a.key's getter. Each function is called with a pointer to their given temporary variable. Therefore the pointer values will likely not be the same – as they refer to different variables.
The reason why temporary variables are used here is because A is a non-final class, and can therefore have its getters and setters of key overridden by subclasses (which could well re-implement it as a computed property).
Therefore in an un-optimised build, the compiler cannot just pass the address of key directly to the function, but instead has to rely on calling the getter (although in an optimised build, this behaviour can change completely).
You'll note that if you mark key as final, you should now get consistent pointer values in both functions:
class A {
final var key = "aaa"
}
var a = A()
aaa(&a.key) // 0x0000000100a0abe0
bbb(&a.key) // 0x0000000100a0abe0
Because now the address of key can just be directly passed to the functions, bypassing its getter entirely.
It's worth noting however that, in general, you should not rely on this behaviour. The values of the pointers you get within the functions are a pure implementation detail and are not guaranteed to be stable. The compiler is free to call the functions however it wishes, only promising you that the pointers you get will be valid for the duration of the call, and will have pointees initialised to the expected values (and if mutable, any changes you make to the pointees will be seen by the caller).
The only exception to this rule is the passing of pointers to global and static stored variables. Swift does guarantee that the pointer values you get will be stable and unique for that particular variable. From the Swift team's blog post on Interacting with C Pointers (emphasis mine):
However, interaction with C pointers is inherently
unsafe compared to your other Swift code, so care must be taken. In
particular:
These conversions cannot safely be used if the callee
saves the pointer value for use after it returns. The pointer that
results from these conversions is only guaranteed to be valid for the
duration of a call. Even if you pass the same variable, array, or
string as multiple pointer arguments, you could receive a different
pointer each time. An exception to this is global or static stored
variables. You can safely use the address of a global variable as a
persistent unique pointer value, e.g.: as a KVO context parameter.
Therefore if you made key a static stored property of A or just a global stored variable, you are guaranteed to the get same pointer value in both function calls.
Changing the function signature
When I change the function signature of bbb to make it the same with aaa, the result of two prints are the same
This appears to be an optimisation thing, as I can only reproduce it in -O builds and playgrounds. In an un-optimised build, the addition or removal of an extra parameter has no effect.
(Although it's worth noting that you should not test Swift behaviour in playgrounds as they are not real Swift environments, and can exhibit different runtime behaviour to code compiled with swiftc)
The cause of this behaviour is merely a coincidence – the second temporary variable is able to reside at the same address as the first (after the first is deallocated). When you add an extra parameter to aaa, a new variable will be allocated 'between' them to hold the value of the parameter to pass, preventing them from sharing the same address.
The same address isn't observable in un-optimised builds due to the intermediate load of a in order to call the getter for the value of a.key. As an optimisation, the compiler is able to inline the value of a.key to the call-site if it has a property initialiser with a constant expression, removing the need for this intermediate load.
Therefore if you give a.key a non-determininstic value, e.g var key = arc4random(), then you should once again observe different pointer values, as the value of a.key can no longer be inlined.
But regardless of the cause, this is a perfect example of how the pointer values for variables (which are not global or static stored variables) are not to be relied on – as the value you get can completely change depending on factors such as optimisation level and parameter count.
inout & UnsafeMutable(Raw)Pointer
Regarding your comment:
But since withUnsafePointer(to:_:) always has the correct behavior I want (in fact it should, otherwise this function is of no use), and it also has an inout parameter. So I assume there are implementation difference between these functions with inout parameters.
The compiler treats an inout parameter in a slightly different way to an UnsafeRawPointer parameter. This is because you can mutate the value of an inout argument in the function call, but you cannot mutate the pointee of an UnsafeRawPointer.
In order to make any mutations to the value of the inout argument visible to the caller, the compiler generally has two options:
Make a temporary variable initialised to the value returned by the variable's getter. Call the function with a pointer to this variable, and once the function has returned, call the variable's setter with the (possibly mutated) value of the temporary variable.
If it's addressable, simply call the function with a direct pointer to the variable.
As said above, the compiler cannot use the second option for stored properties that aren't known to be final (but this can change with optimisation). However, always relying on the first option can be potentially expensive for large values, as they'll have to be copied. This is especially detrimental for value types with copy-on-write behaviour, as they depend on being unique in order to perform direct mutations to their underlying buffer – a temporary copy violates this.
To solve this problem, Swift implements a special accessor – called materializeForSet. This accessor allows the callee to either provide the caller with a direct pointer to the given variable if it's addressable, or otherwise will return a pointer to a temporary buffer containing a copy of the variable, which will need to be written back to the setter after it has been used.
The former is the behaviour you're seeing with inout – you're getting a direct pointer to a.key back from materializeForSet, therefore the pointer values you get in both function calls are the same.
However, materializeForSet is only used for function parameters that require write-back, which explains why it's not used for UnsafeRawPointer. If you make the function parameters of aaa and bbb take UnsafeMutable(Raw)Pointers (which do require write-back), you should observe the same pointer values again.
func aaa(_ key: UnsafeMutableRawPointer) {
print(key)
}
func bbb(_ key: UnsafeMutableRawPointer) {
print(key)
}
class A {
var key = "aaa"
}
var a = A()
// will use materializeForSet to get a direct pointer to a.key
aaa(&a.key) // 0x0000000100b00580
bbb(&a.key) // 0x0000000100b00580
But again, as said above, this behaviour is not to be relied upon for variables that are not global or static.

Why UnsafeRawPointer shows different result when function signatures differs in Swift?

The code below can be run in a Swift Playground:
import UIKit
func aaa(_ key: UnsafeRawPointer!, _ value: Any! = nil) {
print(key)
}
func bbb(_ key: UnsafeRawPointer!) {
print(key)
}
class A {
var key = "aaa"
}
let a = A()
aaa(&a.key)
bbb(&a.key)
Here's the result printed on my mac:
0x00007fff5dce9248
0x00007fff5dce9220
Why the results of two prints differs? What's more interesting, when I change the function signature of bbb to make it the same with aaa, the result of two prints are the same. And if I use a global var instead of a.key in these two function calls, the result of two prints are the same. Does anyone knows why this strange behavior happens?

Why the results of two prints differs?
Because for each function call, Swift is creating a temporary variable initialised to the value returned by a.key's getter. Each function is called with a pointer to their given temporary variable. Therefore the pointer values will likely not be the same – as they refer to different variables.
The reason why temporary variables are used here is because A is a non-final class, and can therefore have its getters and setters of key overridden by subclasses (which could well re-implement it as a computed property).
Therefore in an un-optimised build, the compiler cannot just pass the address of key directly to the function, but instead has to rely on calling the getter (although in an optimised build, this behaviour can change completely).
You'll note that if you mark key as final, you should now get consistent pointer values in both functions:
class A {
final var key = "aaa"
}
var a = A()
aaa(&a.key) // 0x0000000100a0abe0
bbb(&a.key) // 0x0000000100a0abe0
Because now the address of key can just be directly passed to the functions, bypassing its getter entirely.
It's worth noting however that, in general, you should not rely on this behaviour. The values of the pointers you get within the functions are a pure implementation detail and are not guaranteed to be stable. The compiler is free to call the functions however it wishes, only promising you that the pointers you get will be valid for the duration of the call, and will have pointees initialised to the expected values (and if mutable, any changes you make to the pointees will be seen by the caller).
The only exception to this rule is the passing of pointers to global and static stored variables. Swift does guarantee that the pointer values you get will be stable and unique for that particular variable. From the Swift team's blog post on Interacting with C Pointers (emphasis mine):
However, interaction with C pointers is inherently
unsafe compared to your other Swift code, so care must be taken. In
particular:
These conversions cannot safely be used if the callee
saves the pointer value for use after it returns. The pointer that
results from these conversions is only guaranteed to be valid for the
duration of a call. Even if you pass the same variable, array, or
string as multiple pointer arguments, you could receive a different
pointer each time. An exception to this is global or static stored
variables. You can safely use the address of a global variable as a
persistent unique pointer value, e.g.: as a KVO context parameter.
Therefore if you made key a static stored property of A or just a global stored variable, you are guaranteed to the get same pointer value in both function calls.
Changing the function signature
When I change the function signature of bbb to make it the same with aaa, the result of two prints are the same
This appears to be an optimisation thing, as I can only reproduce it in -O builds and playgrounds. In an un-optimised build, the addition or removal of an extra parameter has no effect.
(Although it's worth noting that you should not test Swift behaviour in playgrounds as they are not real Swift environments, and can exhibit different runtime behaviour to code compiled with swiftc)
The cause of this behaviour is merely a coincidence – the second temporary variable is able to reside at the same address as the first (after the first is deallocated). When you add an extra parameter to aaa, a new variable will be allocated 'between' them to hold the value of the parameter to pass, preventing them from sharing the same address.
The same address isn't observable in un-optimised builds due to the intermediate load of a in order to call the getter for the value of a.key. As an optimisation, the compiler is able to inline the value of a.key to the call-site if it has a property initialiser with a constant expression, removing the need for this intermediate load.
Therefore if you give a.key a non-determininstic value, e.g var key = arc4random(), then you should once again observe different pointer values, as the value of a.key can no longer be inlined.
But regardless of the cause, this is a perfect example of how the pointer values for variables (which are not global or static stored variables) are not to be relied on – as the value you get can completely change depending on factors such as optimisation level and parameter count.
inout & UnsafeMutable(Raw)Pointer
Regarding your comment:
But since withUnsafePointer(to:_:) always has the correct behavior I want (in fact it should, otherwise this function is of no use), and it also has an inout parameter. So I assume there are implementation difference between these functions with inout parameters.
The compiler treats an inout parameter in a slightly different way to an UnsafeRawPointer parameter. This is because you can mutate the value of an inout argument in the function call, but you cannot mutate the pointee of an UnsafeRawPointer.
In order to make any mutations to the value of the inout argument visible to the caller, the compiler generally has two options:
Make a temporary variable initialised to the value returned by the variable's getter. Call the function with a pointer to this variable, and once the function has returned, call the variable's setter with the (possibly mutated) value of the temporary variable.
If it's addressable, simply call the function with a direct pointer to the variable.
As said above, the compiler cannot use the second option for stored properties that aren't known to be final (but this can change with optimisation). However, always relying on the first option can be potentially expensive for large values, as they'll have to be copied. This is especially detrimental for value types with copy-on-write behaviour, as they depend on being unique in order to perform direct mutations to their underlying buffer – a temporary copy violates this.
To solve this problem, Swift implements a special accessor – called materializeForSet. This accessor allows the callee to either provide the caller with a direct pointer to the given variable if it's addressable, or otherwise will return a pointer to a temporary buffer containing a copy of the variable, which will need to be written back to the setter after it has been used.
The former is the behaviour you're seeing with inout – you're getting a direct pointer to a.key back from materializeForSet, therefore the pointer values you get in both function calls are the same.
However, materializeForSet is only used for function parameters that require write-back, which explains why it's not used for UnsafeRawPointer. If you make the function parameters of aaa and bbb take UnsafeMutable(Raw)Pointers (which do require write-back), you should observe the same pointer values again.
func aaa(_ key: UnsafeMutableRawPointer) {
print(key)
}
func bbb(_ key: UnsafeMutableRawPointer) {
print(key)
}
class A {
var key = "aaa"
}
var a = A()
// will use materializeForSet to get a direct pointer to a.key
aaa(&a.key) // 0x0000000100b00580
bbb(&a.key) // 0x0000000100b00580
But again, as said above, this behaviour is not to be relied upon for variables that are not global or static.

Where is my variable being stored? (Swift)

One of my little experiments with Swift:
func store<T>(var x: T) -> (getter: (Void -> T), setter: (T -> Void)) {
return ({ x }, { x = $0 })
}
x is a value type.
My questions are:
Where exactly is x being stored (in terms of stack/heap)?
What are the pitfalls of storing x like this?
Is this safe?
When is x going to be destroyed (if ever)?

Parameters are passed to functions and methods by value - that means that a copy of the parameter is created and used in the function body.
Parameters received by functions and methods are immutable, which means their value cannot be changed. However the var modifier makes a parameter mutable - what's important to keep into account is that the copy of the parameter is mutable: the parameter passed to the function has no relationship with the parameter received by the function body, besides the initial copy. That said, making a parameter mutable via the var modifier makes it changeable, but its lifetime ends with the function body and does not affect the original parameter passed to the function.
There's another option, the inout modifier, which works like the var, but when the function returns the value is copied back into the variable passed in.
It's worth mentioning that so far I have implicitly taken value types only into account. If an instance of a reference type (class or closure) is passed to the function, as a var parameter, any change made through that parameter is actually done to the instance passed to the function (that's the most significant difference between value and reference types). The instance pointed by the x variable has the same lifetime of the parameter passed to the function.
All that said, in your case it works in a slightly different way. You are returning a closure (ok, they are 2, but that doesn't change the conclusion), and the closure captures x, which causes x to be kept alive for as long as the variable the closure is assigned to is in scope:
let x = 5
let (getter, setter) = store(x)
In the above code, when getter and setter will be deallocated, x (as the variable defined in the store function) will cease to exist too.
To answer to your questions:
x is a variable created when the store function is invoked. Since you are explicitly mentioning value types, then x should be allocated on the stack (as opposed to the heap, which should be used for reference types)
the pitfall is that it's deallocated when the 2 return values (which are reference types, being closure reference types) are deallocated
it could be useful in some niche cases, but generally I would stay away from it - note that it's my own opinion
already described above (when the function returns values are deallocated)

When does the copying take place for swift value types

In Swift, when you pass a value type, say an Array to a function. A copy of the array is made for the function to use.
However the documentation at https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/ClassesAndStructures.html#//apple_ref/doc/uid/TP40014097-CH13-XID_134 also says:
The description above refers to the “copying” of strings, arrays, and
dictionaries. The behavior you see in your code will always be as if a
copy took place. However, Swift only performs an actual copy behind
the scenes when it is absolutely necessary to do so. Swift manages all
value copying to ensure optimal performance, and you should not avoid
assignment to try to preempt this optimization.
So does it mean that the copying actually only takes placed when the passed value type is modified?
Is there a way to demonstrate that this is actually the underlying behavior?
Why this is important? If I create a large immutable array and want to pass it in from function to function, I certainly do not want to keep making copies of it. Should I just use NSArrray in this case or would the Swift Array work fine as long as I do not try to manipulate the passed in Array?
Now as long as I do not explicitly make the variables in the function editable by using var or inout, then the function can not modify the array anyway. So does it still make a copy? Granted that another thread can modify the original array elsewhere (only if it is mutable), making a copy at the moment the function is called necessary (but only if the array passed in is mutable). So if the original array is immutable and the function is not using var or inout, there is no point in Swift creating a copy. Right? So what does Apple mean by the phrase above?

TL;DR:
So does it mean that the copying actually only takes placed when the passed value type is modified?
Yes!
Is there a way to demonstrate that this is actually the underlying behavior?
See the first example in the section on the copy-on-write optimization.
Should I just use NSArrray in this case or would the Swift Array work fine
as long as I do not try to manipulate the passed in Array?
If you pass your array as inout, then you'll have a pass-by-reference semantics,
hence obviously avoiding unnecessary copies.
If you pass your array as a normal parameter,
then the copy-on-write optimization will kick in and you shouldn't notice any performance drop
while still benefiting from more type safety that what you'd get with a NSArray.
Now as long as I do not explicitly make the variables in the function editable
by using var or inout, then the function can not modify the array anyway.
So does it still make a copy?
You will get a "copy", in the abstract sense.
In reality, the underlying storage will be shared, thanks to the copy-on-write mechanism,
hence avoiding unnecessary copies.
If the original array is immutable and the function is not using var or inout,
there is no point in Swift creating a copy. Right?
Exactly, hence the copy-on-write mechanism.
So what does Apple mean by the phrase above?
Essentially, Apple means that you shouldn't worry about the "cost" of copying value types,
as Swift optimizes it for you behind the scene.
Instead, you should just think about the semantics of value types,
which is that get a copy as soon as you assign or use them as parameters.
What's actually generated by Swift's compiler is the Swift's compiler business.
Value types semantics
Swift does indeed treat arrays as value types (as opposed to reference types),
along with structures, enumerations and most other built-in types
(i.e. those that are part of the standard library and not Foundation).
At the memory level, these types are actually immutable plain old data objects (POD),
which enables interesting optimizations.
Indeed, they are typically allocated on the stack rather than the heap [1],
(https://en.wikipedia.org/wiki/Stack-based_memory_allocation).
This allows the CPU to very efficiently manage them,
and to automatically deallocate their memory as soon as the function exits [2],
without the need for any garbage collection strategy.
Values are copied whenever assigned or passed as a function.
This semantics has various advantages,
such as avoiding the creation of unintended aliases,
but also as making it easier for the compiler to guarantee the lifetime of values
stored in a another object or captured by a closure.
We can think about how hard it can be to manage good old C pointers to understand why.
One may think it's an ill-conceived strategy,
as it involves copying every single time a variable is assigned or a function is called.
But as counterintuitive it may be,
copying small types is usually quite cheap if not cheaper than passing a reference.
After all, a pointer is usually the same size as an integer...
Concerns are however legitimate for large collections (i.e. arrays, sets and dictionaries),
and very large structures to a lesser extent [3].
But the compiler has has a trick to handle these, namely copy-on-write (see later).
What about mutating
Structures can define mutating methods,
which are allowed to mutate the fields of the structure.
This doesn't contradict the fact that value types are nothing more than immutable PODs,
as in fact calling a mutating method is merely a huge syntactic sugar
for reassigning a variable to a brand new value that's identical to the previous ones,
except for the fields that were mutated.
The following example illustrates this semantical equivalence:
struct S {
var foo: Int
var bar: Int
mutating func modify() {
foo = bar
}
}
var s1 = S(foo: 0, bar: 10)
s1.modify()
// The two lines above do the same as the two lines below:
var s2 = S(foo: 0, bar: 10)
s2 = S(foo: s2.bar, bar: s2.bar)
Reference types semantics
Unlike value types, reference types are essentially pointers to the heap at the memory level.
Their semantics is closer to what we would get in reference-based languages,
such as Java, Python or Javascript.
This means they do not get copied when assigned or passed to a function, their address is.
Because the CPU is no longer able to manage the memory of these objects automatically,
Swift uses a reference counter to handle garbage collection behind the scenes
(https://en.wikipedia.org/wiki/Reference_counting).
Such semantics has the obvious advantage to avoid copies,
as everything is assigned or passed by reference.
The drawback is the danger of unintended aliases,
as in almost any other reference-based language.
What about inout
An inout parameter is nothing more than a read-write pointer to the expected type.
In the case of value types, it means the function won't get a copy of the value,
but a pointer to such values,
so mutations inside the function will affect the value parameter (hence the inout keyword).
In other terms, this gives value types parameters a reference semantics in the context of the function:
func f(x: inout [Int]) {
x.append(12)
}
var a = [0]
f(x: &a)
// Prints '[0, 12]'
print(a)
In the case of reference types, it will make the reference itself mutable,
pretty much as if the passed argument was a the address of the address of the object:
func f(x: inout NSArray) {
x = [12]
}
var a: NSArray = [0]
f(x: &a)
// Prints '(12)'
print(a)
Copy-on-write
Copy-on-write (https://en.wikipedia.org/wiki/Copy-on-write) is an optimization technique that
can avoid unnecessary copies of mutable variables,
which is implemented on all Swift's built-in collections (i.e. array, sets and dictionaries).
When you assign an array (or pass it to a function),
Swift doesn't make a copy of the said array and actually uses a reference instead.
The copy will take place as soon as the your second array is mutated.
This behavior can be demonstrated with the following snippet (Swift 4.1):
let array1 = [1, 2, 3]
var array2 = array1
// Will print the same address twice.
array1.withUnsafeBytes { print($0.baseAddress!) }
array2.withUnsafeBytes { print($0.baseAddress!) }
array2[0] = 1
// Will print a different address.
array2.withUnsafeBytes { print($0.baseAddress!) }
Indeed, array2 doesn't get a copy of array1 immediately,
as shown by the fact it points to the same address.
Instead, the copy is triggered by the mutation of array2.
This optimization also happens deeper in the structure,
meaning that if for instance your collection is made of other collections,
the latter will also benefit from the copy-on-write mechanism,
as demonstrated by the following snippet (Swift 4.1):
var array1 = [[1, 2], [3, 4]]
var array2 = array1
// Will print the same address twice.
array1[1].withUnsafeBytes { print($0.baseAddress!) }
array2[1].withUnsafeBytes { print($0.baseAddress!) }
array2[0] = []
// Will print the same address as before.
array2[1].withUnsafeBytes { print($0.baseAddress!) }
Replicating copy-on-write
It is in fact rather easy to implement the copy-on-write mechanism in Swift,
as some of the its reference counter API is exposed to the user.
The trick consists of wrapping a reference (e.g. a class instance) within a structure,
and to check whether that reference is uniquely referenced before mutating it.
When that's the case, the wrapped value can be safely mutated,
otherwise it should be copied:
final class Wrapped<T> {
init(value: T) { self.value = value }
var value: T
}
struct CopyOnWrite<T> {
init(value: T) { self.wrapped = Wrapped(value: value) }
var wrapped: Wrapped<T>
var value: T {
get { return wrapped.value }
set {
if isKnownUniquelyReferenced(&wrapped) {
wrapped.value = newValue
} else {
wrapped = Wrapped(value: newValue)
}
}
}
}
var a = CopyOnWrite(value: SomeLargeObject())
// This line doesn't copy anything.
var b = a
However, there is an import caveat here!
Reading the documentation for isKnownUniquelyReferenced we get this warning:
If the instance passed as object is being accessed by multiple threads simultaneously,
this function may still return true.
Therefore, you must only call this function from mutating methods
with appropriate thread synchronization.
This means the implementation presented above isn't thread safe,
as we may encounter situations where it'd wrongly assumes the wrapped object can be safely mutated,
while in fact such mutation would break invariant in another thread.
Yet this doesn't mean Swift's copy-on-write is inherently flawed in multithreaded programs.
The key is to understand what "accessed by multiple threads simultaneously" really means.
In our example, this would happen if the same instance of CopyOnWrite was shared across multiple threads,
for instance as part of a shared global variable.
The wrapped object would then have a thread safe copy-on-write semantics,
but the instance holding it would be subject to data race.
The reason is that Swift must establish unique ownership
to properly evaluate isKnownUniquelyReferenced [4],
which it can't do if the owner of the instance is itself shared across multiple threads.
Value types and multithreading
It is Swift's intention to alleviate the burden of the programmer
when dealing with multithreaded environments, as stated on Apple's blog
(https://developer.apple.com/swift/blog/?id=10):
One of the primary reasons to choose value types over reference types
is the ability to more easily reason about your code.
If you always get a unique, copied instance,
you can trust that no other part of your app is changing the data under the covers.
This is especially helpful in multi-threaded environments
where a different thread could alter your data out from under you.
This can create nasty bugs that are extremely hard to debug.
Ultimately, the copy-on-write mechanism is a resource management optimization that,
like any other optimization technique,
one shouldn't think about when writing code [5].
Instead, one should think in more abstract terms
and consider values to be effectively copied when assigned or passed as arguments.
[1]
This holds only for values used as local variables.
Values used as fields of a reference type (e.g. a class) are also stored in the heap.
[2]
One could get confirmation of that by checking the LLVM byte code that's produced
when dealing with value types rather than reference types,
but the Swift compiler being very eager to perform constant propagation,
building a minimal example is a bit tricky.
[3]
Swift doesn't allow structures to reference themselves,
as the compiler would be unable to compute the size of such type statically.
Therefore, it is not very realistic to think of a structure that is so large
that copying it would become a legitimate concern.
[4]
This is, by the way, the reason why isKnownUniquelyReferenced accepts an inout parameter,
as it's currently Swift's way to establish ownership.
[5]
Although passing copies of value-type instances should be safe,
there's a open issue that suggests some problems with the current implementation
(https://bugs.swift.org/browse/SR-6543).

I don't know if that's the same for every value type in Swift, but for Arrays I'm pretty sure it's a copy-on-write, so it doesn't copy it unless you modify it, and as you said if you pass it around as a constant you don't run that risk anyway.
p.s. In Swift 1.2 there are new APIs you can use to implement copy-on-write on your own value-types too

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

How does closure store references to constants or values - swift

Related

Does passing struct into a function also using copy-on-write for optimisation, or actual copy does happen immediately? [duplicate]

Cannot retrieve associated object [duplicate]

Why UnsafeRawPointer shows different result when function signatures differs in Swift?

Where is my variable being stored? (Swift)

When does the copying take place for swift value types

Categories

Resources