Recursive Enumerations in Swift

I've been learning Swift 2 (and C) for a short while, and I've reached a point where I struggle a lot with recursive enumerations.
It seems that I need to put indirect before the enum if it is recursive. Then I have the first case which has Int between the parentheses because later in the switch it returns an Integer, is that right?
Now comes the first problem, with the second case, Addition. There I have to put ArithmeticExpression between the parentheses. I tried putting Int there, but it gave me an error saying that it has to be an ArithmeticExpression instead of an Int. My question is why? I can't work out what that is about. Why can't I just put two Ints there?
The next problem is about ArithmeticExpression again. The func solution takes a value called expression which is of the type ArithmeticExpression, is that correct? The rest is, at least for now, completely clear. If anyone could explain that to me in an easy way, that'd be great.
Here is the full code:
indirect enum ArithmeticExpression {
    case Number(Int)
    case Addition(ArithmeticExpression, ArithmeticExpression)
}
func solution(expression: ArithmeticExpression) -> Int {
    switch expression {
    case .Number(let value1):
        return value1
    case .Addition(let value1, let value2):
        return solution(value1) + solution(value2)
    }
}
var ten = ArithmeticExpression.Number(10)
var twenty = ArithmeticExpression.Number(20)
var sum = ArithmeticExpression.Addition(ten, twenty)
var endSolution = solution(sum)
print(endSolution)

PeterPan, I sometimes think that examples that are TOO realistic confuse more than help as it’s easy to get bogged down in trying to understand the example code.
A recursive enum is just an enum with associated values that are cases of the enum's own type. That's it. Just an enum with cases that can be set to associated values of the same type as the enum.
Why is this a problem? And why the key word "indirect" instead of say "recursive"? Why the need for any keyword at all?
Enums are "supposed" to be copied by value which means they should have case associated values that are of predictable size - made up of cases with the basic types like Integer and so on. The compiler can then guess the MAXIMUM possible size of a regular enum by the types of the raw or associated values with which it could be instantiated. After all you get an enum with only one of the cases selected - so whatever is the biggest option of the associated value types in the cases, that's the biggest size that enum type could get on initialisation. The compiler can then set aside that amount of memory on the stack and know that any initialisation or re-assignment of that enum instance could never be bigger than that. If the user sets the enum to a case with a small size associated value it is OK, and also if the user sets it to a case with the biggest associated value type.
However as soon as you define an enum which has a mixture of cases with different sized associated types, including values that are also enums of the same type (and so could themselves be initialised with any of the enums cases) it becomes impossible to guess the maximum size of the enum instance. The user could keep initialising with a case that allows an associated value that is the same type as the enum - itself initialised with a case that is also the same type, and so on and so on: an endless recursion or tree of possibilities. This recursion of enums pointing to enums will continue until an enum is initialised with associated value of "simple" type that does not point to another enum. Think of a simple Integer type that would “terminate” the chain of enums.
So the compiler cannot set aside the correct sized chunk of memory on the stack for this type of enum. Instead it treats the case associated values as POINTERS to the heap memory where the associated value is stored. That enum can itself point to another enum and so on. That is why the keyword is "indirect" - the associated value is referenced indirectly via a pointer and not directly by a value.
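As a rough illustration of the point above (a sketch that is not part of the original answer, using a hypothetical modern-Swift spelling of the enum from the question), the in-place size of an indirect enum value does not depend on how deeply it is nested, because the nested values live behind pointers:

```swift
// Hypothetical modern-Swift spelling of the enum from the question.
indirect enum Expression {
    case number(Int)
    case addition(Expression, Expression)
}

// A value with no nesting at all.
let shallow = Expression.number(1)

// A value nested 100 levels deep.
var deep = Expression.number(0)
for i in 1...100 {
    deep = .addition(deep, .number(i))
}

// Both values occupy the same number of bytes in place;
// the nested payloads are stored elsewhere and referenced indirectly.
let shallowSize = MemoryLayout.size(ofValue: shallow)
let deepSize = MemoryLayout.size(ofValue: deep)
print(shallowSize, deepSize)
```

The exact number printed depends on the platform, so it is deliberately not stated here; the point is only that the two numbers match.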
It is loosely similar to passing an inout parameter to a function: instead of working on a private copy of the value, the function refers back to the caller's original storage (which does not have to be in heap memory).
So that's all there is to it. An enum that cannot easily have its maximum size guessed at because it can be initialised with enums of the same type and unpredictable sizes in chains of unpredictable lengths.
As the various examples illustrate, a typical use for such an enum is where you want to build-up trees of values like a formula with nested calculations within parentheses, or an ancestry tree with nodes and branches all captured in one enum on initialisation. The compiler copes with all this by using pointers to reference the associated value for the enum instead of a fixed chunk of memory on the stack.
So basically - if you can think of a situation in your code where you want to have chains of enums pointing to each other, with various options for associated values - then you will use, and understand, a recursive enum!

The reason the Addition case takes two ArithmeticExpressions instead of two Ints is so that it could handle recursive situations like this:
ArithmeticExpression.Addition(ArithmeticExpression.Addition(ArithmeticExpression.Number(1), ArithmeticExpression.Number(2)), ArithmeticExpression.Number(3))
or, on more than one line:
let addition1 = ArithmeticExpression.Addition(ArithmeticExpression.Number(1), ArithmeticExpression.Number(2))
let addition2 = ArithmeticExpression.Addition(addition1, ArithmeticExpression.Number(3))
which represents:
(1 + 2) + 3
The recursive definition allows you to add not just numbers, but also other arithmetic expressions. That's where the power of this enum lies: it can express multiple nested addition operations.
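To make that concrete, here is a minimal sketch in current Swift syntax (lowercase case names and the function name evaluate are my own choices, not the asker's) that builds (1 + 2) + 3 and evaluates it recursively:

```swift
indirect enum ArithmeticExpression {
    case number(Int)
    case addition(ArithmeticExpression, ArithmeticExpression)
}

// Walks the expression tree: a number evaluates to itself,
// an addition evaluates both operands and adds the results.
func evaluate(_ expression: ArithmeticExpression) -> Int {
    switch expression {
    case .number(let value):
        return value
    case .addition(let left, let right):
        return evaluate(left) + evaluate(right)
    }
}

// (1 + 2) + 3
let onePlusTwo = ArithmeticExpression.addition(.number(1), .number(2))
let total = evaluate(.addition(onePlusTwo, .number(3)))
print(total) // 6
```

If Addition took two Ints, the inner (1 + 2) could not be passed as an operand at all; it would have to be computed first.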

Related

When exactly do I need "indirect" with writing recursive enums?

I'm surprised to find out that this compiles:
enum Foo {
case one([Foo])
// or
// case one([String: Foo])
}
I would have expected the compiler to tell me to add indirect to the enum cases, as Foo contains more Foos. Arrays and Dictionaries are both value types, so there is no indirection here. To me, this is structurally similar to this:
enum Foo {
    case one(Bar<Foo>)
}
// arrays and dictionaries are just a struct with either something or nothing in it
// right?
struct Bar<T> {
    let t: T?
}
... which does require me to add indirect to the enum.
Then I tried tuples:
enum Foo {
case one((Foo, Foo))
}
and this is considered a recursive enum too. I thought arrays and tuples have the same memory layout, and that is why you can convert arrays to tuples using withUnsafeBytes and then binding the memory...
What rules does this follow? Is there actually something special about arrays and dictionaries? If so, are there other such "special" types that look like they should require indirect, but actually don't?
The Swift Guide is not very helpful. It just says:
A recursive enumeration is an enumeration that has another instance of the enumeration as the associated value for one or more of the enumeration cases. You indicate that an enumeration case is recursive by writing indirect before it, which tells the compiler to insert the necessary layer of indirection.
What "has another instance of the enumeration" means is rather vague. It is even possible to interpret it to mean that case one(Bar<Foo>) is not recursive - the case only has an instance of Bar<Foo>, not Foo.
I think part of the confusion stems from this assumption:
I thought arrays and tuples have the same memory layout, and that is why you can convert arrays to tuples using withUnsafeBytes and then binding the memory...
Arrays and tuples don't have the same memory layout:
Array<T> is a fixed-size struct with a pointer to a buffer which holds the array elements contiguously* in memory
Tuples are fixed-size buffers of elements held contiguously in memory
* Contiguity is promised only in the case of native Swift arrays [not bridged from Objective-C]. NSArray instances do not guarantee that their underlying storage is contiguous, but in the end this does not have an effect on the code below.
The key thing is that the size of an Array<T> does not change with the number of elements held (its size is simply the size of a pointer to the buffer), while a tuple does. The tuple is more equivalent to the buffer the array holds, and not the array itself.
Array<T>.withUnsafeBytes calls Array<T>.withUnsafeBufferPointer, which returns the pointer to the buffer, not to the array itself. *(In the case of a non-contiguous bridged NSArray, _ArrayBuffer.withUnsafeBufferPointer has to create a temporary contiguous copy of its contents in order to return a valid buffer pointer to you.)
When laying out memory for types, the compiler needs to know how large the type is. Given the above, an Array<Foo> is statically known to be fixed in size: the size of one pointer (to a buffer elsewhere in memory).
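A small sketch of this difference, assuming a typical 64-bit Swift toolchain where an Array value is a single pointer wide (the depth helper is only there to show that the recursive enum is usable): the array's in-place size stays at one pointer no matter what it holds, while a tuple grows with its element count.

```swift
// Compiles without `indirect`: the array already provides the indirection.
enum Foo {
    case one([Foo])
}

// An Array value is just a pointer to its buffer.
let arraySize = MemoryLayout<[Foo]>.size
let pointerSize = MemoryLayout<UnsafeRawPointer>.size

// A tuple stores its elements inline, so its size grows with them.
let pairSize = MemoryLayout<(Int, Int)>.size
let quadSize = MemoryLayout<(Int, Int, Int, Int)>.size

// Hypothetical helper: how many levels of `.one` are nested.
func depth(_ foo: Foo) -> Int {
    switch foo {
    case .one(let children):
        return 1 + (children.map(depth).max() ?? 0)
    }
}

let nested = Foo.one([.one([]), .one([.one([])])])
print(arraySize, pointerSize, pairSize, quadSize, depth(nested))
```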
Given
enum Foo {
case one((Foo, Foo))
}
in order to lay out the size of Foo, you need to figure out the maximum size of all of its cases. It has only the single case, so it would be the size of that case.
Figuring out the size of one requires figuring out the size of its associated value, and the size of a tuple of elements is the sum of the size of the elements themselves (taking into account padding and alignment, but we don't really care about that here).
Thus, the size of Foo is the size of one, which is the size of (Foo, Foo) laid out in memory. So, what is the size of (Foo, Foo)? Well, it's the size of Foo + the size of Foo... each of which is the size of Foo + the size of Foo... each of which is the size of Foo + the size of Foo...
Where Array<Foo> had a way out (Array<T> is the same size regardless of T), we're stuck in an infinite loop with no base case.
indirect is the keyword required to break out of the recursion and give this infinite reference a base case. It inserts an implicit pointer by making a given case the fixed size of a pointer, regardless of what it contains or points to. That makes the size of one fixed, which allows Foo to have a fixed size.
indirect is less about Foo referring to Foo in any way, and more about allowing an enum case to potentially contain itself indirectly (because direct containment would lead to an infinite loop).
As an aside, this is also why a struct cannot contain a direct instance of itself:
struct Foo {
let foo: Foo // error: Value type 'Foo' cannot have a stored property that recursively contains it
}
would lead to infinite recursion, while
struct Foo {
let foo: UnsafePointer<Foo>
}
is fine.
structs don't support the indirect keyword (at least in a struct, you have more direct control over storage and layout), but there have been pitches for adding support for this on the Swift forums.
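In the meantime, a common workaround is to introduce the indirection by hand with a small class. The following is only a sketch under that assumption; Box, ListNode and sum are illustrative names, not anything from the standard library:

```swift
// Hypothetical helper: a class instance is held by reference,
// so a stored property of type Box<T> is pointer-sized whatever T is.
final class Box<Value> {
    let value: Value
    init(_ value: Value) { self.value = value }
}

// The struct can now "contain itself" through the box,
// because the compiler only needs the size of a reference.
struct ListNode {
    let payload: Int
    let next: Box<ListNode>?
}

// Adds up the payloads by following the boxed links.
func sum(_ node: ListNode) -> Int {
    return node.payload + (node.next.map { sum($0.value) } ?? 0)
}

let tail = ListNode(payload: 2, next: nil)
let head = ListNode(payload: 1, next: Box(tail))
print(sum(head)) // 3
```

This is essentially what indirect does for enum cases, except that here the box is visible in the type.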
indirect on enum cases can be used to tell the compiler to store the associated value of that enum as a pointer.
This is necessary when the compiler couldn't figure out the memory layout of the associated value without using a pointer, such as in the case of recursive enums.
The reason why you need to make the below case indirect is because by using a generic struct that holds your enum as an associated value on a case of the enum itself, you end up with a recursive enum, since you can nest Foo and Bar in each other infinitely many times.
enum MaybeRecursive {
    case array([MaybeRecursive])
    case dict([String: MaybeRecursive])
    indirect case genericStruct(GenericStruct<MaybeRecursive>)
}
struct GenericStruct<T> {
    let t: T?
}
// if you replaced `.array` with `.genericStruct` again, you could keep going forever
let associatedValue = GenericStruct(t: MaybeRecursive.genericStruct(.init(t: .array([]))))
MaybeRecursive.genericStruct(associatedValue)
As for the Array vs Tuple difference, Itai Ferber has expertly explained that part in their own answer.

How does `NSAttributedString` equate attribute values of type `Any`?

The enumerateAttribute(_:in:options:using:) method of NSAttributedString appears to equate arbitrary instances of type Any. Of course, Any does not conform to Equatable, so that should not be possible.
Question: How does the method compare one instance of Any to another?
Context: In Swift, I am subclassing NSTextStorage, and have need to provide my own implementation of this method in Swift.
Observations:
NSAttributedString attributes come in key-value pairs, with the keys being instances of type NSAttributedString.Key and the values being instances of type Any?, with each pair being associated with one or more ranges of characters in the string. At least, that is how the data structure appears from the outside; the internal implementation is opaque.
The enumerateAttribute method walks through the entire range of an NSAttributedString, effectively identifying each different value corresponding to a specified key, and with the ranges over which that value applies.
The values corresponding to a given key could be of multiple different types.
NSAttributedString seemingly has no way of knowing what underlying types the Any values might be, and thus seemingly no way of type casting in order to make a comparison of two given Any values.
Yet, the method somehow is differentiating among ranges of the string based on differences in the Any values.
Interestingly, the method is able to differentiate between values even when the underlying type does not conform to Equatable. I take this to be a clue that the method may be using some sort of reflection to perform the comparison.
Even more interesting, the method goes so far as to differentiate between values when the underlying type does conform to Equatable and the difference between two values is a difference that the specific implementation of Equatable intentionally ignores. In other words, even if a == b returns true, if there is a difference in opaque properties of a and b that are ignored by ==, the method will treat the values as being different, not the same.
I assume the method bridges to an implementation in ObjC.
Is the answer: It cannot be done in Swift?
As you know, Cocoa is Objective-C, so these are Objective-C NSDictionary objects, not Swift Dictionary objects. So equality comparison between them uses Objective-C isEqual, not Swift ==. We are not bound by Swift strict typing, the Equatable protocol, or anything else from Swift.
To illustrate, here's a slow and stupid but effective implementation of style run detection:
let s = NSMutableAttributedString(
    string: "howdy", attributes: [.foregroundColor: UIColor.red])
s.addAttributes([.foregroundColor: UIColor.blue],
    range: NSRange(location: 2, length: 1))
var lastatt = s.attributes(at: 0, effectiveRange: nil)
for ix in 1..<5 {
    let newatt = s.attributes(at: ix, effectiveRange: nil)
    if !(newatt as NSDictionary).isEqual(to: lastatt) {
        print("style run ended at \(ix)")
        lastatt = newatt
    }
}
That correctly prints:
style run ended at 2
style run ended at 3
So since it is always possible to compare the attributes at any index with the attributes at another, it is possible to implement attribute enumeration in Swift. (Whether that's a good idea is another question.)
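The same idea can be wrapped in a reusable function. This is only a sketch: it assumes Apple's Foundation (where a Swift dictionary bridges to NSDictionary with as), and styleRunBoundaries and the "demoColor" key are made-up names used so the example does not need UIKit.

```swift
import Foundation

// Hypothetical helper, not part of Foundation: returns every index at which
// the attribute dictionary differs from the one at the previous index.
func styleRunBoundaries(in text: NSAttributedString) -> [Int] {
    guard text.length > 1 else { return [] }
    var boundaries: [Int] = []
    var previous = text.attributes(at: 0, effectiveRange: nil)
    for index in 1..<text.length {
        let current = text.attributes(at: index, effectiveRange: nil)
        // Objective-C equality (isEqual), not Swift ==.
        if !(current as NSDictionary).isEqual(to: previous) {
            boundaries.append(index)
            previous = current
        }
    }
    return boundaries
}

// Made-up attribute key with plain string values.
let colorKey = NSAttributedString.Key(rawValue: "demoColor")
let sample = NSMutableAttributedString(string: "howdy", attributes: [colorKey: "red"])
sample.addAttributes([colorKey: "blue"], range: NSRange(location: 2, length: 1))

let boundaries = styleRunBoundaries(in: sample)
print(boundaries)
```

Like the loop above, this is linear in the length of the string and makes no attempt to be fast.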

indirect enums and structs

To start off, I want to say that I'm aware there are many articles and questions within SO that refer to the indirect keyword in Swift.
The most popular explanation for the usage of indirect is to allow for recursive enums.
Rather than just knowing about what indirect allows us to do, I would like to know how it allows us to use recursive enums.
Questions:
Is it because enums are value types and value types do not scale well if they are built in a recursive structure? Why?
Does indirect modify the value type behaviour to behave more like a reference type?
The following two examples compile just fine. What is the difference?
indirect enum BinaryTree<T> {
    case node(BinaryTree<T>, T, BinaryTree<T>)
    case empty
}
enum BinaryTree<T> {
    indirect case node(BinaryTree<T>, T, BinaryTree<T>)
    case empty
}
The indirect keyword introduces a layer of indirection behind the scenes.
You indicate that an enumeration case is recursive by writing indirect before it, which tells the compiler to insert the necessary layer of indirection.
From here
The important part of structs and enums is that they're a constant size. Allowing recursive structs or enums directly would violate this, as there would be an indeterminable number of recursions, hence making the size non constant and unpredictable. indirect uses a constant size reference to refer to a constant size enum instance.
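There's a difference between the two code snippets you show, although it does not matter in this particular example.
The first piece of code marks the whole enum as indirect, so every case that has associated values stores them behind a reference.
The second piece of code makes only the node case store its associated values behind a reference. Since empty has no associated values, the two declarations end up equivalent here; they would differ only if BinaryTree<T> had another case with associated values, which would then be stored directly.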
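For illustration, here is a short sketch built on the second declaration; nodeCount and inOrder are hypothetical helper names added for this example:

```swift
enum BinaryTree<T> {
    indirect case node(BinaryTree<T>, T, BinaryTree<T>)
    case empty
}

// Counts the nodes by recursing into both subtrees.
func nodeCount<T>(_ tree: BinaryTree<T>) -> Int {
    switch tree {
    case .empty:
        return 0
    case .node(let left, _, let right):
        return nodeCount(left) + 1 + nodeCount(right)
    }
}

// Collects the values left subtree first, then the node, then the right subtree.
func inOrder<T>(_ tree: BinaryTree<T>) -> [T] {
    switch tree {
    case .empty:
        return []
    case .node(let left, let value, let right):
        return inOrder(left) + [value] + inOrder(right)
    }
}

let leaf = BinaryTree.node(.empty, 1, .empty)
let tree = BinaryTree.node(leaf, 2, .node(.empty, 3, .empty))
print(nodeCount(tree), inOrder(tree))
```

The tree still behaves as a value type from the caller's point of view; the indirection only changes how the payload of node is stored.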
Swift indirect enum
Since Swift v2.0
Swift Enum is a value type: when we assign it, the value is copied, which is why the size of the type must be known at compile time.
Problem with associated value
enum MyEnum { // error: Recursive enum 'MyEnum' is not marked 'indirect'
    case case1(MyEnum)
}
It is not possible to calculate the final size because of the recursion.
indirect tells the compiler to store the associated value indirectly, by reference (instead of by value):
indirect enum - associated values are stored by reference for all cases
indirect case - the associated value is stored by reference only for this case
Also, indirect cannot be applied to other value types (struct).
You can use indirect enum. It's not exactly struct, but it is also a value type. I don't think struct has similar indirect keyword support.
From Hacking with Swift post:
Indirect enums are enums that need to reference themselves somehow, and are called “indirect” because they modify the way Swift stores them so they can grow to any size. Without the indirection, any enum that referenced itself could potentially become infinitely sized: it could contain itself again and again, which wouldn’t be possible.
As an example, here’s an indirect enum that defines a node in a linked list:
indirect enum LinkedListItem<T> {
    case endPoint(value: T)
    case linkNode(value: T, next: LinkedListItem)
}
Because that references itself – because one of the associated values is itself a linked list item – we need to mark the enum as being indirect.
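As a usage sketch (the values(of:) function is my own addition, not from the Hacking with Swift post), such a list can be built by nesting cases and walked with a simple loop:

```swift
indirect enum LinkedListItem<T> {
    case endPoint(value: T)
    case linkNode(value: T, next: LinkedListItem)
}

// Hypothetical helper: follows the `next` links and collects every value.
func values<T>(of item: LinkedListItem<T>) -> [T] {
    var result: [T] = []
    var current = item
    while true {
        switch current {
        case .endPoint(let value):
            // Last item: record it and stop.
            result.append(value)
            return result
        case .linkNode(let value, let next):
            // Record this item and move on to the next one.
            result.append(value)
            current = next
        }
    }
}

let list = LinkedListItem.linkNode(
    value: 1,
    next: .linkNode(value: 2, next: .endPoint(value: 3)))
let collected = values(of: list)
print(collected) // [1, 2, 3]
```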

When does the copying take place for swift value types

In Swift, when you pass a value type, say an Array, to a function, a copy of the array is made for the function to use.
However the documentation at https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/ClassesAndStructures.html#//apple_ref/doc/uid/TP40014097-CH13-XID_134 also says:
The description above refers to the “copying” of strings, arrays, and
dictionaries. The behavior you see in your code will always be as if a
copy took place. However, Swift only performs an actual copy behind
the scenes when it is absolutely necessary to do so. Swift manages all
value copying to ensure optimal performance, and you should not avoid
assignment to try to preempt this optimization.
So does it mean that the copying actually only takes place when the passed value type is modified?
Is there a way to demonstrate that this is actually the underlying behavior?
Why is this important? If I create a large immutable array and want to pass it from function to function, I certainly do not want to keep making copies of it. Should I just use NSArray in this case, or would the Swift Array work fine as long as I do not try to manipulate the passed-in Array?
Now as long as I do not explicitly make the variables in the function editable by using var or inout, then the function can not modify the array anyway. So does it still make a copy? Granted that another thread can modify the original array elsewhere (only if it is mutable), making a copy at the moment the function is called necessary (but only if the array passed in is mutable). So if the original array is immutable and the function is not using var or inout, there is no point in Swift creating a copy. Right? So what does Apple mean by the phrase above?
TL;DR:
So does it mean that the copying actually only takes place when the passed value type is modified?
Yes!
Is there a way to demonstrate that this is actually the underlying behavior?
See the first example in the section on the copy-on-write optimization.
Should I just use NSArray in this case or would the Swift Array work fine
as long as I do not try to manipulate the passed in Array?
If you pass your array as inout, then you'll have pass-by-reference semantics,
hence obviously avoiding unnecessary copies.
If you pass your array as a normal parameter,
then the copy-on-write optimization will kick in and you shouldn't notice any performance drop
while still benefiting from more type safety than what you'd get with an NSArray.
Now as long as I do not explicitly make the variables in the function editable
by using var or inout, then the function can not modify the array anyway.
So does it still make a copy?
You will get a "copy", in the abstract sense.
In reality, the underlying storage will be shared, thanks to the copy-on-write mechanism,
hence avoiding unnecessary copies.
If the original array is immutable and the function is not using var or inout,
there is no point in Swift creating a copy. Right?
Exactly, hence the copy-on-write mechanism.
So what does Apple mean by the phrase above?
Essentially, Apple means that you shouldn't worry about the "cost" of copying value types,
as Swift optimizes it for you behind the scene.
Instead, you should just think about the semantics of value types,
which is that you get a copy as soon as you assign them or use them as parameters.
What's actually generated by Swift's compiler is the compiler's business.
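If you want to see the sharing for yourself when an array is passed to a function, something along these lines should do. It is a sketch: bufferAddress, readOnly and mutatingLocalCopy are hypothetical helpers, and the address is only used for comparison, never dereferenced.

```swift
// Hypothetical helper: the address of the array's element buffer as an integer.
func bufferAddress(of array: [Int]) -> Int {
    return array.withUnsafeBytes { Int(bitPattern: $0.baseAddress) }
}

let original = Array(1...1_000)
let originalAddress = bufferAddress(of: original)

// Only reads the parameter, so no copy of the buffer is needed.
func readOnly(_ values: [Int]) -> Int {
    return bufferAddress(of: values)
}

// Mutates a local copy, which forces Swift to duplicate the buffer.
func mutatingLocalCopy(_ values: [Int]) -> Int {
    var local = values
    local[0] = -1
    return bufferAddress(of: local)
}

let sharedWhenRead = readOnly(original) == originalAddress
let sharedAfterWrite = mutatingLocalCopy(original) == originalAddress
print(sharedWhenRead, sharedAfterWrite) // true false
```

The original array is untouched in both cases, which is the value semantics the caller is promised.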
Value types semantics
Swift does indeed treat arrays as value types (as opposed to reference types),
along with structures, enumerations and most other built-in types
(i.e. those that are part of the standard library and not Foundation).
At the memory level, these types are actually immutable plain old data objects (POD),
which enables interesting optimizations.
Indeed, they are typically allocated on the stack rather than the heap [1],
(https://en.wikipedia.org/wiki/Stack-based_memory_allocation).
This allows the CPU to very efficiently manage them,
and to automatically deallocate their memory as soon as the function exits [2],
without the need for any garbage collection strategy.
Values are copied whenever assigned or passed to a function.
These semantics have various advantages,
such as avoiding the creation of unintended aliases,
but also making it easier for the compiler to guarantee the lifetime of values
stored in another object or captured by a closure.
We can think about how hard it can be to manage good old C pointers to understand why.
One may think it's an ill-conceived strategy,
as it involves copying every single time a variable is assigned or a function is called.
But as counterintuitive as it may be,
copying small types is usually quite cheap, if not cheaper than passing a reference.
After all, a pointer is usually the same size as an integer...
Concerns are however legitimate for large collections (i.e. arrays, sets and dictionaries),
and very large structures to a lesser extent [3].
But the compiler has a trick to handle these, namely copy-on-write (see later).
What about mutating
Structures can define mutating methods,
which are allowed to mutate the fields of the structure.
This doesn't contradict the fact that value types are nothing more than immutable PODs,
as in fact calling a mutating method is merely syntactic sugar
for reassigning a variable to a brand new value that's identical to the previous one,
except for the fields that were mutated.
The following example illustrates this semantical equivalence:
struct S {
    var foo: Int
    var bar: Int
    mutating func modify() {
        foo = bar
    }
}
var s1 = S(foo: 0, bar: 10)
s1.modify()
// The two lines above do the same as the two lines below:
var s2 = S(foo: 0, bar: 10)
s2 = S(foo: s2.bar, bar: s2.bar)
Reference types semantics
Unlike value types, reference types are essentially pointers to the heap at the memory level.
Their semantics is closer to what we would get in reference-based languages,
such as Java, Python or Javascript.
This means they do not get copied when assigned or passed to a function, their address is.
Because the CPU is no longer able to manage the memory of these objects automatically,
Swift uses a reference counter to handle garbage collection behind the scenes
(https://en.wikipedia.org/wiki/Reference_counting).
Such semantics has the obvious advantage to avoid copies,
as everything is assigned or passed by reference.
The drawback is the danger of unintended aliases,
as in almost any other reference-based language.
What about inout
An inout parameter is nothing more than a read-write pointer to the expected type.
In the case of value types, it means the function won't get a copy of the value,
but a pointer to such values,
so mutations inside the function will affect the value parameter (hence the inout keyword).
In other terms, this gives value types parameters a reference semantics in the context of the function:
func f(x: inout [Int]) {
    x.append(12)
}
var a = [0]
f(x: &a)
// Prints '[0, 12]'
print(a)
In the case of reference types, it will make the reference itself mutable,
pretty much as if the passed argument was the address of the address of the object:
func f(x: inout NSArray) {
    x = [12]
}
var a: NSArray = [0]
f(x: &a)
// Prints '(12)'
print(a)
Copy-on-write
Copy-on-write (https://en.wikipedia.org/wiki/Copy-on-write) is an optimization technique that
can avoid unnecessary copies of mutable variables,
which is implemented on all Swift's built-in collections (i.e. array, sets and dictionaries).
When you assign an array (or pass it to a function),
Swift doesn't make a copy of the said array and actually uses a reference instead.
The copy will take place as soon as your second array is mutated.
This behavior can be demonstrated with the following snippet (Swift 4.1):
let array1 = [1, 2, 3]
var array2 = array1
// Will print the same address twice.
array1.withUnsafeBytes { print($0.baseAddress!) }
array2.withUnsafeBytes { print($0.baseAddress!) }
array2[0] = 1
// Will print a different address.
array2.withUnsafeBytes { print($0.baseAddress!) }
Indeed, array2 doesn't get a copy of array1 immediately,
as shown by the fact it points to the same address.
Instead, the copy is triggered by the mutation of array2.
This optimization also happens deeper in the structure,
meaning that if for instance your collection is made of other collections,
the latter will also benefit from the copy-on-write mechanism,
as demonstrated by the following snippet (Swift 4.1):
var array1 = [[1, 2], [3, 4]]
var array2 = array1
// Will print the same address twice.
array1[1].withUnsafeBytes { print($0.baseAddress!) }
array2[1].withUnsafeBytes { print($0.baseAddress!) }
array2[0] = []
// Will print the same address as before.
array2[1].withUnsafeBytes { print($0.baseAddress!) }
Replicating copy-on-write
It is in fact rather easy to implement the copy-on-write mechanism in Swift,
as some of its reference counting API is exposed to the user.
The trick consists of wrapping a reference (e.g. a class instance) within a structure,
and to check whether that reference is uniquely referenced before mutating it.
When that's the case, the wrapped value can be safely mutated,
otherwise it should be copied:
final class Wrapped<T> {
    init(value: T) { self.value = value }
    var value: T
}
struct CopyOnWrite<T> {
    init(value: T) { self.wrapped = Wrapped(value: value) }
    var wrapped: Wrapped<T>
    var value: T {
        get { return wrapped.value }
        set {
            if isKnownUniquelyReferenced(&wrapped) {
                wrapped.value = newValue
            } else {
                wrapped = Wrapped(value: newValue)
            }
        }
    }
}
// SomeLargeObject stands for any type that is expensive to copy.
var a = CopyOnWrite(value: SomeLargeObject())
// This line doesn't copy anything.
var b = a
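To check that a wrapper of this kind really shares and then separates its storage, here is a self-contained variant. It is a sketch only: Storage, COWBox and sharesStorage(with:) are names invented for this example, and an array stands in for the large object.

```swift
// Reference-type storage that the value-type wrapper points at.
final class Storage<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

struct COWBox<T> {
    private var storage: Storage<T>

    init(_ value: T) { storage = Storage(value) }

    var value: T {
        get { return storage.value }
        set {
            if isKnownUniquelyReferenced(&storage) {
                // Nobody else can see this storage: mutate in place.
                storage.value = newValue
            } else {
                // Storage is shared with another box: give this box its own.
                storage = Storage(newValue)
            }
        }
    }

    // Hypothetical helper for the demonstration: identity comparison of the storage.
    func sharesStorage(with other: COWBox<T>) -> Bool {
        return storage === other.storage
    }
}

let first = COWBox([1, 2, 3])
var second = first
let sharedBeforeWrite = first.sharesStorage(with: second)

second.value = [9]
let sharedAfterWrite = first.sharesStorage(with: second)

print(sharedBeforeWrite, sharedAfterWrite, first.value, second.value)
```

Assigning first to second only copies a reference; the write to second is what allocates a new Storage, and first keeps its original contents.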
However, there is an important caveat here!
Reading the documentation for isKnownUniquelyReferenced we get this warning:
If the instance passed as object is being accessed by multiple threads simultaneously,
this function may still return true.
Therefore, you must only call this function from mutating methods
with appropriate thread synchronization.
This means the implementation presented above isn't thread safe,
as we may encounter situations where it'd wrongly assume the wrapped object can be safely mutated,
while in fact such a mutation would break an invariant in another thread.
Yet this doesn't mean Swift's copy-on-write is inherently flawed in multithreaded programs.
The key is to understand what "accessed by multiple threads simultaneously" really means.
In our example, this would happen if the same instance of CopyOnWrite was shared across multiple threads,
for instance as part of a shared global variable.
The wrapped object would then have a thread safe copy-on-write semantics,
but the instance holding it would be subject to data race.
The reason is that Swift must establish unique ownership
to properly evaluate isKnownUniquelyReferenced [4],
which it can't do if the owner of the instance is itself shared across multiple threads.
Value types and multithreading
It is Swift's intention to alleviate the burden of the programmer
when dealing with multithreaded environments, as stated on Apple's blog
(https://developer.apple.com/swift/blog/?id=10):
One of the primary reasons to choose value types over reference types
is the ability to more easily reason about your code.
If you always get a unique, copied instance,
you can trust that no other part of your app is changing the data under the covers.
This is especially helpful in multi-threaded environments
where a different thread could alter your data out from under you.
This can create nasty bugs that are extremely hard to debug.
Ultimately, the copy-on-write mechanism is a resource management optimization that,
like any other optimization technique,
one shouldn't think about when writing code [5].
Instead, one should think in more abstract terms
and consider values to be effectively copied when assigned or passed as arguments.
[1]
This holds only for values used as local variables.
Values used as fields of a reference type (e.g. a class) are also stored in the heap.
[2]
One could get confirmation of that by checking the LLVM byte code that's produced
when dealing with value types rather than reference types,
but the Swift compiler being very eager to perform constant propagation,
building a minimal example is a bit tricky.
[3]
Swift doesn't allow structures to reference themselves,
as the compiler would be unable to compute the size of such type statically.
Therefore, it is not very realistic to think of a structure that is so large
that copying it would become a legitimate concern.
[4]
This is, by the way, the reason why isKnownUniquelyReferenced accepts an inout parameter,
as it's currently Swift's way to establish ownership.
[5]
Although passing copies of value-type instances should be safe,
there's an open issue that suggests some problems with the current implementation
(https://bugs.swift.org/browse/SR-6543).
I don't know if that's the same for every value type in Swift, but for Arrays I'm pretty sure it's a copy-on-write, so it doesn't copy it unless you modify it, and as you said if you pass it around as a constant you don't run that risk anyway.
p.s. In Swift 1.2 there are new APIs you can use to implement copy-on-write on your own value-types too

Strange behaviour for recursive enum in Swift (Beta 7)

enum Tree {
    case Leaf(String)
    case Node(Tree)
} // compiler not happy!!
enum Tree {
    case Leaf(String)
    case Node([Tree])
} // compiler is happy in (arguably) a more complex recursive scenario?
How can the Swift compiler work for the second (more complex) scenario and not the first?
It is worth noting that Swift 2 beta 2 and later have the indirect keyword for recursive enums, which means that
enum Tree<T> {
    case Leaf(T)
    indirect case Node(Tree)
}
is a valid language construct that doesn't break pattern matching in Swift 2.
TL;DR of the decision: "[…] we decided that the right solution is to simply not support general, non-obvious recursion through enums, and require the programmer to mediate that explicitly with indirect."
A value type (an enum) cannot contain itself as a direct member, since no matter how big a data structure is, it cannot contain itself. Apparently associated data of enum cases are considered direct members of the enum, so the associated data cannot be the type of the enum itself. (Actually, I wish that they would make recursive enums work; it would be so great for functional data structures.)
However, if you have a level of indirection, it is okay. For example, the associated data can be an object (instance of a class), and that class can have a member that is the enum. Since class types are reference types, it is just a pointer and does not directly contain the object (and thus the enum), so it is fine.
The answer to your question is: [Tree] does not contain Tree directly as a member. The fields of Array are private, but we can generally infer that the storage for the elements of the array is not stored in the Array struct directly, because this struct has a fixed size for a given Array<T>, but the array can have an unlimited number of elements.
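A short sketch of the second declaration in use, written in current Swift syntax (lowercase case names and the leaves(of:) helper are my own additions):

```swift
// No `indirect` needed: the array provides the level of indirection.
enum Tree {
    case leaf(String)
    case node([Tree])
}

// Hypothetical helper: collects the leaf names from left to right.
func leaves(of tree: Tree) -> [String] {
    switch tree {
    case .leaf(let name):
        return [name]
    case .node(let children):
        return children.flatMap { leaves(of: $0) }
    }
}

let tree = Tree.node([
    .leaf("A"),
    .node([.leaf("B"), .leaf("C")]),
    .node([]),
])
let names = leaves(of: tree)
print(names) // ["A", "B", "C"]
```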
Chris Lattner (designer of Swift) says on the Apple Developer forums that autoclosure
has emerged as a way to "box" expression value in a reference (e.g.
working around limitations with recursive enums).
However, the following code (which works in Swift 1.1) does not work in Swift 1.2 that comes with Xcode Beta Version 6.3 (6D520o). The error message is "Attributes can only be applied to declarations, not types", however if this is intended, I don't know how to reconcile it with Lattner's statement about the behaviour he talks about in the previous quote as being "a useful thing, and we haven't removed it with Swift 1.2."
enum BinaryTree {
    case Leaf(String)
    case Node(@autoclosure () -> BinaryTree, @autoclosure () -> BinaryTree)
}
let l1 = BinaryTree.Leaf("A")
let l2 = BinaryTree.Leaf("B")
let l3 = BinaryTree.Leaf("C")
let l4 = BinaryTree.Leaf("D")
let n1 = BinaryTree.Node(l1, l2)
let n2 = BinaryTree.Node(l3, l4)
let t = BinaryTree.Node(n1, n2)