Array values optional, or not? - swift

Could you explain why:
when I access an array value using array.first it's optional
when I access from an index value it is not?
Example:
var players = ["Alice", "Bob", "Cindy", "Dan"]
let firstPlayer = players.first
print(firstPlayer) // Optional("Alice")
let firstIndex = players[0]
print(firstIndex) // Alice

(The short answers to this question are great, and exactly what you need. I just wanted to go a bit deeper into the why and how this interacts with Swift Collections more generally and the underlying types. If you just want "how should I use this stuff?" read the accepted answer and ignore all this.)
Arrays follow the rules of all Collections. A Collection must implement the following subscript:
subscript(position: Self.Index) -> Self.Element { get }
So to be a Collection, Array's subscript must accept its Index and unconditionally return an Element. For many kinds of Collections, it is impossible to create an Index that does not exist, but Array uses Int as its Index, so it has to deal with the possibility that you pass an Index that is out of range. In that case, it is impossible to return an Element, and its only option is to fail to return at all. This generally takes the form of crashing the program since it's generally more useful than hanging the program, which is the other option.
(This hides a slight bit of type theory, which is that every function in Swift technically can return "crash," but we don't track that in the type system. It's possible to do that to distinguish between functions that can crash and ones that cannot, but Swift doesn't.)
This should naturally raise the question of why Dictionary doesn't crash when you subscript with a nonexistent key. The reason is that Dictionary's Index is not its Key. It has a little-used subscript that provides conformance to Collection (little-used in top-level code, but very commonly used inside of stdlib):
subscript(position: Dictionary<Key, Value>.Index) -> Dictionary.Element { get }
Array could have done this as well, having an Array.Index type that was independent of Int, and making the Int subscript return an Optional. In Swift 1.0, I opened a radar to request exactly that. The team argued that this would make common uses of Array too difficult and that programmers coming to Swift were used to the idea that out-of-range was a programming error (crash). Dictionary, on the other hand, is commonly accessed with nonexistent keys, so the Key subscript should be Optional. Several years of using Swift have convinced me they were right.
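To make the two Dictionary subscripts concrete, here is a minimal sketch. The dictionary contents and variable names are made up for illustration; index(forKey:) is the standard-library call that hands out a Dictionary.Index.

```swift
let ages = ["Alice": 30, "Bob": 25]   // illustrative data, not from the question

// Key subscript: a missing key is an ordinary case, so the result is Optional.
let missingAge: Int? = ages["Carol"]
print(missingAge as Any)              // nil

// Index subscript (the Collection requirement): an Index can only be obtained
// from the dictionary itself, so the result is a non-optional Element.
if let bobIndex = ages.index(forKey: "Bob") {
    let entry = ages[bobIndex]        // (key: String, value: Int)
    print(entry.key, entry.value)     // Bob 25
}
```

The Index-based subscript never needs to return nil because there is no way to construct a Dictionary.Index out of thin air, which is exactly the property Array gives up by using Int.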
In general you shouldn't subscript arrays unless you got the index from the array itself (e.g. using firstIndex(where:), which was spelled index(where:) before Swift 4.2). But many Cocoa patterns make it very natural to subscript (cellForRow(at:) being the most famous). Still, in more pure Swift code, subscripting with arbitrary Ints often suggests a design problem.
Instead you should often use Collection methods like first and first(where:), which return Optionals and are generally safer and clearer, and iterate over collections using for-in loops rather than subscripts.
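As a rough illustration of that advice, the sketch below reuses the players array from the question; the other variable names are invented for the example.

```swift
let players = ["Alice", "Bob", "Cindy", "Dan"]

// first(where:) returns an Optional, so "no match" is handled explicitly.
let firstWithC = players.first(where: { $0.hasPrefix("C") })
let firstWithZ = players.first(where: { $0.hasPrefix("Z") })

// An index obtained from the array itself is safe to subscript with.
var foundName: String? = nil
if let bobIndex = players.firstIndex(of: "Bob") {
    foundName = players[bobIndex]
}

// for-in visits every element without any index arithmetic.
var greetings: [String] = []
for player in players {
    greetings.append("Hello, \(player)")
}
```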

If you want to use a subscript without risking a crash, you can add this extension to your code:
extension Collection {
    subscript(safe index: Index) -> Element? {
        return indices.contains(index) ? self[index] : nil
    }
}
and then use it:
let array = [0, 1, 2]
let second = array[safe:1] //Optional(1)
let fourth = array[safe:3] //nil instead of crash

The behavior of first and index subscripting is different:
first is declared safely: if the array is empty it returns nil, otherwise it returns the first element wrapped in an Optional.
Index subscripting is unsafe, for legacy reasons: if the index is out of range (which is always the case for an empty array), the program traps with an index-out-of-range runtime error; otherwise it returns the (non-optional) element.

This is because with first, if the Array is empty, the value will be nil. That is why it is an optional. If it is not empty, the first element will be returned.
However, with a subscript (or index value), the result is not an optional. If the index is out of range (or the array is empty), your program will crash with the error
fatal error: Index out of range
Otherwise, it will return the requested element.

This is the default behavior of Array's first property. Array is generic over its Element type, and first returns that element as an optional:
public var first: Element? { get }
This property is available on Array, which is a struct rather than a class.
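A small sketch of the empty-array case that all of these answers describe; the variable names are made up for the example.

```swift
let noPlayers: [String] = []

// first is Element?, so the empty case shows up as nil rather than a crash.
let nobody = noPlayers.first
let fallbackName = noPlayers.first ?? "nobody"

// The subscript returns a plain Element, so check for emptiness first.
// Evaluating noPlayers[0] on this empty array would trap at runtime.
let firstOrNil: String? = noPlayers.isEmpty ? nil : noPlayers[0]
```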

Related

What did Swift do to avoid that Collection was mutated while being enumerated? [duplicate]

I'm experimenting with iteration on an array using a for .. in .. loop. My question is related to the case where the collection is changed within the loop body.
It seems that the iteration is safe, even if the list shrinks in the meantime. The for iteration variables successively take the values of the (indexes and) elements that were in the array at the start of the loop, despite the changes made on the fly. Example:
var slist = [ "AA", "BC", "DE", "FG" ]
for (i, st) in slist.enumerated() { // for st in slist gives a similar result
    print ("Index \(i): \(st)")
    if st == "AA" { // at one iteration change completely the list
        print (" --> check 0: \(slist[0]), and 2: \(slist[2])")
        slist.append ("KLM")
        slist.insert(st+"XX", at:0) // shift the elements in the array
        slist[2]="bc" // replace some elements to come
        print (" --> check again 0: \(slist[0]), and 2: \(slist[2])")
        slist.remove(at:3)
        slist.remove(at:3)
        slist.remove(at:1) // makes list shorter
    }
}
print (slist)
This works very well, the iteration being made on the values [ "AA", "BC", "DE", "FG" ] even if after the first iteration the array is completely changed to ["AAXX", "bc", "KLM"]
I wanted to know if I can safely rely on this behavior. Unfortunately, the language guide does not say anything about iterating over a collection while the collection is modified, and the for .. in section doesn't address this question either. So:
Can I safely rely on a guarantee about this iteration behavior provided in the language specifications?
Or am I simply lucky with the current version of Swift 5.4? In this case, is there any clue in the language specification that one cannot take it for granted? And is there a performance overhead for this iteration behavior (e.g. some copy) compared to indexed iteration?
The documentation for IteratorProtocol says "whenever you use a for-in loop with an array, set, or any other collection or sequence, you’re using that type’s iterator." So we are guaranteed that a for-in loop is going to use .makeIterator() and .next(), which are defined most generally on Sequence and IteratorProtocol respectively.
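As a rough picture of what that means, a for-in loop can be written out by hand with makeIterator() and next(). This is a sketch of the idea rather than the compiler's exact expansion, and the array contents are made up.

```swift
let letters = ["a", "b", "c"]

// Roughly what `for letter in letters { collected.append(letter) }` does:
var iterator = letters.makeIterator()   // the iterator is created once, up front
var collected: [String] = []
while let letter = iterator.next() {    // next() returns nil when iteration ends
    collected.append(letter)
}
```

The point relevant to the question is that the iterator is created before the first pass through the body, so whatever it captured at that moment is what the loop walks over.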
The documentation for Sequence says that "the Sequence protocol makes no requirement on conforming types regarding whether they will be destructively consumed by iteration." As a consequence, this means that an iterator for a Sequence is not required to make a copy, and so I do not think that modifying a sequence while iterating over it is, in general, safe.
This same caveat does not occur in the documentation for Collection, but I also don't think there is any guarantee that the iterator makes a copy, and so I do not think that modifying a collection while iterating over it is, in general, safe.
But, most collection types in Swift are structs with value semantics or copy-on-write semantics. I'm not really sure where the documentation for this is, but this link does say that "in Swift, Array, String, and Dictionary are all value types... You don’t need to do anything special — such as making an explicit copy — to prevent other code from modifying that data behind your back." In particular, this means that for Array, .makeIterator() cannot hold a reference to your array because the iterator for Array does not have to "do anything special" to prevent other code (i.e. your code) from modifying the data it holds.
We can explore this in more detail. The Iterator type of Array is defined as IndexingIterator<Array<Element>>. The documentation for IndexingIterator says that it is the default implementation of the iterator for collections, so we can assume that most collections will use it. We can see in the source code for IndexingIterator that it holds a copy of its collection
@frozen
public struct IndexingIterator<Elements: Collection> {
    @usableFromInline
    internal let _elements: Elements
    @usableFromInline
    internal var _position: Elements.Index
    @inlinable
    @inline(__always)
    /// Creates an iterator over the given collection.
    public /// @testable
    init(_elements: Elements) {
        self._elements = _elements
        self._position = _elements.startIndex
    }
    ...
}
and that the default .makeIterator() simply creates this copy.
extension Collection where Iterator == IndexingIterator<Self> {
    /// Returns an iterator over the elements of the collection.
    @inlinable // trivial-implementation
    @inline(__always)
    public __consuming func makeIterator() -> IndexingIterator<Self> {
        return IndexingIterator(_elements: self)
    }
}
Although you might not want to trust this source code, the documentation for library evolution claims that "the @inlinable attribute is a promise from the library developer that the current definition of a function will remain correct when used with future versions of the library", and @frozen also means that the members of IndexingIterator cannot change.
Altogether, this means that any collection type with value semantics and an IndexingIterator as its Iterator must make a copy when used in a for-in loop (at least until the next ABI break, which should be a long way off). Even then, I don't think Apple is likely to change this behavior.
In Conclusion
I don't know of any place that it is explicitly spelled out in the docs "you can modify an array while you iterate over it, and the iteration will proceed as if you made a copy" but that's also the kind of language that probably shouldn't be written down as writing such code could definitely confuse a beginner.
However, there is enough documentation lying around which says that a for-in loop just calls .makeIterator(), and that for any collection with value semantics and the default iterator type (for example, Array), .makeIterator() makes a copy and so cannot be influenced by code inside the loop. Further, because Array and some other types like Set and Dictionary are copy-on-write, modifying these collections inside a loop incurs a one-time copy penalty, because the body of the loop does not hold a unique reference to the storage (the iterator holds a reference to it as well). This is the exact same penalty that modifying the collection outside the loop will have if you don’t have a unique reference to the storage.
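A stripped-down version of the experiment in the question shows the same effect for Array. The names here are invented for the sketch.

```swift
var numbers = [1, 2, 3]
var visited: [Int] = []

for n in numbers {
    visited.append(n)
    numbers.removeAll()   // mutates `numbers`, not the iterator's copy
}

print(visited)  // [1, 2, 3]
print(numbers)  // []
```

The first removeAll() is where the copy-on-write cost described above is paid, since the iterator still refers to the original storage at that point.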
Without these assumptions, you aren't guaranteed safety, but you might have it anyway in some circumstances.
Edit:
I just realized we can create some cases where this is unsafe for sequences.
import Foundation
/// This is clearly fine and works as expected.
print("Test normal")
for _ in 0...10 {
    let x: NSMutableArray = [0,1,2,3]
    for i in x {
        print(i)
    }
}
/// This is also okay. Reassigning `x` does not mutate the reference that the iterator holds.
print("Test reassignment")
for _ in 0...10 {
    var x: NSMutableArray = [0,1,2,3]
    for i in x {
        x = []
        print(i)
    }
}
/// This crashes. The iterator assumes that the last index it used is still valid, but after removing the objects, there are no valid indices.
print("Test removal")
for _ in 0...10 {
    let x: NSMutableArray = [0,1,2,3]
    for i in x {
        x.removeAllObjects()
        print(i)
    }
}
/// This also crashes. `.enumerated()` gets a reference to `x` which it expects will not be modified behind its back.
print("Test removal enumerated")
for _ in 0...10 {
    let x: NSMutableArray = [0,1,2,3]
    for i in x.enumerated() {
        x.removeAllObjects()
        print(i)
    }
}
The fact that this is an NSMutableArray is important because this type has reference semantics. Since NSMutableArray conforms to Sequence, we know that mutating a sequence while iterating over it is not safe, even when using .enumerated().
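The same effect can be reproduced without Foundation. The sketch below uses a hypothetical class, SharedList, whose iterator keeps a reference to the live object instead of a copy; it is not how NSMutableArray is implemented, only an illustration of a Sequence with reference semantics.

```swift
// Hypothetical reference-type sequence, for illustration only.
final class SharedList: Sequence {
    var items: [Int]
    init(_ items: [Int]) { self.items = items }

    struct Iterator: IteratorProtocol {
        let list: SharedList      // a reference to the live list, not a copy
        var position = 0
        init(list: SharedList) { self.list = list }

        mutating func next() -> Int? {
            guard position < list.items.count else { return nil }
            defer { position += 1 }
            return list.items[position]
        }
    }

    func makeIterator() -> Iterator { return Iterator(list: self) }
}

let shared = SharedList([0, 1, 2, 3])
var seen: [Int] = []
for value in shared {
    seen.append(value)
    shared.items.removeAll()   // visible to the iterator: the loop ends early
}
print(seen)  // [0]
```

This iterator bounds-checks on every step, so the mutation merely cuts the loop short. An iterator that trusts a previously computed index or count would instead read invalid state, which is the kind of failure the NSMutableArray tests above run into.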
Calling slist.enumerated() creates a new instance of EnumeratedSequence<[String]>:
"To create an instance of EnumeratedSequence, call enumerated() on a sequence or collection. The following example enumerates the elements of an array." (reference)
If you remove the .enumerated() call, the result is the same: every st still has the old value. This occurs because the for-in loop generates a new instance of IndexingIterator<[String]>.
"Whenever you use a for-in loop with an array, set, or any other collection or sequence, you’re using that type’s iterator. Swift uses a sequence’s or collection’s iterator internally to enable the for-in loop language construct." (reference)
About the questions:
You would be able to remove all the elements and still perform the loop safely, because a new instance is generated to perform the iterations.
Swift always uses the iterator internally to enable for-in, so there is no alternative mechanism whose overhead could be compared. Naturally, the larger the array, the more the performance is affected.

Sense behind the empty subscript?

Why does this even compile? What is the need for an empty subscript which obviously behaves like a function without parameters?
extension Array {
    subscript() -> Int {
        return 0
    }
}
let array = [1,3,2]
print(array[]) // "0"
Note that it can also be used for an assignment, so it behaves like a computed property named [].
Why does this even compile
It compiles because you defined an empty-subscript extension to Array:
extension Array {
    subscript() -> Int {
        return 0
    }
}
Array already has a subscript defined, whereby you supply an index number and get back the element at that index. This extension adds another subscript, whereby you supply nothing and get back the number zero.
Without that extension, this would not compile:
let array = [1,3,2]
print(array[])
What is the need for an empty subscript which obviously behaves like a function without parameters
There's no "need"; it's a convenience. You could, after all, make exactly the same "objection" to subscripts in general! They do nothing that you cannot accomplish by methods. In fact, such methods exist; the subscript notation is merely a pleasant piece of syntactic sugar.
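The question notes that an empty subscript can also be assigned to. A minimal sketch of that, using a made-up Counter type rather than an Array extension:

```swift
// Hypothetical type, used only to show a get/set subscript with no parameters.
struct Counter {
    var value = 0

    subscript() -> Int {
        get { return value }
        set { value = newValue }
    }
}

var counter = Counter()
counter[] = 5          // calls the setter
let current = counter[] // calls the getter
print(current)         // 5
```

Written this way it does behave like a computed property spelled [], which is why a named property or method is usually the clearer choice.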

Why does not Dictionary adopt MutableCollectionType protocol?

While implementing a custom collection type (and therefore making it to adhere to CollectionType protocol) I came to wonder why MutableCollectionType is not adopted by Dictionary type?
From the documentation for MutableCollectionType:
A collection that supports subscript assignment.
For any instance a of a type conforming to MutableCollectionType:
a[i] = x
let y = a[i]
is equivalent to:
a[i] = x
let y = x
Therefore, it would seem "logical" that Dictionary also adopts this protocol. However, after checking out header files as well as docs, it seems that only Array and related types do that.
What's so special about MutableCollectionType, or about Dictionary, or both for that matter? Should my dictionary-like custom collection type also avoid adopting MutableCollectionType for some reason?
A glance through the protocol reference shows that it has methods like sort and partition. It also has an associated type called SubSequence. These are meaningless for dictionaries, because there is no order within a dictionary.
From the headers:
Whereas an arbitrary sequence may be consumed as it is traversed, a collection is multi-pass: any element may be revisited merely by saving its index.
That makes no sense for a dictionary, as a dictionary is unordered. Just because the entry keyed by "howdy" is at index 2 right now does not mean it will be at index 2 one minute from now. In particular, it makes no sense to say "insert this key at index 2" - it is the keys and the internal hashing that provide the order. The indexes have no persistent life of their own. Thus, it is a collection (it has indexes), but not a mutable collection (you can't write into it by index).
To understand the declaration of MutableCollectionType protocol, you first need to know a concept called subscript.
When you write “let y = dic[key]”, Swift is calling a method called subscript getter:
subscript (key: Key) -> Value? { get }
And when you write “dic[key] = x”, Swift is calling a method called subscript setter:
subscript (key: Key) -> Value? { set }
Now let's look at the MutableCollectionType protocol. Dictionary does not conform to MutableCollectionType because the required methods of this protocol are not implemented in Dictionary.
One of the required methods is
public subscript (position: Self.Index) -> Self.Generator.Element { get set }
This subscript method is not the same as the above two we use every day. The type of position is Self.Index, which is DictionaryIndex<Key, Value> for the Dictionary type, and the return type Self.Generator.Element is (Key, Value). I think this index type, DictionaryIndex, is related to the hash table implementation and can be used to refer directly to a hash table element. When you use the setter of the subscript you will write something like
dic[index] = (key, value)
It certainly makes no sense to replace a hash map element with another key value pair. This subscript setter is never implemented by Dictionary, so it does not conform to MutableCollectionType protocol.
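In current Swift the protocol is named MutableCollection (the Type suffix was dropped in Swift 3), and the distinction can be seen with a small generic function. The helper replaceFirst and the sample data are hypothetical, written only for this sketch.

```swift
// Hypothetical helper: overwrites the first element through the Index subscript setter.
func replaceFirst<C: MutableCollection>(in collection: inout C, with element: C.Element) {
    guard !collection.isEmpty else { return }
    collection[collection.startIndex] = element
}

var scores = [10, 20, 30]
replaceFirst(in: &scores, with: 99)          // fine: Array is a MutableCollection

var lookup = ["a": 1]
// replaceFirst(in: &lookup, with: (key: "b", value: 2))
// The line above does not compile: Dictionary is not a MutableCollection.

lookup["a"] = 5                              // writing goes through the Key subscript
let firstEntry = lookup[lookup.startIndex]   // the Index subscript is read-only
print(scores, firstEntry)
```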

find() using Functional Programming

I'd like to create a generic find() typically used in functional programming. In functional programming you don't work with array indices and for loops. You filter. The way it works is that if you have a list of say
["apple", "banana", "cherry"]
and you want to find banana then you assign the array indices to the list elements by creating tuples
[(1, "apple"), (2, "banana"), (3, "cherry")]
Now you can filter down to "banana" and return the index value.
I was trying to create a generic function for this but I get an error. What's wrong with this syntax?
func findInGenericIndexedList<T>(indexedList: [(index: Int, value: T)], element: (index: Int, value: T)) -> Int? {
    let found = indexedList.filter { // ERROR: Cannot invoke 'filter' with an argument list of type '((_) -> _)'
        element.value === $0.value
    }
    if let definiteFound = found.first {
        return definiteFound.index
    }
    return nil
}
UPDATE 1: I would like to use the above solution as opposed to using find() (will be deprecated) or in Swift 2.0 indexOf() because I'm trying to follow the Functional Programming paradigm, relying on general functions and not class methods.
The minimum change required to make this work would be to make T conform to Equatable and use the == operator.
func findInGenericIndexedList<T:Equatable>(indexedList: [(index: Int, value: T)], element: (index: Int, value: T)) -> Int? {
    let found = indexedList.filter {
        element.value == $0.value
    }
    if let definiteFound = found.first {
        return definiteFound.index
    }
    return nil
}
It doesn't really make sense to use === here because you are usually going to be applying this to value types (especially if you are following functional paradigms), to which === does not apply.
Beyond this I spent some time thinking about the problem and here is what I would do:
extension Array where Element : Equatable {
    func find(element:Array.Generator.Element) -> Int? {
        let indexedList = lazy(self.enumerate())
        let found = indexedList.filter {
            element == $1
        }
        let definiteFound = found.prefix(1)
        return definiteFound.generate().next()?.index
    }
}
Protocol extension on Array because it makes the syntax neater, lazy sequence to avoid checking every element, 0 indexed.
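The extension above uses Swift 2-era spellings (enumerate(), generate(), Generator). In current Swift the same idea could be sketched as follows; firstOffset(of:) is a hypothetical name chosen for this example, and the standard library's own firstIndex(of:) already covers the common case.

```swift
extension Array where Element: Equatable {
    /// Hypothetical helper: pairs each element with its offset and returns
    /// the offset of the first pair whose element equals `element`.
    func firstOffset(of element: Element) -> Int? {
        // first(where:) stops at the first match, so not every element is checked.
        return enumerated().first(where: { $0.element == element })?.offset
    }
}

let fruits = ["apple", "banana", "cherry", "banana"]
let bananaOffset = fruits.firstOffset(of: "banana")
let kiwiOffset = fruits.firstOffset(of: "kiwi")
print(bananaOffset as Any, kiwiOffset as Any)
```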
Here are a couple of thoughts.
I would still prefer to use the identity operator '===', because in my array I may have multiple items of the same value.
The identity operator === only works for reference types, like classes. It will never work for value types, like Strings or Ints, or structs, etc. You might take a look at the difference between value types and reference types, especially if you are interested in functional programming, which eschews reference types almost completely. When you are working with value types, there is only equality (==) - there is no identity. Two instances of the String "bananas" will never refer to the same identical object. They will always refer to two different Strings, though their values might be equal.
I want to delete the exact item that I passed to the function. Is it impossible to do this?
If you are working with value types, like Strings, then yes, it is impossible. There is no such thing as two different Strings that are the exact same item. Two Strings are always different objects, for the reasons stated above.
Note that if you work only with classes, and not value types, then you could use the === operator, but this would defeat much of what you are trying to do.
What this boils down to is that if you have an array of (index, value) tuples that looks like this:
[(0, "bananas"), (1, "apples"), (2, "oranges"), (3, "bananas")]
And you write a function that looks for tuples where the value is "bananas", you have a couple of choices. You can filter it and look for the first tuple in the array that has the value "bananas" and return the index of that tuple. In the above case it would return 0. Or, you could return all of the indexes in the form of an array, like this: [0, 3]. Or I suppose you could return some other arbitrary subset of the results, like the last index, or the first-and-last indexes, etc., but those all seem a little silly. The Swift standard library opts for returning the index of the first item that matches the search criteria for precisely this reason. None of the other options make a whole lot of sense.
But putting it back into the context of your question, none of the tuples that you find with the value of "bananas" are going to be the exact (identical) instance of "bananas" that you passed in to your search function. No two value types are ever identical. They may be equal, but they are never identical.
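For contrast, here is a minimal sketch of what identity looks like when the elements are class instances. The Fruit class is hypothetical and exists only to show === next to ==.

```swift
// Hypothetical reference type; two instances can hold equal values
// and still be distinct objects.
final class Fruit {
    let name: String
    init(_ name: String) { self.name = name }
}

let first = Fruit("bananas")
let second = Fruit("bananas")
let basket = [first, second]

let sameValue = first.name == second.name    // equality of the stored values
let sameObject = first === second            // identity of the two instances

// Searching by identity finds the exact instance that was passed in.
let positionOfSecond = basket.firstIndex(where: { $0 === second })
print(sameValue, sameObject, positionOfSecond as Any)
```

With Strings or other value types there is no counterpart to the last search, which is the point made above: only equality is available.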
One more note - mainly for clarification of what you are even trying to do. In your first attempt at writing this function, you appear to already know the index of the item you are searching for. You pass it in to the function as a parameter, right here:
// ----------vvvvv
func findInGenericIndexedList<T>(indexedList: [(index: Int, value: T)], element: (index: Int, value: T)) -> Int?
Just out of curiosity, is this a typo? Or do you actually know the index of the tuple that you are searching for? Because if you already know what it is, well... you don't need to search for it :)