Why isn't a Swift tuple considered a collection type? - swift

In Swift, why isn't a tuple considered a collection type?
Of course this is fussbudget territory, but I find a certain amount of fussing helps me understand, retain, and organize what I'm learning.

I don't have inside knowledge of the motivations, but I can hopefully present some deeper understanding.
Type Safety
One of the primary goals of Swift is to enforce programming best practices to minimize human mistakes. One of these best practices is to use specific types whenever we know them. In Swift, every variable and constant has a single static type (declared or inferred), and a collection should hold, say, Strings rather than AnyObjects if you know that it will be storing Strings.
Notably, every collection type in Swift may only hold one type of element. This includes Array, Dictionary, and Set.
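As a rough illustration of the point about homogeneous collections, here is a minimal sketch. The names (names, mixed) are made up for the example and do not come from the question.

```swift
// A typed array: the compiler knows every element is a String.
let names: [String] = ["Ann", "Bob"]
let shout = names.map { $0.uppercased() }   // no casting needed

// An untyped array: each element has to be cast back before use.
let mixed: [Any] = ["Ann", 42]
let strings = mixed.compactMap { $0 as? String }

print(shout)
print(strings)
```

The typed array lets the compiler reject a non-String element at compile time, while the [Any] version only finds out at runtime which casts succeed.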
Tuples are Compound Types
In Swift, there are two types of types: Named Types and Compound Types. Most types, including collection types, are named. Compound types, on the other hand, contain multiple named types in unique combinations. There are only two Compound Types: Function Types and Tuple Types. Tuple Types are different from collections in that:
Multiple Named Types can be bundled together in a Tuple Type without using AnyObject.
Each position can be given a textual label.
We can always anticipate how many values a tuple will be holding because we declared each position.
For more information, see the Types chapter of The Swift Programming Language.
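The three differences listed above can be sketched as follows. The tuple and its labels (name, price) are hypothetical and chosen only for illustration.

```swift
// Two different named types bundled together, each position labeled.
let item: (name: String, price: Double) = (name: "Pen", price: 1.5)

// Access by label or by position; both are checked at compile time.
let byLabel = item.name
let byPosition = item.1

// item.2 would not compile: the tuple type declares exactly two positions.
print(byLabel, byPosition)
```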
Tuples Represent Objects
We typically don't think of a collection type as an object in itself. Instead, we think of it as a collection of objects. A tuple's labeled positions and multi-type functionality enable it to function more like an object than any collection. (Almost like a primitive Struct)
For example, you might consider an HTTP error to be a tuple with an error code (Int) and a description (String).
Tuples are often used as primitive approaches to temporary objects, such as returning multiple values from a function. They can be quickly dissected without indexing.
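A small sketch of the HTTP error idea, assuming a made-up function fetchStatus() that stands in for any function returning two values:

```swift
// A hypothetical function that returns two values at once as a tuple.
func fetchStatus() -> (code: Int, description: String) {
    return (code: 404, description: "Not Found")
}

// Destructure the result into two constants without any indexing.
let (code, description) = fetchStatus()
print("\(code): \(description)")
```

The caller never has to index into the result; the two values are bound to names in one step.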
Additionally, each Tuple type has its own explicit type. For example, (Int, Int) is an entirely different type from (Int, Int, Int), which is an entirely different type from (Double, Double, Double).
For more on this, see "Tuples" in The Basics chapter of The Swift Programming Language.
Tuples are Used in Fundamental Language Syntax
When we think of collection types, we think again of collections. Tuples, however, are used in fundamental places in the language that make Compound Type a more fitting title for them. For example, the Function Type is a Compound Type that is used anytime you specify a function or closure. Conceptually, it pairs a tuple of parameters with a tuple of return values. Even Void is just a typealias for (), an empty tuple.
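To make the function type and Void remarks concrete, here is a minimal sketch. The closures (add, logA, logB) are illustrative names, not part of any API.

```swift
// A function type is written as a parameter list and a return type.
let add: (Int, Int) -> Int = { $0 + $1 }

// Void is a typealias for the empty tuple, so these two types are the same.
let logA: (String) -> Void = { print($0) }
let logB: (String) -> () = logA

// The only value of type Void is the empty tuple itself.
let nothing: Void = ()

print(add(2, 3))
logB("hello")
```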
Additionally, tuples are used in language syntax to temporarily bind values. For example, in a for-in loop you might use a tuple to iterate over a dictionary:
let dict = [1: "A", 2: "B", 3: "C"]
for (_, letter) in dict {
    print(letter) // iteration order of a dictionary is not guaranteed
}
Here we use a tuple to iterate over only the values of the dictionary and ignore the keys.
We can do the same in Switch statements (excerpt from The Swift Programming Language):
let somePoint = (1, 1)
switch somePoint {
case (0, 0):
    print("(0, 0) is at the origin")
case (_, 0):
    print("(\(somePoint.0), 0) is on the x-axis")
case (0, _):
    print("(0, \(somePoint.1)) is on the y-axis")
case (-2...2, -2...2):
    print("(\(somePoint.0), \(somePoint.1)) is inside the box")
default:
    print("(\(somePoint.0), \(somePoint.1)) is outside of the box")
}
// prints "(1, 1) is inside the box"
Because tuples are used in such fundamental places in language syntax, it wouldn't feel right for them to be bundled together with collection types.
SequenceType & CollectionType
One last thing to note is that a fundamental feature of CollectionType, which inherits much of this from SequenceType, is the ability to provide a safe Generator for iteration (since Swift 3 these are named Collection, Sequence, and Iterator). For example, a for-in loop is possible because collections define a way to get the next item. Because a tuple's elements need not share one type, and their number is fixed once the tuple is declared, using a tuple on the right of a for-in loop (instead of a traditional collection) would be less than intuitive.
If using it in a for-in loop, something designed to thrive with collections, seems less than intuitive for a tuple, then a tuple may deserve a different category.
For details on CollectionType, check out SwiftDoc.org. Note that a Generator providing iteration is required. Iterating a tuple would not be type safe and would impose many unnecessary complexities, but making a tuple a collection is an interesting concept. It just might be too fundamental for that.
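One way to see why iterating a tuple would not be type safe is to walk its elements with reflection. This is only a sketch: using the standard library's Mirror here is my own illustration, not something the answer proposes.

```swift
// Reflection is one way to walk a tuple's elements at runtime.
let person = ("Steve", 22)
let mirror = Mirror(reflecting: person)

// Each child's value is typed as Any, so the static types are lost.
for child in mirror.children {
    print(type(of: child.value), child.value)
}

let elementCount = mirror.children.count
```

Because every element comes back as Any, the loop body has to cast before doing anything useful, which is exactly the guarantee a typed collection is meant to give you for free.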

Related

Swift Disadvantages Array of Any

If I store a Tuple
var person = ("Steve", 22)
I cannot add more data easily into the structure.
However, if I use an array of Any
var steve: [Any] = ["Steve", 22]
I can easily add the elements.
Surely there are no real advantages to using a Tuple and we just always use an array of Any?
Any is weakly typed, so you lose all of the static type checking guarantees that you get with strong types, including tuples. In addition, you cannot index out of bounds at run time on a tuple as you can with an array: the compiler knows how many components there are, so an invalid position fails to compile. Since a tuple is basically an anonymous struct, you can also name the components to make them more meaningful, which you cannot do with an array. You also pay a performance penalty for the array of Any, since each value has to be boxed in an Any, whereas a tuple is just a struct and its components are not boxed.
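The trade-off described above can be sketched side by side. The labels name and age are assumptions added for the example; the values come from the question.

```swift
// Tuple: the compiler knows the type of every position.
let person = (name: "Steve", age: 22)
let nextAge = person.age + 1          // plain Int arithmetic

// Array of Any: every read needs a runtime cast that can fail.
let steve: [Any] = ["Steve", 22]
let castAge = steve[1] as? Int        // an optional Int
let wrongCast = steve[0] as? Int      // nil, discovered only at runtime

// steve[2] would compile but trap at runtime; person.2 does not compile.
print(nextAge, castAge as Any, wrongCast as Any)
```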

Convert tuple to array in Scala

What is the best way to convert a tuple into an array in Scala? Here "best" means in as few lines of code as possible. I was shocked to search Google and StackOverflow only to find nothing on this topic, which seems like it should be trivial and common. Lists have a toArray function; why don't tuples?
Use productIterator, immediately followed by toArray:
(42, 3.14, "hello", true).productIterator.toArray
gives:
res0: Array[Any] = Array(42, 3.14, hello, true)
The type of the result shows the main reason why it's rarely used: in tuples, the types of the elements can be heterogeneous, in arrays they must be homogeneous, so that often too much type information is lost during this conversion. If you want to do this, then you probably shouldn't have stored your information in tuples in the first place.
There is almost nothing you can (safely) do with an Array[Any], except printing it out, or converting it to an even more degenerate Set[Any]. Any of the following would be a better alternative:
Lists of case classes belonging to a common sealed trait,
shapeless HLists,
a carefully chosen base class with a bit of inheritance,
or something that at least keeps some kind of schema at runtime (like Apache Spark Datasets).
In the somewhat less likely case that the elements of the "tuples" that you are processing frequently turn out to have an informative least upper bound type, it might be because you aren't working with plain tuples, but with some kind of traversable data structure that puts restrictions on the number of substructures in the nodes. In this case, you should consider implementing something like a Traverse interface for the structure, instead of messing with some "tuples" manually.

Tuple vs. Object in Swift

I have read about the differences in Tuples and Dictionaries / Arrays, but I have yet to come across a post on Stack Overflow explaining the difference between a Tuple and an Object in Swift.
The reason I ask is that from experience, it seems that a Tuple could be interchangeable with an Object in Swift in many circumstances (especially in those where the object only holds other objects or data), but could lead to inconsistent / messy code.
In Swift, is there a time to use a Tuple and a time to use a basic Object based on performance or coding methodologies?
As vadian notes, Apple's advice is that tuples only be used for temporary values. This plays out in practice. If you need to do almost anything non-trivial with a data structure, including storing it in a property, you probably do not want a tuple. They're very limited.
I'd avoid the term "object" in this discussion. That's a vague, descriptive term that doesn't cleanly map to any particular data structure. The correct way to think of a tuple is as being in contrast to a struct. In principle, a tuple is just an anonymous struct, but in Swift a tuple is dramatically less flexible than a struct. Most significantly, you cannot add extensions to a tuple, and adding extensions is a core part of Swift programming.
Basically, about the time you're thinking that you need to label the fields of the tuple, you probably should be using a struct instead. Types as simple as "a point" are modeled as structs, not tuples.
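A minimal sketch of the struct alternative, assuming a hypothetical Point type and a made-up method distanceSquared(to:), to show what an extension adds that a tuple cannot have:

```swift
// A point modeled as a struct, as the answer suggests.
struct Point {
    var x: Double
    var y: Double
}

// Structs can be extended with methods; tuples cannot.
extension Point {
    func distanceSquared(to other: Point) -> Double {
        let dx = x - other.x
        let dy = y - other.y
        return dx * dx + dy * dy
    }
}

let origin = Point(x: 0, y: 0)
let p = Point(x: 3, y: 4)
print(origin.distanceSquared(to: p))   // 25.0
```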
So when would you ever use a tuple? Consider the following (non-existent) method on Collection:
extension Collection {
    func headTail() -> (Element?, SubSequence) {
        return (first, dropFirst())
    }
}
This is a good use of a tuple. It would be unhelpful to invent a special struct just to return this value, and callers will almost always want to destructure this anyway like:
let (head, tail) = list.headTail()
This is one thing that tuples can do that structs cannot (at least today; there is ongoing discussion of adding struct destructuring and pattern matching to Swift).
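For a version that can be pasted and run, here is a small sketch. It repeats the hypothetical headTail() so the block is self-contained, and the sample array is an assumption added for illustration.

```swift
// The hypothetical headTail() from the answer, repeated here so the
// sketch compiles on its own.
extension Collection {
    func headTail() -> (Element?, SubSequence) {
        return (first, dropFirst())
    }
}

let list = [1, 2, 3]

// Destructure the returned tuple directly into two constants.
// head is an Int? and tail is an ArraySlice<Int>.
let (head, tail) = list.headTail()
print(head as Any, Array(tail))
```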
In Swift, a tuple is a Compound Type that holds several values together; those values are instances of Swift's Named Types, for example classes, structs, and enums.
By analogy, instances of these Named Types are like minerals of chemical elements (carbon, calcium), and a tuple is just a physical mixture of those minerals (e.g. a pack of 1 part calcium ore and 3 parts carbon ore). You can easily carry this packed tuple around and pass it to a "heat or press" method that returns the "limestone" your app uses in construction.

Ambiguous use of 'lazy'

I have no idea why this example is ambiguous. (My apologies for not adding the code here, it's simply too long.)
I have added prefix(_ maxLength) as an overload to LazyDropWhileBidirectionalCollection. subscript(position) is defined on LazyPrefixCollection. The following code from the above example shouldn't be ambiguous, yet it is:
print([0, 1, 2].lazy.drop(while: {_ in false}).prefix(2)[0]) // Ambiguous use of 'lazy'
It is my understanding that an overload that's higher up in the protocol hierarchy will get used.
According to the compiler it can't choose between two types; namely LazyRandomAccessCollection and LazySequence. (Which doesn't make sense since subscript(position) is not a method of LazySequence.) LazyRandomAccessCollection would be the logical choice here.
If I remove the subscript, it works:
print(Array([0, 1, 2].lazy.drop(while: {_ in false}).prefix(2))) // [0, 1]
What could be the issue?
The trail here is just too complicated and ambiguous. You can see this by dropping elements. In particular, drop the last subscript:
let z = [0, 1, 2].lazy.drop(while: {_ in false}).prefix(2)
In this configuration, the compiler wants to type z as LazyPrefixCollection<LazyDropWhileBidirectionalCollection<[Int]>>. But that isn't indexable by integers. I know it feels like it should be, but it isn't provable by the current compiler. (see below) So your [0] fails. And backtracking isn't powerful enough to get back out of this crazy maze. There are just too many overloads with different return types, and the compiler doesn't know which one you want.
But this particular case is trivially fixed:
print([0, 1, 2].lazy.drop(while: {_ in false}).prefix(2).first!)
That said, I would absolutely avoid pushing the compiler this hard. This is all too clever for Swift today. In particular overloads that return different types are very often a bad idea in Swift. When they're simple, yes, you can get away with it. But when you start layering them on, the compiler doesn't have a strong enough proof engine to resolve it. (That said, if we studied this long enough, I'm betting it actually is ambiguous somehow, but the diagnostic is misleading. That's a very common situation when you get into overly-clever Swift.)
Now that you describe it (in the comments), the reasoning is straightforward.
LazyDropWhileCollection can't have an integer index. Index subscripting is required to be O(1). That's the meaning of the Index subscript versus other subscripts. (The Index subscript must also return the Element type or crash; it can't return an Element?. That's why there's a DictionaryIndex that's separate from Key.)
Since the collection is lazy and has an arbitrary number of missing elements, looking up any particular integer "count" (first, second, etc.) is O(n). It's not possible to know what the 100th element is without walking through at least 100 elements. To be a collection, its O(1) index has to be in a form that can only be created by having previously walked the sequence. It can't be Int.
This is important because when you write code like:
for i in 1...1000 { print(xs[i]) }
you expect that to be on the order of 1000 "steps," but if this collection had an integer index, it would be on the order of 1 million steps. By wrapping the index, they prevent you from writing that code in the first place.
This is especially important in highly generic languages like Swift where layers of general-purpose algorithms can easily cascade an unexpected O(n) operation into completely unworkable performance (by "unworkable" I mean things that you expected to take milliseconds taking minutes or more).
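The cost argument can be made visible with the generic index API. This is only a sketch under one assumption worth stating loudly: the concrete index type of lazy wrappers has varied between Swift releases, so the code deliberately avoids integer subscripts and uses only index(_:offsetBy:), which every Collection provides.

```swift
// A lazy view that skips leading elements; nothing is evaluated yet.
let xs = [10, 20, 30, 40].lazy.drop(while: { $0 < 20 })

// Reaching the n-th element means advancing an index from startIndex,
// which makes the cost of the walk explicit at the call site.
let second = xs[xs.index(xs.startIndex, offsetBy: 1)]

print(second)   // 30
```

Writing the offset out this way keeps the O(n) step in plain sight instead of hiding it behind something that looks like a constant-time xs[1].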
Change the last row to this:
let x = [0, 1, 2]
let lazyX: LazySequence = x.lazy
let lazyX2: LazyRandomAccessCollection = x.lazy
let lazyX3: LazyBidirectionalCollection = x.lazy
let lazyX4: LazyCollection = x.lazy
print(lazyX.drop(while: {_ in false}).prefix(2)[0])
Notice that the array has four different lazy overloads, so you will have to be explicit about which one you want.

Scala - encapsulating data in objects

Motivations
This question is about working with Lists of data in Scala, and about resorting to either tuples or class objects for holding data. Perhaps some of my assumptions are wrong, so here it goes.
My current approach
As I understand, tuples do not afford the possibility of elegantly addressing their elements beyond the provided ._1, ._2, etc. I can use them, but code will be a bit unpleasant wherever data is extracted far from the lines of code that had defined it.
Also, as I understand, a Scala Map can only use a single type declaration for its values, so it can't vary the type of its values except through type inheritance. (On the latter point, using a type hierarchy to get "diversity" in Map values may seem very artificial unless a class hierarchy fits the intuitive "model" to begin with.)
So, when I need to have lists where each element contains two or more named data entities, e.g. as below one of type String and one of type List, each accessible through an intelligible name, I resort to:
case class Foo (name1: String, name2: List[String])
val foos: List[Foo] = ...
Then I can later access instances of the list using .name1 and .name2.
Shortcomings and problems I see here
When the list is very large, should I assume this is less performant or more memory consuming than using a tuple as the List's type? Alternatively, is there a different elegant way of accomplishing struct semantics in Scala?
In terms of performance, I don't think there is going to be any distinction between a tuple and an instance of a case class. In fact, a tuple is an instance of a case class.
Secondly, if you're looking for another, more readable way to get the data out of the tuple, I suggest you consider pattern matching:
val (name1, name2) = ("first", List("second", "third"))