Remove duplicate class instances based on object identity - swift

I'm writing a component for caching instances of classes. The classes are not per se Comparable, Hashable or Equatable. If they were, the semantics of the respective operations would not necessarily serve our purposes, so let's pretend we cannot use those protocols.
Objects can be cached w.r.t. multiple keys. So when asking the cache for a list of all cached objects, I need to remove duplicates from the value set of the underlying dictionary -- with respect to object identity.
Obviously, this does the job:
var result: [C] = []
for c in dict.values {
if !result.contains(where: { (rc: C) in rc === c }) {
result.append(c)
}
}
return result
However, this has quadratic runtime behaviour. Compared to the linearithmic or expected linear behaviour that is easy to get with the above-mentioned protocols (using set implementations), this is bad.
So how can we efficiently remove duplicates w.r.t. object identity from a Swift collection?

We can wrap our objects into something that is Hashable and Comparable:
struct ClassWrap<T: AnyObject>: Hashable, Comparable {
var value: T
func hash(into hasher: inout Hasher) {
// hash the object's identity, matching the === comparison in ==
hasher.combine(ObjectIdentifier(self.value))
}
static func ==(lhs: ClassWrap, rhs: ClassWrap) -> Bool {
return lhs.value === rhs.value
}
static func <(lhs: ClassWrap<T>, rhs: ClassWrap<T>) -> Bool {
return ObjectIdentifier(lhs.value) < ObjectIdentifier(rhs.value)
}
}
Now any regular Set implementation or other deduplicating operation should do the job.
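If the wrapper is only needed for deduplication, a lighter variant is to track ObjectIdentifier values directly in a Set. The following is a minimal sketch, not the asker's cache: the class C and the helper name uniqueByIdentity are hypothetical stand-ins. It runs in expected linear time and keeps the first occurrence of each instance.

```swift
// Hypothetical stand-in for the cached class; it is deliberately not Hashable.
final class C {
    let name: String
    init(_ name: String) { self.name = name }
}

/// Returns the elements of `objects` with identity duplicates removed,
/// keeping the first occurrence of each instance.
func uniqueByIdentity<S: Sequence>(_ objects: S) -> [S.Element] where S.Element: AnyObject {
    var seen = Set<ObjectIdentifier>()
    // `insert` reports whether the identifier was new, so each object passes the filter once.
    return objects.filter { seen.insert(ObjectIdentifier($0)).inserted }
}

let a = C("a")
let b = C("b")
let dict = ["k1": a, "k2": a, "k3": b]   // `a` is cached under two keys
let unique = uniqueByIdentity(dict.values)
```

Here unique holds exactly two objects, a and b, in whatever order the dictionary yields its values.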

Related

Inserting a unique value in the set based on value's property

I've created a struct with an "id" property:
struct SomeStruct: Hashable {
let id: String; // should be unique
let date: String;
let comment: String;
static func ==(lhs: SomeStruct, rhs: SomeStruct) -> Bool {
return lhs.id == rhs.id
}
}
I need to insert these structs in a set, but the set itself should decide whether the new member is unique or not based on its id, so:
someSet.insert(SomeStruct(id: "1", date: "22.09.2022", comment: "nothing here")) //inserted: true
someSet.insert(SomeStruct(id: "1", date: "05.12.1978", comment: "something here!")) //inserted: false, not unique id
That is why I implement the equality operator func in the struct and sometimes it works... but sometimes it doesn't and here I am, humbly asking for your help.
My implementation gives me weird results. Two structs with the same id are sometimes both inserted and sometimes not; for example, if I build my playground file once the new value gets inserted, and the next time I build it the same insert returns false.
I may have missed something, maybe I should implement custom hashValue property..?
Thanks in advance.
P.S. My job is to create a collection of these structs, where each struct has unique id. If you did it before and/or you think you know how to do it better, please let me know, I will be very glad to hear your ideas.
A solution was given in comments, which was to hash only on id instead of relying on the compiler generated implementation, which hashes on all the properties. That solution is a perfectly valid one for the stated problem, and may be ideal for your use case.
It does have one potential drawback: It means SomeStruct is only ever hashably-distinct based on id. That is the case that's presented, but we don't see the wider code base (nor should we). Is SomeStruct only ever used so that having its hash value based solely on its id is the right thing? It definitely could be, and probably is, since most things with a unique ID work that way, but it doesn't have to be, and I don't like making that assumption about code I can't see.
A more code-base-agnostic and reusable approach: rather than tweaking the data's implementation to make a given collection behave the way you want, make a collection with the desired behavior for the data you have. There's more code involved in this approach, but it also gives more flexibility in how you can use the data.
The described behavior is that of a Dictionary keyed on the id, but the API of Set. There are some options. Here's one.
First make SomeStruct conform to Identifiable which is a standard Swift protocol. All that means is that it needs a Hashable id property, which you already have so adding that conformance is just:
extension SomeStruct: Identifiable { }
Though you could add Identifiable to the definition of SomeStruct instead. Either way works.
Then create a data structure that is keyed on id. This doesn't have to be complicated. Just leverage the existing thing that does most of what you need to implement it. In this case, that's Dictionary. So here's a minimal, but generic, IDSet that does that:
struct IDSet<T: Identifiable>
{
public typealias Element = T
internal typealias Storage = [Element.ID: Element]
private var storage = Storage()
public mutating func insert(_ value: Element) -> Bool
{
guard !storage.keys.contains(value.id) else { return false }
storage[value.id] = value
return true
}
}
Because IDSet is generic, you can make one for any kind of Identifiable thing. It also doesn't need for the whole element to be Hashable. It only needs its id to be Hashable. That gives you more flexibility in the kinds of elements you store in it.
Presumably you'd want to iterate over the elements, so you'd need to make it conform to Sequence. Again there's no need to get complicated in making an Iterator, because you can leverage Dictionary's Iterator:
extension IDSet: Sequence
{
public struct Iterator: IteratorProtocol
{
internal var iter: Storage.Iterator
public mutating func next() -> Element?
{
guard let (_, value) = iter.next() else { return nil }
return value
}
}
public func makeIterator() -> Iterator {
Iterator(iter: storage.makeIterator())
}
}
You probably want IDSet to conform to some other protocols too, but I think it's likely you can see how that would work.
To use it, it's much like Set, although insert returns a non-discardable Bool indicating whether the element was inserted. Let's say you want to test this behavior in XCTest using your example data:
var someSet = IDSet<SomeStruct>()
XCTAssertTrue(someSet.insert(SomeStruct(id: "1", date: "22.09.2022", comment: "nothing here")))
XCTAssertFalse(someSet.insert(SomeStruct(id: "1", date: "05.12.1978", comment: "something here!")))
And of course, because it conforms to Sequence you can iterate over it:
for s in someSet {
print("id: \(s.id), date: \(s.date), comment: \"\(s.comment)\"")
}
Or use various other Sequence methods:
let comments = someSet.map { $0.comment }
Set elements are Hashable
Hashing a value means feeding its essential components into a hash function, represented by the Hasher type. Essential components are those that contribute to the type’s implementation of Equatable. Two instances that are equal must feed the same values to Hasher in hash(into:), in the same order.
I suggest adding:
struct SomeStruct: Hashable {
let id: String;
let date: String;
let comment: String;
static func ==(lhs: SomeStruct, rhs: SomeStruct) -> Bool {
return lhs.id == rhs.id
}
public func hash(into hasher: inout Hasher) {
// feed only the property that contributes to the Equatable implementation
hasher.combine(id)
}
}
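To see the effect on Set.insert, here is a small sketch using a cut-down, hypothetical Note type (not the asker's SomeStruct) whose == and hash(into:) both look only at id:

```swift
// Hypothetical type for illustration: equality and hashing both use only `id`.
struct Note: Hashable {
    let id: String
    let comment: String

    static func == (lhs: Note, rhs: Note) -> Bool {
        return lhs.id == rhs.id
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
    }
}

var notes = Set<Note>()
let first = notes.insert(Note(id: "1", comment: "nothing here"))
let second = notes.insert(Note(id: "1", comment: "something here!"))
// first.inserted is true; second.inserted is false, and
// second.memberAfterInsert is the element that was already in the set.
```

Because equal values now always hash identically, the result no longer depends on the per-launch hash seed, which is what made the original behaviour vary between runs.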

Using diff in an array of objects that conform to a protocol

I'm experimenting with using Composition instead of Inheritance and I wanted to use diff on an array of objects that comply with a given protocol.
To do so, I implemented a protocol and made it comply with Equatable:
// Playground - noun: a place where people can play
import XCPlayground
import Foundation
protocol Field:Equatable {
var content: String { get }
}
func ==<T: Field>(lhs: T, rhs: T) -> Bool {
return lhs.content == rhs.content
}
func ==<T: Field, U: Field>(lhs: T, rhs: U) -> Bool {
return lhs.content == rhs.content
}
struct First:Field {
let content:String
}
struct Second:Field {
let content:String
}
let items:[Field] = [First(content: "abc"), Second(content: "cxz")] // 💥 boom
But I've soon discovered that:
error: protocol 'Field' can only be used as a generic constraint because it has Self or associated type requirements
I understand why, since Swift is a type-safe language that needs to be able to know the concrete type of these objects at any time.
After tinkering around, I ended up removing Equatable from the protocol and overloading the == operator:
// Playground - noun: a place where people can play
import XCPlayground
import Foundation
protocol Field {
var content: String { get }
}
func ==(lhs: Field, rhs: Field) -> Bool {
return lhs.content == rhs.content
}
func ==(lhs: [Field], rhs: [Field]) -> Bool {
return (lhs.count == rhs.count) && (zip(lhs, rhs).map(==).reduce(true, { $0 && $1 })) // naive, but let's go with it for the sake of the argument
}
struct First:Field {
let content:String
}
struct Second:Field {
let content:String
}
// Requirement #1: direct object comparison
print(First(content: "abc") == First(content: "abc")) // true
print(First(content: "abc") == Second(content: "abc")) // false
// Requirement #2: being able to diff an array of objects complying with the Field protocol
let array1:[Field] = [First(content: "abc"), Second(content: "abc")]
let array2:[Field] = [Second(content: "abc")]
print(array1 == array2) // false
let outcome = array1.diff(array2) // 💥 boom
error: value of type '[Field]' has no member 'diff'
From here on, I'm a bit lost to be honest. I read some great posts about type erasure but even the provided examples suffered from the same issue (which I assume is the lack of conformance to Equatable).
Am I right? And if so, how can this be done?
UPDATE:
I had to stop this experiment for a while and totally forgot about a dependency, sorry! Diff is a method provided by SwiftLCS, an implementation of the longest common subsequence (LCS) algorithm.
TL;DR:
The Field protocol needs to comply with Equatable but so far I have not been able to do this. I need to be able to create an array of objects that comply to this protocol (see the error in the first code block).
Thanks again
The problem comes from a combination of the meaning of the Equatable protocol and Swift’s support for type overloaded functions.
Let’s take a look at the Equatable protocol:
protocol Equatable
{
static func == (lhs: Self, rhs: Self) -> Bool
}
What does this mean? Well, it’s important to understand what “equatable” actually means in the context of Swift. “Equatable” is a trait of a structure or class that makes it so that any instance of that structure or class can be compared for equality with any other instance of that structure or class. It says nothing about comparing it for equality with an instance of a different class or structure.
Think about it. Int and String are both types that are Equatable. 13 == 13 and "meredith" == "meredith". But does 13 == "meredith"?
The Equatable protocol only cares about when both things to be compared are of the same type. It says nothing about what happens when the two things are of different types. That’s why both arguments in the definition of ==(_:_:) are of type Self.
Let’s look at what happened in your example.
protocol Field:Equatable
{
var content:String { get }
}
func ==<T:Field>(lhs:T, rhs:T) -> Bool
{
return lhs.content == rhs.content
}
func ==<T:Field, U:Field>(lhs:T, rhs:U) -> Bool
{
return lhs.content == rhs.content
}
You provided two overloads for the == operator. But only the first one has to do with Equatable conformance. The second overload is the one that gets applied when you do
First(content: "abc") == Second(content: "abc")
which has nothing to do with the Equatable protocol.
Here’s a point of confusion. Equatability across instances of the same type is a lower requirement than equatability across instances of different types when we’re talking about individually bound instances of types you want to test for equality. (Since we can assume both things being tested are of the same type.)
However, when we make an array of things that conform to Equatable, this is a higher requirement than making an array of things that can be tested for equality, since what you are saying is that every item in the array can be compared as if they were both of the same type. But since your structs are of different types, you can’t guarantee this, and so the code fails to compile.
Here’s another way to think of it.
Protocols without associated type requirements, and protocols with associated type requirements are really two different animals. Protocols without Self basically look and behave like types. Protocols with Self are traits that types themselves conform to. In essence, they go “up a level”, like a type of type. (Related in concept to metatypes.)
That’s why it makes no sense to write something like this:
let array:[Equatable] = [5, "a", false]
You can write this:
let array:[Int] = [5, 6, 7]
or this:
let array:[String] = ["a", "b", "c"]
or this:
let array:[Bool] = [false, true, false]
Because Int, String, and Bool are types. Equatable isn’t a type, it’s a type of a type.
It would make “sense” to write something like this…
let array:[Equatable] = [Int.self, String.self, Bool.self]
though this is really stretching the bounds of type-safe programming and so Swift doesn’t allow this. You’d need a fully flexible metatyping system like Python’s to express an idea like that.
So how do we solve your problem? Well, first of all realize that the only reason it makes sense to apply SwiftLCS on your array is because, at some level, all of your array elements can be reduced to an array of keys that are all of the same Equatable type. In this case, it’s String, since you can get an array keys:[String] by doing [Field](...).map{ $0.content }. Perhaps if we redesigned SwiftLCS, this would make a better interface for it.
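The key-extraction idea can be sketched as follows. This uses the non-Equatable version of the Field protocol from the question and only standard-library calls; it does not call SwiftLCS, whose API is not shown here.

```swift
protocol Field {
    var content: String { get }
}

struct First: Field {
    let content: String
}

struct Second: Field {
    let content: String
}

let array1: [Field] = [First(content: "abc"), Second(content: "abc")]
let array2: [Field] = [Second(content: "abc")]

// Reduce each heterogeneous array to its keys. [String] is Equatable,
// so anything constrained to Equatable elements can work on the keys.
let keys1 = array1.map { $0.content }
let keys2 = array2.map { $0.content }

let sameKeys = keys1 == keys2
```

keys1 and keys2 are plain [String] values, so a diffing routine that requires Equatable elements can accept them even though [Field] itself cannot be passed.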
However, since we can only compare our array of Fields directly, we need to make sure they can all be upcast to the same type, and the way to do that is with inheritance.
class Field:Equatable
{
let content:String
static func == (lhs:Field, rhs:Field) -> Bool
{
return lhs.content == rhs.content
}
init(_ content:String)
{
self.content = content
}
}
class First:Field
{
init(content:String)
{
super.init(content)
}
}
class Second:Field
{
init(content:String)
{
super.init(content)
}
}
let items:[Field] = [First(content: "abc"), Second(content: "cxz")]
The array then upcasts them all to type Field which is Equatable.
By the way, ironically, the “protocol-oriented” solution to this problem actually still involves inheritance. The SwiftLCS API would provide a protocol like
protocol LCSElement
{
associatedtype Key:Equatable
var key:Key { get }
}
We would specialize it with a superclass
class Field:LCSElement
{
let key:String // <- this is what specializes Key to a concrete type
static func == (lhs:Field, rhs:Field) -> Bool
{
return lhs.key == rhs.key
}
init(_ key:String)
{
self.key = key
}
}
and the library would use it as
func LCS<T: LCSElement>(array:[T])
{
array[0].key == array[1].key
...
}
Protocols and Inheritance are not opposites or substitutes for one another. They complement each other.
I know this is probably not what you want, but the only way I know to make it work is to introduce an additional wrapper type:
struct FieldEquatableWrapper: Equatable {
let wrapped: Field
public static func ==(lhs: FieldEquatableWrapper, rhs: FieldEquatableWrapper) -> Bool {
return lhs.wrapped.content == rhs.wrapped.content
}
public static func diff(_ coll: [Field], _ otherCollection: [Field]) -> Diff<Int> {
let w1 = coll.map({ FieldEquatableWrapper(wrapped: $0) })
let w2 = otherCollection.map({ FieldEquatableWrapper(wrapped: $0) })
return w1.diff(w2)
}
}
and then you can do
let outcome = FieldEquatableWrapper.diff(array1, array2)
I don't think you can make Field conform to Equatable at all, as it is designed to be "type-safe" using the Self pseudo-class. That is one reason for the wrapper. Unfortunately there seems to be one more issue that I don't know how to fix: I can't put this "wrapped" diff into a Collection or Array extension and still have it support a heterogeneous [Field] array without this compilation error:
using 'Field' as a concrete type conforming to protocol 'Field' is not supported
If anyone knows a better solution, I'm interested as well.
P.S.
In the question you mention that
print(First(content: "abc") == Second(content: "abc")) // false
but I expect that to be true given the way you defined your == operator

Swift: different objects with same properties: hash value

Currently I have a class of generic type, and I want to make objects of this class searchable via the contains() method on an array of those objects, by making the class conform to the Hashable protocol and providing a hash value for each object. Now my problem is that I have objects with exactly the same properties, and it seems that the array cannot really distinguish them (my current approach is to use one of the properties' hash value as the hash value for the class, and the == <T> (lhs: ClassA<T>, rhs: ClassA<T>) -> Bool function is done by comparing the hash value). I have tried to use a static property like "id", but for generic types static properties are not supported.
How should I define the hash value such that different objects with the same properties can still be differentiated?
EDIT: I'm making it conform to Hashable directly because it's also used as keys in dict in other parts of the program, since Hashable already conforms to Equatable.
"My current approach is to use one of the properties' hash value as the hash value for the class, and the == <T> (lhs: ClassA<T>, rhs: ClassA<T>) -> Bool function is done by comparing the hash value"
That's not how the == and hashValue relationship works – don't do this. What if you get a hash collision? Two different instances with different properties could compare equal.
You should instead implement == to actually compare the properties of two instances. == should return true if two given instances have equivalent properties. The hashValues of two instances should be equivalent if they compare equal with ==.
Now, it might well be the case that you cannot do this comparison unless T is Equatable. One solution to this is to not conform ClassA to Equatable, but instead just overload == for when T is Equatable, such as:
func == <T : Equatable>(lhs: ClassA<T>, rhs: ClassA<T>) -> Bool {
// stub: do comparison logic
}
You can now just use Sequence's contains(where:) method in conjunction with the == overload in order to check if a given instance is in the array:
var array = [ClassA("foo")] // assuming ClassA has an init(_: T) and a suitable ==
// implementation to compare that value
let someInstanceToFind = ClassA("foo")
print(array.contains { $0 == someInstanceToFind }) // true
And if you want ClassA to have a hashValue, then simply write an extension that defines a hashValue when T is Hashable:
extension ClassA where T : Hashable {
var hashValue: Int {
return 0 // to do: implement hashValue logic
}
}
Unfortunately, this does mean that ClassA won't explicitly conform to Hashable when T does – but it will have a hashValue and == implementation. SE-0143: Conditional conformances changes this by allowing explicit conformance to protocols if a given where clause is satisfied; it had not been implemented when this answer was written, and later shipped in Swift 4.1.
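On a toolchain where SE-0143 conditional conformances are available, the same idea can be written as real conformances. This is a sketch only: the stored property value and the init(_:) are assumptions about ClassA, which the question does not show.

```swift
// Assumed shape of ClassA: a single stored value of the generic type.
final class ClassA<T> {
    let value: T
    init(_ value: T) { self.value = value }
}

// ClassA is Equatable only when its payload is.
extension ClassA: Equatable where T: Equatable {
    static func == (lhs: ClassA, rhs: ClassA) -> Bool {
        return lhs.value == rhs.value
    }
}

// ClassA is Hashable only when its payload is; hashing feeds the same
// component that == compares.
extension ClassA: Hashable where T: Hashable {
    func hash(into hasher: inout Hasher) {
        hasher.combine(value)
    }
}

let array = [ClassA("foo")]
let found = array.contains(ClassA("foo"))
let unique = Set([ClassA("foo"), ClassA("foo"), ClassA("bar")])
```

With these conformances, contains(_:) works directly and instances can go into a Set or serve as Dictionary keys without a wrapper. Note that this compares by value, so two distinct instances holding "foo" count as the same element.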
If you need explicit conformance to Hashable (such as for using instances of your class in a Set or as Dictionary keys) – then one solution is to create a wrapper type:
struct HashableClassA<T : Hashable> : Hashable {
var base: ClassA<T>
init(_ base: ClassA<T>) {
self.base = base
}
static func ==(lhs: HashableClassA, rhs: HashableClassA) -> Bool {
return lhs.base == rhs.base
}
var hashValue: Int {
return base.hashValue
}
}
Now you just have to wrap ClassA<T> instances in a HashableClassA instance before adding to a Set or Dictionary.
Just realized there is a simple way of getting the Equatable behaviour needed by the contains() method: use return lhs === rhs in the == function, so that objects are compared directly. It's working this way now.
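If the class also has to stay Hashable (the question mentions using it as a dictionary key), the hash has to agree with an identity-based ==. One way to do that, sketched here with a hypothetical generic class Token standing in for ClassA, is to hash the instance's ObjectIdentifier:

```swift
// Hypothetical generic class; equality and hashing are both based on identity.
final class Token<T>: Hashable {
    let payload: T
    init(_ payload: T) { self.payload = payload }

    static func == (lhs: Token, rhs: Token) -> Bool {
        return lhs === rhs
    }

    func hash(into hasher: inout Hasher) {
        // Identical instances share an ObjectIdentifier, so equal values hash equally.
        hasher.combine(ObjectIdentifier(self))
    }
}

let t1 = Token("same")
let t2 = Token("same")   // same properties, different object
let tokens = [t1]
let set: Set<Token<String>> = [t1, t2, t1]
```

Here tokens.contains(t1) is true and tokens.contains(t2) is false, and the set holds two elements, so objects with identical properties are still told apart.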

Circular dependencies between generic types (CollectionType and its Index/Generator, e.g.)

Given a struct-based generic CollectionType …
struct MyCollection<Element>: CollectionType, MyProtocol {
typealias Index = MyIndex<MyCollection>
subscript(i: Index) -> Element { … }
func generate() -> IndexingGenerator<MyCollection> {
return IndexingGenerator(self)
}
}
… how would one define an Index for it …
struct MyIndex<Collection: MyProtocol>: BidirectionalIndexType {
func predecessor() -> MyIndex { … }
func successor() -> MyIndex { … }
}
… without introducing a dependency cycle of death?
The generic nature of MyIndex is necessary because:
It should work with any type of MyProtocol.
MyProtocol references Self and thus can only be used as a type constraint.
If there were forward declarations (à la Objective-C) I would just[sic!] add one for MyIndex<MyCollection> to my MyCollection<…>. Alas, there is no such thing.
A possible concrete use case would be binary trees, such as:
indirect enum BinaryTree<Element>: CollectionType, BinaryTreeType {
typealias Index = BinaryTreeIndex<BinaryTree>
case Nil
case Node(BinaryTree, Element, BinaryTree)
subscript(i: Index) -> Element { … }
}
Which would require a stack-based Index:
struct BinaryTreeIndex<BinaryTree: BinaryTreeType>: BidirectionalIndexType {
let stack: [BinaryTree]
func predecessor() -> BinaryTreeIndex { … }
func successor() -> BinaryTreeIndex { … }
}
One cannot (yet?) nest structs inside generic structs in Swift.
Otherwise I'd just move BinaryTreeIndex<…> inside BinaryTree<…>.
Also I'd prefer to have one generic BinaryTreeIndex,
which'd then work with any type of BinaryTreeType.
You cannot nest structs inside structs because they are value types. They aren’t pointers to an object, instead they hold their properties right there in the variable. Think about if a struct contained itself, what would its memory layout look like?
Forward declarations work in Objective-C because they are then used as pointers. This is why the indirect keyword was added to enums - it tells the compiler to add a level of indirection via a pointer.
In theory the same keyword could be added to structs, but it wouldn’t make much sense. You could do what indirect does by hand instead though, with a class box:
// turns any type T into a reference type
final class Box<T> {
let unbox: T
init(_ x: T) { unbox = x }
}
You could then use this to box up a struct to create, e.g., a linked list:
struct ListNode<T> {
var box: Box<(element: T, next: ListNode<T>)>?
func cons(x: T) -> ListNode<T> {
return ListNode(node: Box(element: x, next: self))
}
init() { box = nil }
init(node: Box<(element: T, next: ListNode<T>)>?)
{ box = node }
}
let nodes = ListNode().cons(1).cons(2).cons(3)
nodes.box?.unbox.element // first element
nodes.box?.unbox.next.box?.unbox.element // second element
You could turn this node directly into a collection, by conforming it to both ForwardIndexType and CollectionType, but this isn’t a good idea.
For example, they need very different implementations of ==:
the index needs to know if two indices from the same list are at the same position. It does not need the elements to conform to Equatable.
The collection needs to compare two different collections to see if they hold the same elements. It does need the elements to conform to Equatable i.e.:
func == <T where T: Equatable>(lhs: List<T>, rhs: List<T>) -> Bool {
// once the List conforms to at least SequenceType:
return lhs.elementsEqual(rhs)
}
Better to wrap it in two specific types. This is “free” – the wrappers have no overhead, just help you build the right behaviours more easily:
struct ListIndex<T>: ForwardIndexType {
let node: ListNode<T>
func successor() -> ListIndex<T> {
guard let next = node.box?.unbox.next
else { fatalError("attempt to advance past end") }
return ListIndex(node: next)
}
}
func == <T>(lhs: ListIndex<T>, rhs: ListIndex<T>) -> Bool {
switch (lhs.node.box, rhs.node.box) {
case (nil,nil): return true
case (_?,nil),(nil,_?): return false
case let (x,y): return x === y
}
}
struct List<T>: CollectionType {
typealias Index = ListIndex<T>
var startIndex: Index
var endIndex: Index { return ListIndex(node: ListNode()) }
subscript(idx: Index) -> T {
guard let element = idx.node.box?.unbox.element
else { fatalError("index out of bounds") }
return element
}
}
(no need to implement generate() – you get an indexing generator “for free” in 2.0 by implementing CollectionType)
You now have a fully functioning collection:
// in practice you would add methods to List such as
// conforming to ArrayLiteralConvertible or init from
// another sequence
let list = List(startIndex: ListIndex(node: nodes))
list.first // 3
for x in list { print(x) } // prints 3 2 1
Now all of this code looks pretty disgusting for two reasons.
One is because box gets in the way, and indirect is much better as the compiler sorts it all out for you under the hood. But it’s doing something similar.
The other is that structs are not a good solution to this. Enums are much better. In fact the code is really using an enum – that’s what Optional is. Only instead of nil (i.e. Optional.None), it would be better to have an End case for the end of the linked list. This is what we are using it for.
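As a rough illustration of the enum-based alternative, here is a sketch in current Swift syntax rather than the Swift 2 syntax used above. The type name LinkedList and its case names are made up for this example, and it conforms only to Sequence, not to a full collection protocol.

```swift
// A singly linked list as an indirect enum, with an explicit end case
// instead of a nil box.
indirect enum LinkedList<Element> {
    case end
    case node(Element, LinkedList<Element>)

    /// Returns a new list with `x` prepended.
    func cons(_ x: Element) -> LinkedList<Element> {
        return .node(x, self)
    }
}

extension LinkedList: Sequence {
    func makeIterator() -> AnyIterator<Element> {
        var current = self
        return AnyIterator {
            // Stop at the end case; otherwise yield the element and advance.
            guard case let .node(element, next) = current else { return nil }
            current = next
            return element
        }
    }
}

let list = LinkedList<Int>.end.cons(1).cons(2).cons(3)
let elements = Array(list)   // [3, 2, 1]
```

The compiler inserts the indirection that Box provided by hand, so there is no unbox step when reading an element.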
For more of this kind of stuff you could check out these posts.
While Airspeed Velocity's answer applies to the most common cases, my question was asking specifically about the special case of generalizing CollectionType indexing in order to be able to share a single Index implementation for all thinkable kinds of binary trees (whose recursive nature makes it necessary to make use of a stack for index-based traversals (at least for trees without a parent pointer)), which requires the Index to be specialized on the actual BinaryTree, not the Element.
The way I solved this problem was to rename MyCollection to MyCollectionStorage, revoke its CollectionType conformance and wrap it with a struct that now takes its place as MyCollection and deals with conforming to CollectionType.
To make things a bit more "real" I will refer to:
MyCollection<E> as SortedSet<E>
MyCollectionStorage<E> as BinaryTree<E>
MyIndex<T> as BinaryTreeIndex<T>
So without further ado:
struct SortedSet<Element>: CollectionType {
typealias Tree = BinaryTree<Element>
typealias Index = BinaryTreeIndex<Tree>
subscript(i: Index) -> Element { … }
func generate() -> IndexingGenerator<SortedSet> {
return IndexingGenerator(self)
}
}
struct BinaryTree<Element>: BinaryTreeType {
}
struct BinaryTreeIndex<BinaryTree: BinaryTreeType>: BidirectionalIndexType {
func predecessor() -> BinaryTreeIndex { … }
func successor() -> BinaryTreeIndex { … }
}
This way the dependency graph turns from a directed cyclic graph into a directed acyclic graph.

Check if a type implements a protocol

I am writing a library that creates extensions for default Swift types.
I would like to have a check on my Array extensions whether a certain type implements a certain protocol. See this method for example:
extension Array {
/// Compares the items using the given comparer and only returns non-equal values
/// :returns: the first items that are unique according to the comparer
func distinct(comparer: (T, T) -> Bool) -> [T] {
var result: [T] = []
outerLoop: for item in self {
for resultItem in result {
if comparer(item, resultItem) {
continue outerLoop
}
}
result.append(item)
}
return result
}
}
Now I'd like to rewrite this method to check if T is Equatable as such:
/// Compares the items using the given comparer and only returns non-equal values
/// :returns: the first items that are unique according to the comparer
func distinct(comparer: ((T, T) -> Bool)?) -> [T] {
var result: [T] = []
outerLoop: for item in self {
for resultItem in result {
if isEquatable ? comparer!(item, resultItem) : item == resultItem {
continue outerLoop
}
}
result.append(item)
}
return result
}
where isEquatable is a Bool value that tells me if T is Equatable. How can I find this out?
There isn’t a good way to do this in Swift at the moment.* This is why functions like sorted are either free-functions, or in the case of the member, take a predicate. The main problem with the test-and-cast approach you’re looking for is that Equatable and similar protocols have an associated type or rely on Self, and so can only be used inside a generic function as a constraint.
I’m guessing your goal is that the caller can skip supplying the comparator function, and so it will fall back to Equatable if available? And crash if it isn’t? The problem here is that the function is determining something at run time (the argument is Equatable) when this really ought to be determinable at compile time. This is not great - it’s much better to determine these things fully at compile time.
So you can write a free function that requires Equatable:
func distinct<C: CollectionType where C.Generator.Element: Equatable>
(source: C) -> [C.Generator.Element] {
var seen: [C.Generator.Element] = []
return filter(source) {
if contains(seen, $0) {
return false
}
else {
seen.append($0)
return true
}
}
}
let uniques = distinct([1,2,3,1,1,2]) // [1,2,3]
and then if you tried to call it with something that wasn’t comparable, you’d get a compile-time error:
let incomparable = [1,2,3] as [Any]
distinct(incomparable) // compiler barfs - Any isn’t Equatable
With the runtime approach, you’d only find this out when you ran the program.
The good news is, there are upsides too. The problem with searching an array for each element is the function will be very slow for large arrays, because for every element, the list of already-seen elements must be searched linearly. If you overload distinct with another version that requires the elements be Hashable (which Equatable things often are), you can use a set to track them:
func distinct<C: CollectionType where C.Generator.Element: Hashable>
(source: C) -> [C.Generator.Element] {
var seen: Set<C.Generator.Element> = []
return filter(source) {
if seen.contains($0) {
return false
}
else {
seen.insert($0)
return true
}
}
}
At compile time, the compiler will choose the best possible version of the function and use that. If your thing is hashable, that version gets picked, if it’s only equatable, it’ll use the slower one (this is because Hashable inherits from Equatable, and the compiler picks the more specialized function). Doing this at compile time instead of run time means you pay no penalty for the check, it’s all determined up front.
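In later Swift versions the same pair of overloads can be written as constrained protocol extensions. The sketch below is an adaptation in current syntax, not the answer's original code; the method name distinct() follows the question, and the Approx type is a hypothetical Equatable-only example.

```swift
extension Sequence where Element: Equatable {
    /// Quadratic fallback: linear search through the elements seen so far.
    func distinct() -> [Element] {
        var seen: [Element] = []
        for item in self where !seen.contains(item) {
            seen.append(item)
        }
        return seen
    }
}

extension Sequence where Element: Hashable {
    /// Expected linear time: a Set tracks the elements seen so far.
    func distinct() -> [Element] {
        var seen = Set<Element>()
        return filter { seen.insert($0).inserted }
    }
}

// Hypothetical type that is Equatable but not Hashable.
struct Approx: Equatable {
    let value: Int
}

let ints = [1, 2, 3, 1, 1, 2].distinct()
let approxes = [Approx(value: 1), Approx(value: 1), Approx(value: 2)].distinct()
```

Int is Hashable, so the call on the integer array resolves to the more constrained Hashable overload, while Approx falls back to the Equatable one. Both preserve first-occurrence order.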
*there are ugly ways, but since the goal is appealing syntax, what’s the point… Perhaps the next version will allow constraints on methods, which would be nice.