Inserting a unique value in the set based on value's property - swift

I've created a struct with an "id" property:
struct SomeStruct: Hashable {
let id: String; // should be unique
let date: String;
let comment: String;
static func ==(lhs: SomeStruct, rhs: SomeStruct) -> Bool {
return lhs.id == rhs.id
}
}
I need to insert these structs in a set, but the set itself should decide whether the new member is unique or not based on its id, so:
someSet.insert(SomeStruct(id: "1", date: "22.09.2022", comment: "nothing here")) //inserted: true
someSet.insert(SomeStruct(id: "1", date: "05.12.1978", comment: "something here!")) //inserted: false, not unique id
That is why I implemented the equality operator in the struct. Sometimes it works, but sometimes it doesn't, and here I am, humbly asking for your help.
My implementation gives me weird results. Two structs with the same id are sometimes both inserted and sometimes not. For example, if I build my playground file once, the new value gets inserted; the next time I build it, inserting the same value returns false.
I may have missed something; maybe I should implement a custom hashValue property?
Thanks in advance.
P.S. My job is to create a collection of these structs, where each struct has unique id. If you did it before and/or you think you know how to do it better, please let me know, I will be very glad to hear your ideas.

A solution was given in comments, which was to hash only on id instead of relying on the compiler generated implementation, which hashes on all the properties. That solution is a perfectly valid one for the stated problem, and may be ideal for your use case.
It does have one potential drawback: It means SomeStruct is only ever hashably-distinct based on id. That is the case that's presented, but we don't see the wider code base (nor should we). Is SomeStruct only ever used so that having its hash value based solely on its id is the right thing? It definitely could be, and probably is, since most things with a unique ID work that way, but it doesn't have to be, and I don't like making that assumption about code I can't see.
A more code-base-agnostic and reusable approach is this: rather than tweaking the data's implementation to make a given collection behave the way you want, make a collection with the desired behavior for the data you have. There's more code involved in this approach, but it also gives more flexibility in how you can use the data.
The described behavior is that of a Dictionary keyed on the id, but the API of Set. There are some options. Here's one.
First make SomeStruct conform to Identifiable which is a standard Swift protocol. All that means is that it needs a Hashable id property, which you already have so adding that conformance is just:
extension SomeStruct: Identifiable { }
Though you could add Identifiable to the definition of SomeStruct instead. Either way works.
Then create a data structure that is keyed on id. This doesn't have to be complicated. Just leverage the existing thing that does most of what you need to implement it. In this case, that's Dictionary. So here's a minimal, but generic, IDSet that does that:
struct IDSet<T: Identifiable>
{
    public typealias Element = T
    internal typealias Storage = [Element.ID: Element]
    private var storage = Storage()
    public mutating func insert(_ value: Element) -> Bool
    {
        guard !storage.keys.contains(value.id) else { return false }
        storage[value.id] = value
        return true
    }
}
Because IDSet is generic, you can make one for any kind of Identifiable thing. It also doesn't need the whole element to be Hashable; only its id has to be. That gives you more flexibility in the kinds of elements you store in it.
Presumably you'd want to iterate over the elements, so you'd need to make it conform to Sequence. Again there's no need to get complicated in making an Iterator, because you can leverage Dictionary's Iterator:
extension IDSet: Sequence
{
    public struct Iterator: IteratorProtocol
    {
        internal var iter: Storage.Iterator
        public mutating func next() -> Element?
        {
            guard let (_, value) = iter.next() else { return nil }
            return value
        }
    }
    public func makeIterator() -> Iterator {
        Iterator(iter: storage.makeIterator())
    }
}
You probably want IDSet to conform to some other protocols too, but I think it's likely you can see how that would work.
To use it, it's much like Set, although insert returns a non-discardable Bool indicating whether the element was inserted. Let's say you want to test this behavior in XCTest using your example data:
var someSet = IDSet<SomeStruct>()
XCTAssertTrue(someSet.insert(SomeStruct(id: "1", date: "22.09.2022", comment: "nothing here")))
XCTAssertFalse(someSet.insert(SomeStruct(id: "1", date: "05.12.1978", comment: "something here!")))
And of course, because it conforms to Sequence you can iterate over it:
for s in someSet {
print("id: \(s.id), date: \(s.date), comment: \"\(s.comment)\"")
}
Or use various other Sequence methods:
let comments = someSet.map { $0.comment }

Set elements are Hashable
Hashing a value means feeding its essential components into a hash function, represented by the Hasher type. Essential components are those that contribute to the type’s implementation of Equatable. Two instances that are equal must feed the same values to Hasher in hash(into:), in the same order.
I suggest adding:
struct SomeStruct: Hashable {
let id: String;
let date: String;
let comment: String;
static func ==(lhs: SomeStruct, rhs: SomeStruct) -> Bool {
return lhs.id == rhs.id
}
public func hash(into hasher: inout Hasher) {
// feed only the property that contributes to the Equatable implementation
hasher.combine(id)
}
}
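As a quick check of the behavior asked for in the question, here is a minimal sketch that repeats the struct above and uses the tuple returned by Set.insert(_:), whose inserted and memberAfterInsert fields are part of the standard library. The variable names first and second are only illustrative.

```swift
struct SomeStruct: Hashable {
    let id: String
    let date: String
    let comment: String

    // Equality is decided by id alone...
    static func == (lhs: SomeStruct, rhs: SomeStruct) -> Bool {
        lhs.id == rhs.id
    }

    // ...so hashing must use id alone as well.
    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
    }
}

var someSet = Set<SomeStruct>()
let first = someSet.insert(SomeStruct(id: "1", date: "22.09.2022", comment: "nothing here"))
let second = someSet.insert(SomeStruct(id: "1", date: "05.12.1978", comment: "something here!"))

print(first.inserted)                 // true
print(second.inserted)                // false
print(second.memberAfterInsert.date)  // 22.09.2022 (the element already in the set is kept)
```

Without the custom hash(into:), the compiler-synthesized one hashes every stored property, so two values that == calls equal can hash differently, which would explain the inconsistent results described in the question.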

Related

Is it right to conform Hashable by only taking id into consideration?

Many of the online examples I have come across consider only id when conforming to Hashable. For instance https://www.raywenderlich.com/8241072-ios-tutorial-collection-view-and-diffable-data-source , https://medium.com/#JoyceMatos/hashable-protocols-in-swift-baf0cabeaebd , ...
/// Copyright (c) 2020 Razeware LLC
///
/// Permission is hereby granted, free of charge, to any person obtaining a copy
/// of this software and associated documentation files (the "Software"), to deal
/// in the Software without restriction, including without limitation the rights
/// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
/// copies of the Software, and to permit persons to whom the Software is
/// furnished to do so, subject to the following conditions:
///
/// The above copyright notice and this permission notice shall be included in
/// all copies or substantial portions of the Software.
///
/// Notwithstanding the foregoing, you may not use, copy, modify, merge, publish,
/// distribute, sublicense, create a derivative work, and/or sell copies of the
/// Software in any work that is designed, intended, or marketed for pedagogical or
/// instructional purposes related to programming, coding, application development,
/// or information technology. Permission for such use, copying, modification,
/// merger, publication, distribution, sublicensing, creation of derivative works,
/// or sale is expressly withheld.
///
/// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
/// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
/// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
/// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
/// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
/// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
/// THE SOFTWARE.
import UIKit
class Video: Hashable {
var id = UUID()
var title: String
var thumbnail: UIImage?
var lessonCount: Int
var link: URL?
init(title: String, thumbnail: UIImage? = nil, lessonCount: Int, link: URL?) {
self.title = title
self.thumbnail = thumbnail
self.lessonCount = lessonCount
self.link = link
}
// 1
func hash(into hasher: inout Hasher) {
// 2
hasher.combine(id)
}
// 3
static func == (lhs: Video, rhs: Video) -> Bool {
lhs.id == rhs.id
}
}
I was wondering, is that ever a correct way to conform to Hashable? I thought we should take all of the class's member variables into consideration.
For instance, using only id in hash(into:) and == leads to the following misbehaviour.
We end up with two objects that have different content, yet == returns true when comparing them.
struct Dog: Hashable {
let id = UUID()
var name: String
var age: Int
init(name: String, age: Int) {
self.name = name
self.age = age
}
func hash(into hasher: inout Hasher) {
hasher.combine(id)
}
static func == (lhs: Dog, rhs: Dog) -> Bool {
lhs.id == rhs.id
}
}
var dog0 = Dog(name: "dog", age: 1)
var dog1 = dog0
/*
dog0 is -5743610764084706839, dog, 1
dog1 is -5743610764084706839, dog, 1
compare dog0 with dog1 is true
*/
print("dog0 is \(dog0.hashValue), \(dog0.name), \(dog0.age)")
print("dog1 is \(dog1.hashValue), \(dog1.name), \(dog1.age)")
print("compare dog0 with dog1 is \(dog0 == dog1)")
dog1.name = "another name"
dog1.age = 9
// Same id, but different content!
/*
dog0 is -5743610764084706839, dog, 1
dog1 is -5743610764084706839, another name, 9
compare dog0 with dog1 is true
*/
print("dog0 is \(dog0.hashValue), \(dog0.name), \(dog0.age)")
print("dog1 is \(dog1.hashValue), \(dog1.name), \(dog1.age)")
print("compare dog0 with dog1 is \(dog0 == dog1)")
I was wondering, is it right to conform to Hashable by taking only id into consideration?
P.S.
I tried to look at other languages, like Java, to see what the general advice is regarding hash code generation. This is what is written in the popular Effective Java book:
Do not be tempted to exclude significant fields from the hash code
computation to improve performance. While the resulting hash function
may run faster, its poor quality may degrade hash tables’ performance
to the point where they become unusable. In particular, the hash
function may be confronted with a large collection of instances that
differ mainly in regions you’ve chosen to ignore. If this happens, the
hash function will map all these instances to a few hash codes, and
programs that should run in linear time will instead run in quadratic
time. This is not just a theoretical problem. Prior to Java 2, the
String hash function used at most sixteen characters evenly spaced
throughout the string, starting with the first character. For large
collections of hierarchical names, such as URLs, this function
displayed exactly the pathological behavior described earlier.
TL;DR: This hash function is unnecessary, but legal, and arguably ideal. This == is incorrect, despite being common in tutorials, because it breaks substitutability which is required by Equatable, exactly as you suggest.
However, as matt notes, diffable data sources may require this anyway. That doesn't make it good, but it may make it necessary. (Do read all of matt's comments below. They provide a lot of important context. In reference specifically to diffable data sources, see his answer; I am not particularly familiar with diffable data sources.)
I suggest turning to the documentation, which lays this out.
First, Hashable:
Hashing a value means feeding its essential components into a hash function, represented by the Hasher type. Essential components are those that contribute to the type’s implementation of Equatable. Two instances that are equal must feed the same values to Hasher in hash(into:), in the same order.
The most important thing is that Hashable be consistent with Equatable. Two things must never be equal, but have different hashes.
The converse is not true. It is completely valid for two unequal things to have the same hash. In fact, collisions are unavoidable whenever a type has more possible values than there are hash values (the pigeonhole principle). A good hash improves performance by avoiding unnecessary equality checks. But the following hash(into:) function is always valid:
func hash(into hasher: inout Hasher) {
hasher.combine(0)
}
This just means that every value has the same hash, and so the system will always call ==. This is bad for performance (and in server applications that can translate into a denial of service attack called hash flooding). But it is legal.
If that's legal, certainly just hashing id is legal.
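To make that concrete, here is a small sketch with a hypothetical Point type (not from the question) whose hash is constant. The synthesized == still compares both fields, so Set stays correct; it just falls back to equality checks on every lookup.

```swift
struct Point: Hashable {
    let x: Int
    let y: Int

    // == is synthesized by the compiler and compares x and y.

    // Legal but slow: every Point lands in the same bucket.
    func hash(into hasher: inout Hasher) {
        hasher.combine(0)
    }
}

let a = Point(x: 0, y: 0)
let b = Point(x: 1, y: 0)
let points = Set([a, b, Point(x: 0, y: 0)])

print(a.hashValue == b.hashValue)  // true: unequal values, same hash
print(points.count)                // 2: the duplicate (0, 0) is still detected via ==
print(points.contains(b))          // true
```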
But....
That brings us to Equatable and its docs, and the most important paragraph (emphasis added):
Equality implies substitutability—any two instances that compare equally can be used interchangeably in any code that depends on their values. To maintain substitutability, the == operator should take into account all visible aspects of an Equatable type. Exposing nonvalue aspects of Equatable types other than class identity is discouraged, and any that are exposed should be explicitly pointed out in documentation.
Two values must only be considered equal if they can be substituted for each other in any context without impacting the correctness of the program. Clearly in your example, that's not true. In fact, it's never going to be true for a type with mutable public properties (despite many tutorials that get this wrong). So your == is incorrect. But your hash function is fine, arguably ideal. Its goal is to be a quick check for non-equality that minimizes collisions. If the ids are the same, you still have to check the rest of the values, but if they're different, you know the values are not going to be equal.
If your Dog type were immutable (name and age were let rather than var), it might be acceptable to implement == this way. It's impossible to set the id by hand, so it would be impossible to get two values with the same id but different values. But I wouldn't do that unless you could show a significant performance boost. It hangs correctness on too subtle a requirement. For example, if an extension added an init that allowed setting id directly, it would make your == invalid. That's too fragile IMO.
How about private mutable state? As long as that is only for performance purposes (memoization/caching), then it's fine to leave it out of == (and hash). But if that internal state can influence externally visible behavior, then it needs to be part of ==.
The good news is, most of the time you don't need to worry. Swift's automatic implementations handle this for you correctly out of the box, and compare all properties. So in your Dog example, the best solution is to just remove the methods (I'm sure you're aware of that; just stating it for folks reading along). Whenever possible, I highly recommend using the default conformances for Hashable and avoid writing your own.
But in cases where you have to implement your own, the rules are simple:
Two equal values must be perfectly substitutable in all cases without impacting correctness (though a substitution may impact performance)
Two equal values must always have the same hash
The guidelines are also fairly simple: Hashing should be fast, while minimizing collisions.
The one argument I've seen for these incorrect implementations of == is to try to make Set work nicely. IMO, this is a misuse of Set and Equatable, and is not promised to work in expected ways (if you insert a duplicate value with the same identifier, but different properties, it's undefined which of the values will be in the collection). You should not twist Equatable around wanting to use a specific data structure. You should use the data structure that matches your meaning.
In the common case, the right tool is Dictionary as [ID: Value]. It expresses what you really mean: a mapping between an ID and a single value for that ID, rather than an unordered bag of unique values.
There is likely a memory cost in using a Dictionary rather than a Set (since you have to duplicate the ID). But you should only try to work around that after proving there's a problem to be solved.
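A minimal sketch of the [ID: Value] approach follows. The Record type and the insertIfAbsent helper are hypothetical names introduced only for illustration; note that Record needs no Equatable or Hashable conformance at all, because only the String key is hashed.

```swift
// Hypothetical value type; it is neither Equatable nor Hashable.
struct Record {
    let id: String
    let comment: String
}

var recordsByID: [String: Record] = [:]

/// Stores `record` only if its id is not present yet.
/// Returns true when the record was stored.
func insertIfAbsent(_ record: Record, into records: inout [String: Record]) -> Bool {
    guard records[record.id] == nil else { return false }
    records[record.id] = record
    return true
}

let firstInserted = insertIfAbsent(Record(id: "1", comment: "first"), into: &recordsByID)
let secondInserted = insertIfAbsent(Record(id: "1", comment: "second"), into: &recordsByID)

print(firstInserted, secondInserted)        // true false
print(recordsByID["1"]?.comment ?? "none")  // first
```

If "last write wins" is the desired policy instead, a plain recordsByID[record.id] = record does the job; the dictionary makes that choice explicit rather than leaving it to ==.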
Also, see matt's comment below. I have not spent a lot of time with the new diffable data sources. I remember when I first saw them that I was concerned they might be misusing Equatable. If that's true, then you may have to misuse Equatable to use them, and that would explain some tutorials that do it this way. That doesn't make it good Swift, but it may be required by Apple frameworks.
As I've studied Apple's code more (see matt's answer for many), I've noticed that they all follow the rule I discussed above: they are immutable and you cannot set the UUID during init. This construction makes it impossible for two values to have the same id but other values be different, so checking the id is always sufficient. But if you make the values mutable, or you allow the id to be anything other than let id = UUID(), then this construction becomes dangerous.
That is completely fine. There is only one requirement for Hashable: If a == b then a.hashValue == b.hashValue must also be true. This is fulfilled here, so your struct will work as a dictionary key or as a set member.
Note that this also is fulfilled, if your hash(into:) doesn’t combine any data (or only constant data) into the hasher. This will make hash table lookups slow, but they will still work.
Another option is to compare all fields in your == implementation but only use a subset of them for hashing in hash(into:). That still follows the rules (the other way around is not allowed of course). This may be useful as a performance optimization, but it also may hurt performance. Depends on the distribution of the data you are hashing.
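Here is a hedged sketch of that last option, reusing the Dog shape from the question (with id passed to the initializer, which is an assumption made so the example is easy to follow). Only hash(into:) is written by hand; == is left to the compiler, which compares every stored property.

```swift
import Foundation

struct Dog: Hashable {
    let id: UUID
    var name: String
    var age: Int

    // == is synthesized and compares id, name and age.

    // Hash a subset of the fields used by ==; equal values still hash equally.
    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
    }
}

let original = Dog(id: UUID(), name: "dog", age: 1)
var renamed = original
renamed.name = "another name"

print(original == renamed)                      // false: content differs
print(original.hashValue == renamed.hashValue)  // true: same id, same hash
print(Set([original, renamed]).count)           // 2: the collision is resolved by ==
```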
Whether it is correct or not to only use a subset of properties for a Hashable conformance completely depends on your requirements.
If, for a certain object, equality is really only defined by a single variable (or a subset of variables), then it is correct to use that subset of variables for the Hashable (and Equatable) conformances.
However, if all properties of a type are required to decide whether two instances are equal or not, then you should use all properties.
It is fine to have a type with multiple properties, including a UUID, where the conformance to Hashable and Equatable depends solely on the UUID and not on any of the other properties. Apple uses this pattern in their own code. Download Apple's example code from here:
https://docs-assets.developer.apple.com/published/6840986f9a/ImplementingModernCollectionViews.zip
Look at the WiFiController.Network struct, the MountainsController.Mountain struct, the OutlineViewController.OutlineItem class, and the InsertionSortArray.SortNode struct. They all do exactly this same thing. So, all of this code is by Apple:
struct Network: Hashable {
let name: String
let identifier = UUID()
func hash(into hasher: inout Hasher) {
hasher.combine(identifier)
}
static func == (lhs: Network, rhs: Network) -> Bool {
return lhs.identifier == rhs.identifier
}
}
struct Mountain: Hashable {
let name: String
let height: Int
let identifier = UUID()
func hash(into hasher: inout Hasher) {
hasher.combine(identifier)
}
static func == (lhs: Mountain, rhs: Mountain) -> Bool {
return lhs.identifier == rhs.identifier
}
func contains(_ filter: String?) -> Bool {
guard let filterText = filter else { return true }
if filterText.isEmpty { return true }
let lowercasedFilter = filterText.lowercased()
return name.lowercased().contains(lowercasedFilter)
}
}
class OutlineItem: Hashable {
let title: String
let subitems: [OutlineItem]
let outlineViewController: UIViewController.Type?
init(title: String,
viewController: UIViewController.Type? = nil,
subitems: [OutlineItem] = []) {
self.title = title
self.subitems = subitems
self.outlineViewController = viewController
}
func hash(into hasher: inout Hasher) {
hasher.combine(identifier)
}
static func == (lhs: OutlineItem, rhs: OutlineItem) -> Bool {
return lhs.identifier == rhs.identifier
}
private let identifier = UUID()
}
struct SortNode: Hashable {
let value: Int
let color: UIColor
init(value: Int, maxValue: Int) {
self.value = value
let hue = CGFloat(value) / CGFloat(maxValue)
self.color = UIColor(hue: hue, saturation: 1.0, brightness: 1.0, alpha: 1.0)
}
private let identifier = UUID()
func hash(into hasher: inout Hasher) {
hasher.combine(identifier)
}
static func == (lhs: SortNode, rhs: SortNode) -> Bool {
return lhs.identifier == rhs.identifier
}
}
Your suspicion is correct. The question of whether it's right (as you put it) is a matter of the domain. Excellent explanations have been given about the technicalities of hashing and equality.
Since your question has touched several times on DiffableDataSource you should be very cautious about designing your domain logic to suit the needs of the UI framework. Doing so violates the dependency inversion and open-closed principles.
Create a local data structure to use as the data source's item identifier, and copy only the properties you intentionally decide should trigger a reload.
typealias DataSource = UICollectionViewDiffableDataSource<Int, DogItem>
struct DogItem: Hashable {
var name: String // the cell will reload whenever the name changes
}
vs
struct DogItem: Hashable {
var id: Dog.ID // the cell will never change until explicitly told
}
As was mentioned, a common solution is to use a dictionary that maps an id to a value:
var items = [UUID:Item]()
However this might allow you to map the wrong value:
var item1 = Item(id: UUID())
var item2 = Item(id: UUID())
items[item1.id] = item2
Instead, create a reusable hashing wrapper data structure to explicitly use the identifier.
struct IDMap<T: Identifiable & Hashable> {
private var _items = [T.ID : T]()
init() { }
subscript(index: T.ID) -> T? {
get { return _items[index] }
}
mutating func insert(_ item: T) {
_items[item.id] = item
}
mutating func remove(_ item: T) {
_items.removeValue(forKey: item.id)
}
var items: Set<T> {
return Set(_items.values)
}
}
This is true. Your code meets the one requirement for Hashable; it compares only dog.id == dog1.id when you use dog == dog1.
If you want to check all the fields of the struct, then compare those fields in the == method.
static func == (lhs: Dog, rhs: Dog) -> Bool {
lhs.id == rhs.id && lhs.name == rhs.name && lhs.age == rhs.age
}

Remove duplicate class instances based on object identity

I'm writing a component for caching instances of classes. The classes are not per se Comparable, Hashable or Equatable. If they were, the semantics of the respective operations would not necessarily serve our purposes, so let's pretend we cannot use those protocols.
Objects can be cached w.r.t. multiple keys. So when asking the cache for a list of all cached objects, I need to remove duplicates from the value set of the underlying dictionary -- with respect to object identity.
Obviously, this does the job:
var result: [C] = []
for c in dict.values {
if !result.contains(where: { (rc: C) in rc === c }) {
result.append(c)
}
}
return result
However, this has quadratic runtime behaviour. Compared to the linearithmic or expected linear behaviour that is easy to get when using the above-mentioned protocols (via set implementations), this is bad.
So how can we efficiently remove duplicates w.r.t. object identity from a Swift collection?
We can wrap our objects into something that is Hashable and Comparable:
struct ClassWrap<T: AnyObject>: Hashable, Comparable {
    var value: T
    // Hash on object identity; hash(into:) replaces the deprecated hashValue requirement.
    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(value))
    }
    static func ==(lhs: ClassWrap, rhs: ClassWrap) -> Bool {
        return lhs.value === rhs.value
    }
    static func <(lhs: ClassWrap<T>, rhs: ClassWrap<T>) -> Bool {
        return ObjectIdentifier(lhs.value) < ObjectIdentifier(rhs.value)
    }
}
Now, any regular Set implementation or otherwise unique-fying operation should do the job.
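As a variation on the wrapper, ObjectIdentifier is itself Hashable, so a Set of identifiers can do the bookkeeping directly. The sketch below is one possible way to do it, not the only one; the helper name uniqueByIdentity and the CachedThing class are made up for the example. It runs in expected linear time and keeps the first occurrence of each object.

```swift
// Hypothetical cached class; it is neither Hashable nor Equatable.
final class CachedThing {
    let label: String
    init(_ label: String) { self.label = label }
}

/// Removes duplicates with respect to object identity (===),
/// preserving the order of first occurrence.
func uniqueByIdentity<T: AnyObject>(_ objects: [T]) -> [T] {
    var seen = Set<ObjectIdentifier>()
    // insert(_:) reports whether the identifier was new.
    return objects.filter { seen.insert(ObjectIdentifier($0)).inserted }
}

let x = CachedThing("x")
let y = CachedThing("y")
let cache: [String: CachedThing] = ["k1": x, "k2": y, "k3": x]

let unique = uniqueByIdentity(Array(cache.values))
print(unique.count)  // 2
```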

Using diff in an array of objects that conform to a protocol

I'm experimenting with using Composition instead of Inheritance and I wanted to use diff on an array of objects that comply with a given protocol.
To do so, I implemented a protocol and made it comply with Equatable:
// Playground - noun: a place where people can play
import XCPlayground
import Foundation
protocol Field:Equatable {
var content: String { get }
}
func ==<T: Field>(lhs: T, rhs: T) -> Bool {
return lhs.content == rhs.content
}
func ==<T: Field, U: Field>(lhs: T, rhs: U) -> Bool {
return lhs.content == rhs.content
}
struct First:Field {
let content:String
}
struct Second:Field {
let content:String
}
let items:[Field] = [First(content: "abc"), Second(content: "cxz")] // 💥 boom
But I've soon discovered that:
error: protocol 'Field' can only be used as a generic constraint because it has Self or associated type requirements
I understand why, since Swift is a type-safe language that needs to be able to know the concrete type of these objects at any time.
After tinkering around, I ended up removing Equatable from the protocol and overloading the == operator:
// Playground - noun: a place where people can play
import XCPlayground
import Foundation
protocol Field {
var content: String { get }
}
func ==(lhs: Field, rhs: Field) -> Bool {
return lhs.content == rhs.content
}
func ==(lhs: [Field], rhs: [Field]) -> Bool {
return (lhs.count == rhs.count) && (zip(lhs, rhs).map(==).reduce(true, { $0 && $1 })) // naive, but let's go with it for the sake of the argument
}
struct First:Field {
let content:String
}
struct Second:Field {
let content:String
}
// Requirement #1: direct object comparison
print(First(content: "abc") == First(content: "abc")) // true
print(First(content: "abc") == Second(content: "abc")) // false
// Requirement #2: being able to diff an array of objects complying with the Field protocol
let array1:[Field] = [First(content: "abc"), Second(content: "abc")]
let array2:[Field] = [Second(content: "abc")]
print(array1 == array2) // false
let outcome = array1.diff(array2) // 💥 boom
error: value of type '[Field]' has no member 'diff'
From here on, I'm a bit lost to be honest. I read some great posts about type erasure but even the provided examples suffered from the same issue (which I assume is the lack of conformance to Equatable).
Am I right? And if so, how can this be done?
UPDATE:
I had to stop this experiment for a while and totally forgot about a dependency, sorry! Diff is a method provided by SwiftLCS, an implementation of the longest common subsequence (LCS) algorithm.
TL;DR:
The Field protocol needs to comply with Equatable but so far I have not been able to do this. I need to be able to create an array of objects that comply to this protocol (see the error in the first code block).
Thanks again
The problem comes from a combination of the meaning of the Equatable protocol and Swift’s support for type overloaded functions.
Let’s take a look at the Equatable protocol:
protocol Equatable
{
    static func == (lhs: Self, rhs: Self) -> Bool
}
What does this mean? Well it’s important to understand what “equatable” actually means in the context of Swift. “Equatable” is a trait of a structure or class that make it so that any instance of that structure or class can be compared for equality with any other instance of that structure or class. It says nothing about comparing it for equality with an instance of a different class or structure.
Think about it. Int and String are both types that are Equatable. 13 == 13 and "meredith" == "meredith". But does 13 == "meredith"?
The Equatable protocol only cares about when both things to be compared are of the same type. It says nothing about what happens when the two things are of different types. That’s why both arguments in the definition of ==(_:_:) are of type Self.
Let’s look at what happened in your example.
protocol Field:Equatable
{
var content:String { get }
}
func ==<T:Field>(lhs:T, rhs:T) -> Bool
{
return lhs.content == rhs.content
}
func ==<T:Field, U:Field>(lhs:T, rhs:U) -> Bool
{
return lhs.content == rhs.content
}
You provided two overloads for the == operator. But only the first one has to do with Equatable conformance. The second overload is the one that gets applied when you do
First(content: "abc") == Second(content: "abc")
which has nothing to do with the Equatable protocol.
Here’s a point of confusion. Equability across instances of the same type is a lower requirement than equability across instances of different types when we’re talking about individually bound instances of types you want to test for equality. (Since we can assume both things being tested are of the same type.)
However, when we make an array of things that conform to Equatable, this is a higher requirement than making an array of things that can be tested for equality, since what you are saying is that every item in the array can be compared as if they were both of the same type. But since your structs are of different types, you can’t guarantee this, and so the code fails to compile.
Here’s another way to think of it.
Protocols without associated type requirements, and protocols with associated type requirements are really two different animals. Protocols without Self basically look and behave like types. Protocols with Self are traits that types themselves conform to. In essence, they go “up a level”, like a type of type. (Related in concept to metatypes.)
That’s why it makes no sense to write something like this:
let array:[Equatable] = [5, "a", false]
You can write this:
let array:[Int] = [5, 6, 7]
or this:
let array:[String] = ["a", "b", "c"]
or this:
let array:[Bool] = [false, true, false]
Because Int, String, and Bool are types. Equatable isn’t a type, it’s a type of a type.
It would make “sense” to write something like this…
let array:[Equatable] = [Int.self, String.self, Bool.self]
though this is really stretching the bounds of type-safe programming and so Swift doesn’t allow this. You’d need a fully flexible metatyping system like Python’s to express an idea like that.
So how do we solve your problem? Well, first of all realize that the only reason it makes sense to apply SwiftLCS on your array is because, at some level, all of your array elements can be reduced to an array of keys that are all of the same Equatable type. In this case, it’s String, since you can get an array keys:[String] by doing [Field](...).map{ $0.content }. Perhaps if we redesigned SwiftLCS, this would make a better interface for it.
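The "reduce to keys" idea can be sketched without SwiftLCS at all. The example below swaps in the standard library's difference(from:) (Swift 5.1 and later; on Apple platforms it needs macOS 10.15 / iOS 13 or newer) purely as a stand-in for SwiftLCS's diff, whose API is not shown here. The protocol and structs mirror the ones in the question.

```swift
protocol Field {
    var content: String { get }
}

struct First: Field { let content: String }
struct Second: Field { let content: String }

let array1: [Field] = [First(content: "abc"), Second(content: "abc")]
let array2: [Field] = [Second(content: "abc")]

// Project the heterogeneous elements onto a single Equatable key type.
let keys1 = array1.map { $0.content }
let keys2 = array2.map { $0.content }

// Changes needed to turn keys2 into keys1, computed on plain [String].
let changes = keys1.difference(from: keys2)

print(changes.insertions.count)  // 1
print(changes.removals.count)    // 0
```

The offsets carried by each change refer to positions in the key arrays, which line up one-to-one with positions in the original [Field] arrays.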
However, since we can only compare our array of Fields directly, we need to make sure they can all be upcast to the same type, and the way to do that is with inheritance.
class Field:Equatable
{
let content:String
static func == (lhs:Field, rhs:Field) -> Bool
{
return lhs.content == rhs.content
}
init(_ content:String)
{
self.content = content
}
}
class First:Field
{
init(content:String)
{
super.init(content)
}
}
class Second:Field
{
init(content:String)
{
super.init(content)
}
}
let items:[Field] = [First(content: "abc"), Second(content: "cxz")]
The array then upcasts them all to type Field which is Equatable.
By the way, ironically, the “protocol-oriented” solution to this problem actually still involves inheritance. The SwiftLCS API would provide a protocol like
protocol LCSElement
{
associatedtype Key:Equatable
var key:Key { get }
}
We would specialize it with a superclass
class Field:LCSElement
{
let key:String // <- this is what specializes Key to a concrete type
static func == (lhs:Field, rhs:Field) -> Bool
{
return lhs.key == rhs.key
}
init(_ key:String)
{
self.key = key
}
}
and the library would use it as
func LCS<T: LCSElement>(array:[T])
{
array[0].key == array[1].key
...
}
Protocols and Inheritance are not opposites or substitutes for one another. They complement each other.
I know this is probably not what you want, but the only way I know to make it work is to introduce an additional wrapper type:
struct FieldEquatableWrapper: Equatable {
let wrapped: Field
public static func ==(lhs: FieldEquatableWrapper, rhs: FieldEquatableWrapper) -> Bool {
return lhs.wrapped.content == rhs.wrapped.content
}
public static func diff(_ coll: [Field], _ otherCollection: [Field]) -> Diff<Int> {
let w1 = coll.map({ FieldEquatableWrapper(wrapped: $0) })
let w2 = otherCollection.map({ FieldEquatableWrapper(wrapped: $0) })
return w1.diff(w2)
}
}
and then you can do
let outcome = FieldEquatableWrapper.diff(array1, array2)
I don't think you can make Field conform to Equatable at all, as Equatable is designed to be "type-safe" through its use of Self. That is one reason for the wrapper. Unfortunately there seems to be one more issue that I don't know how to fix: I can't put this "wrapped" diff into a Collection or Array extension and still have it support a heterogeneous [Field] array without this compilation error:
using 'Field' as a concrete type conforming to protocol 'Field' is not supported
If anyone knows a better solution, I'm interested as well.
P.S.
In the question you mention that
print(First(content: "abc") == Second(content: "abc")) // false
but I expect that to be true given the way you defined your == operator
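The wrapper idea from this answer can be tried without the diff library. The sketch below assumes `Field` is a plain protocol exposing `content` (the question's actual definition is not shown here), and it uses ordinary array equality in place of `diff`; the sample arrays are illustrative.

```swift
// Assumption: Field is a plain protocol, so [Field] can hold mixed types.
protocol Field {
    var content: String { get }
}

struct First: Field { let content: String }
struct Second: Field { let content: String }

// Equatable wrapper: compares two wrapped fields by content only.
struct FieldEquatableWrapper: Equatable {
    let wrapped: Field

    static func == (lhs: FieldEquatableWrapper, rhs: FieldEquatableWrapper) -> Bool {
        return lhs.wrapped.content == rhs.wrapped.content
    }
}

let array1: [Field] = [First(content: "abc"), Second(content: "xyz")]
let array2: [Field] = [Second(content: "abc"), First(content: "xyz")]

// Wrapping turns a heterogeneous [Field] into an array of Equatable values.
let wrapped1 = array1.map { FieldEquatableWrapper(wrapped: $0) }
let wrapped2 = array2.map { FieldEquatableWrapper(wrapped: $0) }

// Array == is available because the element type is Equatable.
let equalByContent = wrapped1 == wrapped2
```

Any API that requires `Equatable` elements, such as a diffing library, can then be handed `wrapped1` and `wrapped2`.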

Extension for sequences of dictionaries where the values are Equatable

I tried to implement the following method to remove double entries in an array of dictionaries by comparing their specific keys. However, this extension method will not work due to the error:
Binary operator == cannot be applied to two 'Equatable' operands
These are obviously equatable and same type (Iterator.Element.Value), so why doesn't it work?
I see that it treats Equatable as a specific type, not a constraint. I could not make it work with generic type or by writing where Iterator.Element == [String: Any], Iterator.Element.Value: Equatable.
Do you guys have any clues about how to solve this?
extension Sequence where Iterator.Element == [String: Equatable] {
public func removeDoubles(byKey uniqueKey: String) -> [Iterator.Element] {
var uniqueValues: [Iterator.Element.Value] = []
var noDoubles: [Iterator.Element] = []
for item in self {
if let itemValue = item[uniqueKey] {
if !(uniqueValues.contains { element in
return itemValue == element
}) {
uniqueValues.append(itemValue)
noDoubles.append(item)
}
}
}
return noDoubles
}
}
A [String: Equatable] is a mapping of strings to any Equatable type. There is no promise that each value be the same equatable type. That said, it's not actually possible to create such a dictionary (since Equatable has a Self requirement), so this extension cannot apply to any actual type in Swift. (The fact that you don't receive an error here is IMO a bug in the compiler.)
The feature you'd need to make this work is SE-0142, which is accepted, but not implemented. You currently cannot constrain an extension based on type constraints this way.
There are many ways to achieve what you're trying to do. One straightforward way is to pass your equality function:
extension Sequence {
public func removeDoubles(with equal: (Iterator.Element, Iterator.Element) -> Bool) -> [Iterator.Element] {
var noDoubles: [Iterator.Element] = []
for item in self {
if !noDoubles.contains(where: { equal($0, item) }) {
noDoubles.append(item)
}
}
return noDoubles
}
}
let noDupes = dict.removeDoubles(with: { $0["name"] == $1["name"] })
This is slightly different than your code in how it behaves when name is missing, but slight tweaks could get what you want.
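One such tweak, sketched here under the assumption of a recent Swift toolchain, is to put the generic parameter on the method rather than on the extension, so the dictionary's value type is a single concrete `Equatable` type. The sample data and the name `people` are made up for illustration.

```swift
extension Sequence {
    // Keeps the first dictionary seen for each distinct value stored under uniqueKey.
    // Dictionaries that lack the key are dropped.
    func removeDoubles<Value: Equatable>(byKey uniqueKey: String) -> [Element]
        where Element == [String: Value] {
        var uniqueValues: [Value] = []
        var noDoubles: [Element] = []
        for item in self {
            guard let itemValue = item[uniqueKey] else { continue }
            if !uniqueValues.contains(itemValue) {
                uniqueValues.append(itemValue)
                noDoubles.append(item)
            }
        }
        return noDoubles
    }
}

let people: [[String: String]] = [
    ["name": "Ann", "city": "Oslo"],
    ["name": "Bob", "city": "Rome"],
    ["name": "Ann", "city": "Lima"],
    ["city": "Kyiv"],   // no "name" key, so it is skipped
]

let uniquePeople = people.removeDoubles(byKey: "name")
```

With this data, `uniquePeople` keeps the Oslo and Rome entries: the second "Ann" is a duplicate and the last dictionary has no "name" at all.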
That said, the need for this strongly suggests an incorrect data model. If you have this sequence of dictionaries, and you're trying to build an extension on that, you almost certainly meant to have a sequence of structs. Then this becomes more straightforward. The point of a dictionary is an arbitrary mapping of keys to values. If you have a small set of known keys that are legal, that's really a struct.
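A minimal sketch of the struct-based version follows. `Person` is a hypothetical stand-in for whatever the dictionaries represent, and `removeDoubles(by:)` is an illustrative helper that assumes the chosen key is `Hashable`.

```swift
// Hypothetical record type standing in for the dictionaries.
struct Person {
    let name: String
    let city: String
}

extension Sequence {
    // Keeps the first element seen for each distinct key.
    func removeDoubles<Key: Hashable>(by key: (Element) -> Key) -> [Element] {
        var seen = Set<Key>()
        // insert(_:) reports whether the key was new, so only first occurrences pass.
        return filter { seen.insert(key($0)).inserted }
    }
}

let people = [
    Person(name: "Ann", city: "Oslo"),
    Person(name: "Bob", city: "Rome"),
    Person(name: "Ann", city: "Lima"),
]

let uniquePeople = people.removeDoubles(by: { $0.name })
```

There is no "missing key" case to handle any more, because every `Person` is guaranteed to have a `name`.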

Check if a type implements a protocol

I am writing a library that creates extensions for default Swift types.
I would like to have a check on my Array extensions whether a certain type implements a certain protocol. See this method for example:
extension Array {
/// Compares the items using the given comparer and only returns non-equal values
/// :returns: the first items that are unique according to the comparer
func distinct(comparer: (T, T) -> Bool) -> [T] {
var result: [T] = []
outerLoop: for item in self {
for resultItem in result {
if comparer(item, resultItem) {
continue outerLoop
}
}
result.append(item)
}
return result
}
}
Now I'd like to rewrite this method to check if T is Equatable as such:
/// Compares the items using the given comparer and only returns non-equal values
/// :returns: the first items that are unique according to the comparer
func distinct(comparer: ((T, T) -> Bool)?) -> [T] {
var result: [T] = []
outerLoop: for item in self {
for resultItem in result {
if isEquatable ? item == resultItem : comparer!(item, resultItem) {
continue outerLoop
}
}
result.append(item)
}
return result
}
where isEquatable is a Bool value that tells me if T is Equatable. How can I find this out?
There isn’t a good way to do this in Swift at the moment.* This is why functions like sorted are either free-functions, or in the case of the member, take a predicate. The main problem with the test-and-cast approach you’re looking for is that Equatable and similar protocols have an associated type or rely on Self, and so can only be used inside a generic function as a constraint.
I’m guessing your goal is that the caller can skip supplying the comparator function, and so it will fall back to Equatable if available? And crash if it isn’t? The problem here is that the function would be deciding at run time something (whether the argument is Equatable) that really ought to be decided at compile time.
So you can write a free function that requires Equatable:
func distinct<C: CollectionType where C.Generator.Element: Equatable>
(source: C) -> [C.Generator.Element] {
var seen: [C.Generator.Element] = []
return filter(source) {
if contains(seen, $0) {
return false
}
else {
seen.append($0)
return true
}
}
}
let uniques = distinct([1,2,3,1,1,2]) // [1,2,3]
and then if you tried to call it with something that wasn’t comparable, you’d get a compile-time error:
let incomparable = [1,2,3] as [Any]
distinct(incomparable) // compiler barfs - Any isn’t Equatable
With the runtime approach, you’d only find this out when you ran the program.
The good news is, there are upsides too. The problem with searching an array for each element is the function will be very slow for large arrays, because for every element, the list of already-seen elements must be searched linearly. If you overload distinct with another version that requires the elements be Hashable (which Equatable things often are), you can use a set to track them:
func distinct<C: CollectionType where C.Generator.Element: Hashable>
(source: C) -> [C.Generator.Element] {
var seen: Set<C.Generator.Element> = []
return filter(source) {
if seen.contains($0) {
return false
}
else {
seen.insert($0)
return true
}
}
}
At compile time, the compiler will choose the best possible version of the function and use that. If your thing is hashable, that version gets picked; if it’s only equatable, the slower one is used (this is because Hashable inherits from Equatable, and the compiler picks the more specialized function). Doing this at compile time instead of run time means you pay no penalty for the check; it’s all determined up front.
*there are ugly ways, but since the goal is appealing syntax, what’s the point… Perhaps the next version will allow constraints on methods, which would be nice.
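Later Swift versions did add constrained protocol extensions, so the same pair of overloads can be written as methods. The following is a sketch in current Swift syntax rather than the syntax used above; `Point` is a hypothetical type added only to exercise the Equatable-only path.

```swift
extension Sequence where Element: Equatable {
    // Fallback: linear search through the elements kept so far, O(n^2) overall.
    func distinct() -> [Element] {
        var seen: [Element] = []
        for element in self where !seen.contains(element) {
            seen.append(element)
        }
        return seen
    }
}

extension Sequence where Element: Hashable {
    // More specialized overload: a Set makes each membership check
    // constant time on average.
    func distinct() -> [Element] {
        var seen = Set<Element>()
        return filter { seen.insert($0).inserted }
    }
}

// Equatable but not Hashable, so only the first overload applies.
struct Point: Equatable {
    let x: Int
    let y: Int
}

let uniqueInts = [1, 2, 3, 1, 1, 2].distinct()
let uniquePoints = [
    Point(x: 0, y: 0),
    Point(x: 1, y: 1),
    Point(x: 0, y: 0),
].distinct()
```

Both versions preserve first-occurrence order. Calling `distinct()` on something like `[Any]` still fails at compile time, since `Any` satisfies neither constraint.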