Does Swift know not to initialize an object if my set already contains it? - swift

I have a global var notes: Set<Note> that contains notes initialized with downloaded data.
In the code below, does Swift know to skip the initialization of my Note object if notes already contains it?
for dictionary in downloadedNoteDictionaries {
let note = Note(dictionary: dictionary)
notes.insert(note)
}
I'm wondering because my app downloads dozens of notes per request and initializing a Note object seems rather computationally expensive.
If the answer to my question is no, then how could I improve my code's performance?
My Note class—which I just realized should probably be a struct instead—has the property let id: Int64 as its sole essential component, but apparently, you can't access an element of a set by its hash value? I don't want to use Set's instance method first(where:) because it has a complexity of O(n), and notes could contain millions of Note objects.

You cannot rely on Swift to eliminate the construction of a new Note in your code. Your Set needs to ask the Note for its hashValue, and may need to call == with your Note as an argument. Those computations require the Note object. Possibly if Swift can inline everything, it can notice that your hashValue and == depend only on the id property, but it is certainly not guaranteed to notice or to act on that information.
It sounds like you should be using an [Int64: Note] instead of a Set<Note>.

No, Swift will not avoid creating the new Note object. The problem here is trying to determine if an object already exists in a set. In order to check if an object already exists in the set, you must have some way to identify this object consistently and have that identification persist across future reads and writes to that set. Assuming we want to adopt Swift's hashing enhancements which deprecates the old methods for having to manually provide a hashValue for any object conforming to the Hashable, we should not use a Set as a solution to this problem. The reason is because Swift's new recommended hashing methods use a random seed to generate hashes for added security. Depending on hash values to identify an element in a set alone would therefore not be possible.
It seems that your method of identifying these objects are by an id. I would do as Rob suggests and use a dictionary where the keys are the id. This would help you answer existential questions in order avoid instantiating a Note object.
If you need the resulting dictionary as a Set, you can still create a new Set from this resulting dictionary sequence.
It is possible to pull out a particular element from a set as long as you know how to identify it, and the operations would be O(1). You would use these two methods:
https://developer.apple.com/documentation/swift/set/2996833-firstindex
https://developer.apple.com/documentation/swift/set/2830801-subscript
Here is an example that I was able to run in a playground. However, this assumes that you have a way to identify in your data / model the id of the Note object beforehand in order to avoid creating the Note object. In this case, I am assuming that a Note shares an id with a Dictionary it holds:
import UIKit
struct DictionaryThatNoteUses {
var id: String
init(id: String = UUID().uuidString) {
self.id = id
}
}
struct Note: Hashable {
let dictionary: DictionaryThatNoteUses
var id: String {
return dictionary.id
}
func hash(into hasher: inout Hasher) {
hasher.combine(id)
}
static func == (lhs: Note, rhs: Note) -> Bool {
return lhs.id == rhs.id
}
}
var downloadedNoteDictionaries: [DictionaryThatNoteUses] = [
DictionaryThatNoteUses(),
DictionaryThatNoteUses(),
DictionaryThatNoteUses()
]
var notesDictionary: [String: Note] = [:]
var notesSet: Set<Note> = []
// Add a dictionary with the same id as the first dictionary
downloadedNoteDictionaries.append(downloadedNoteDictionaries.first!) // so now there are four Dictionary objects in this array
func createDictionaryOfNotes() {
func avoidCreatingDuplicateNotesObject(with id: String) {
print("avoided creating duplicate notes object with id \(id)") // prints once because of the duplicated note at the end matching the first, so a new Note object is not instantiated
}
downloadedNoteDictionaries.forEach {
guard notesDictionary[$0.id] == nil else { return avoidCreatingDuplicateNotesObject(with: $0.id) }
let note = Note(dictionary: $0)
notesDictionary[note.id] = note
}
}
createDictionaryOfNotes()
// Obtain a set for set operations on those unique Note objects
notesSet = Set<Note>(notesDictionary.values)
print("number of items in dictionary = \(notesDictionary.count), number of items in set = \(notesSet.count)") // prints 3 and 3 because the 4th object was a duplicate
// Grabbing a specific element from the set
// Let's test with the second Note object from the notesDictionary
let secondNotesObjectFromDictionary = notesDictionary.values[notesDictionary.values.index(notesDictionary.values.startIndex, offsetBy: 1)]
let thirdNotesObjectFromDictionary = notesDictionary.values[notesDictionary.values.index(notesDictionary.values.startIndex, offsetBy: 2)]
if let secondNotesObjectIndexInSet = notesSet.firstIndex(of: secondNotesObjectFromDictionary) {
print("do the two objects match: \(notesSet[secondNotesObjectIndexInSet] == secondNotesObjectFromDictionary)") // prints true
print("does the third object from dictionary match the second object from the set: \(thirdNotesObjectFromDictionary == notesSet[secondNotesObjectIndexInSet])") // prints false
}

Related

Forcing an Encoder's UnkeyedEncodingContainer to only contain one type of value

As part of a custom Encoder, I am coding an UnkeyedEncodingContainer. However, the specific format I am making it for asks that all elements of an array be of the same type. Specifically, arrays can contain :
Integers one same size
Floats or Doubles
Other arrays (not necessarily all containing the same kinds of elements)
Objects
Here is the type of answer I need : The basis of an UnkeyedEncodingContainer implementation that conforms to the protocol, and enforces that all elements be of one same type among the above specified ones.
As requested, here are examples of things that should or should not be encodable :
var valid1 = []
var valid2 = [3, 3, 5, 9]
var valid3 = ["string", "array"]
var invalid1 = [3, "test"]
var invalid2 = [5, []]
var invalid3 = [[3, 5], {"hello" : 3}]
// These may not all even be valid Swift arrays, they are only
// intended as examples
As an example, here is the best I have come up with, which does not work :
The UnkeyedEncodingContainer contains a function, checkCanEncode, and an instance variable, ElementType :
var elementType : ElementType {
if self.count == 0 {
return .None
} else {
return self.storage[0].containedType
}
}
func checkCanEncode(_ value : Any?, compatibleElementTypes : [ElementType]) throws {
guard compatibleElementTypes.contains(self.elementType) || self.elementType == .None else {
let context = EncodingError.Context(
codingPath: self.nestedCodingPath,
debugDescription: "Cannot encode value to an array of \(self.elementType)s"
)
throw EncodingError.invalidValue(value as Any, context)
}
}
// I know the .None is weird and could be replaced by an optional,
// but it is useful as its rawValue is 0. The Encoder has to encode
// the rawValue of the ElementType at some point, so using an optional
// would actually be more complicated
Everything is then encoded as a contained singleValueContainer :
func encode<T>(_ value: T) throws where T : Encodable {
let container = self.nestedSingleValueContainer()
try container.encode(value)
try checkCanEncode(value, compatibleElementTypes: [container.containedType])
}
// containedType is an instance variable of SingleValueContainer that is set
// when a value is encoded into it
But this causes an issue when it comes to nestedContainer and nestedUnkeyedContainer : (used for stored dictionaries and arrays respectively)
// This violates the protocol, this function should not be able to throw
func nestedContainer<NestedKey>(keyedBy keyType: NestedKey.Type) throws -> KeyedEncodingContainer<NestedKey> where NestedKey : CodingKey {
let container = KeyedContainer<NestedKey>(
codingPath: self.nestedCodingPath,
userInfo: self.userInfo
)
try checkCanEncode(container, compatibleElementTypes: [.Dictionary])
self.storage.append(container)
return KeyedEncodingContainer(container)
}
As you can see, since I need checkCanEncode to know whether it is even possible to create a NestedContainer in the first place (because if the array already has stuff inside that aren't dictionaries, then adding dictionaries to it is invalid), I have to make the function throw. But this breaks the UnkeyedEncodingContainer protocol which demands non-throwing versions.
But I can't just handle the error inside the function ! If something tries to put an array inside an array of integers, it must fail. Therefore this is an invalid solution.
Additional remarks :
Checking after having encoded the values already feels sketchy, but checking only when producing the final encoded payload is definitely a violation of the "No Zombies" principle (fail as soon as the program enters an invalid state) which I would rather avoid. However if no better solution is possible I may accept it as a last resort.
One other solution I have thought about is encoding the array as a dictionary with numbered keys, since dictionaries in this format may contain mixed types. However this is likely to pose decoding issues, so once again, it is a last resort.
You will be advised not to edit other people’s questions. If you have edits to suggest please do so in the comments, otherwise mind your own business
Unless anyone has a better idea, here is the best I could come up with :
Do not enforce that all elements be of the same type inside the UnkeyedEncodingContainer
If all elements are the same type, encode it as an array
If elements have varying types, encode it as a dictionary with integers as keys
This is completely fine as far as the encoding format goes, has minimal costs and only slightly complicates decoding (check whether keys contain integers) and greatly widens how many different Swift object will be compatible with the format.
Note : Remember that the "real" encoding step where the data is generated is not actually part of the protocol. That is where I am proposing the shenanigans should take place 😈
I don't know whether this helps you but in Swift you can overload functions. This means that you can declare functions with the same signature but with different parameter types or constraints. The compiler will take always the right choice. It's much more efficient than a type check at runtime.
The first method is called if the array conforms to Encodable
func encode<T: Encodable>(object: [T]) throws -> Data {
return try JSONEncoder().encode(object)
}
Otherwise the second method is called. If the array cannot even be encoded with JSONSerialization you can add custom encoding logic.
func encode<T>(object: [T]) throws -> Data {
do {
return try JSONSerialization.data(withJSONObject: object)
} catch {
// do custom encoding and return Data
return try myCustomEncoding(object)
}
}
This example
let array = [["1":1, "2":2]]
try encode(object: array)
calls the first method.
On the other hand this – even if the actual type is not heterogenous – calls the second method
let array : [[String:Any]] = [["1":1, "2":2]]
try encode(object: array)

Set's contains method returns different value at different time

I was thinking about how Swift ensures uniqueness for Set because I have turned one of my obj from Equatable to Hashable for free and so I came up with this simple Playground
struct SimpleStruct: Hashable {
let string: String
let number: Int
static func == (lhs: SimpleStruct, rhs: SimpleStruct) -> Bool {
let areEqual = lhs.string == rhs.string
print(lhs, rhs, areEqual)
return areEqual
}
}
var set = Set<SimpleStruct>()
let first = SimpleStruct(string: "a", number: 2)
set.insert(first)
So my first question was:
Will the static func == method be called anytime I insert a new obj inside the set?
My question comes from this thought:
For Equatable obj, in order to make this decision, the only way to ensure two obj are the same is to ask the result of static func ==.
For Hashable obj, a faster way is to compare hashValues... but, like in my case, the default implementation will use both string and number, in contrast with == logic.
So, in order to test how Set behaves, I have just added a print statement.
I have figured out that sometimes I got the print statement, sometimes no. Like sometimes hashValue isn't enough in order to make this decision ... So the method hasn't been called every time.
Weird...
So I've tried to add two objects that are equal and wondering what will be the result of set.contains
let second = SimpleStruct(string: "a", number: 3)
print(first == second) // returns true
set.contains(second)
And wonders of wonders, launching a couple of times the playground, I got different results and this might cause unpredictable results ...
Adding
var hashValue: Int {
return string.hashValue
}
it gets rid of any unexpected results but my doubt is:
Why, without the custom hashValue implementation, == sometimes gets called and sometimes it doesn't?
Should Apple avoid this kind of unexpected behaviours?
The synthesized implementation of the Hashable requirement uses all stored
properties of a struct, in your case string and number. Your implementation
of == is only based on the string:
let first = SimpleStruct(string: "a", number: 2)
let second = SimpleStruct(string: "a", number: 3)
print(first == second) // true
print(first.hashValue == second.hashValue) // false
This is a violation of a requirement of the Hashable protocol:
Two instances that are equal must feed the same values to Hasher in hash(into:), in the same order.
and causes the undefined behavior. (And since hash values are randomized
since Swift 4.2, the behavior can be different in each program run.)
What probably happens in your test is that the hash value of second is used to determine the “bucket” of the set in which the value
would be stored. That may or may not be the same bucket in which first is stored. – But that is an implementation detail: Undefined behavior is undefined behavior, it can cause unexpected results or even
runtime errors.
Implementing
var hashValue: Int {
return string.hashValue
}
or alternatively (starting with Swift 4.2)
func hash(into hasher: inout Hasher) {
hasher.combine(string)
}
fixes the rule violation, and therefore makes your code behave as expected.

Swift - Detecting whether item was inserted into NSMutableSet

This is more for interest rather than a problem, but I have an NSMutableSet, retrieved from UserDefaults and my objective is to append an item to it and then write it back. I am using an NSMutableSet because I only want unique items to be inserted.
The type of object to be inserted is a custom class, I have overrode hashCode and isEqual.
var stopSet: NSMutableSet = []
if let ud = UserDefaults.standard.object(forKey: "favStops") as? Data {
stopSet = NSKeyedUnarchiver.unarchiveObject(with: ud) as! NSMutableSet
}
stopSet.add(self.theStop!)
let outData = NSKeyedArchiver.archivedData(withRootObject: stopSet)
UserDefaults.standard.set(outData, forKey: "favStops")
NSLog("Saved to UserDefaults")
I get the set, call mySet.add(obj) and then write the set back to UserDefaults. Everything seems to work fine and (as far as I can see) there don't appear to be duplicates.
However is it possible to tell whether a call to mySet.add(obj) actually caused an item to be written to the set. mySet.add(obj) doesn't have a return value and if you use Playgrounds (rather than a project) you get in the output on the right hand side an indication of whether the set was actually changed based on the method call.
I know sets are not meant to store duplicate objects so in theory I should just trust that, but I was just wondering if the set did return a response that you could access - as opposed to just getting the length before the insert and after if I really wanted to know!
Swift has its own native type, Set, so you should use it instead of NSMutableSet.
Set's insert method actually returns a Bool indicating whether the insertion succeeded or not, which you can see in the function signature:
mutating func insert(_ newMember: Element) -> (inserted: Bool, memberAfterInsert: Element)
The following test code showcases this behaviour:
var set = Set<Int>()
let (inserted, element) = set.insert(0)
let (again, newElement) = set.insert(0)
print(inserted,element) //true, 0
print(again,oldElement) //false,0
The second value of the tuple returns the newly inserted element in case the insertion succeeded and the oldElement otherwise. oldElement is not necessarily equal in every aspect to the element you tried to insert. (since for custom types you might define the isEqual method in a way that doesn't compare each property of the type).
You don't need to handle the return value of the insert function, there is no compiler warning if you just write insert like this:
set.insert(1)

Swift semantics regarding dictionary access

I'm currently reading the excellent Advanced Swift book from objc.io, and I'm running into something that I don't understand.
If you run the following code in a playground, you will notice that when modifying a struct contained in a dictionary a copy is made by the subscript access, but then it appears that the original value in the dictionary is replaced by the copy. I don't understand why. What exactly is happening ?
Also, is there a way to avoid the copy ? According to the author of the book, there isn't, but I just want to be sure.
import Foundation
class Buffer {
let id = UUID()
var value = 0
func copy() -> Buffer {
let new = Buffer()
new.value = self.value
return new
}
}
struct COWStruct {
var buffer = Buffer()
init() { print("Creating \(buffer.id)") }
mutating func change() -> String {
if isKnownUniquelyReferenced(&buffer) {
buffer.value += 1
return "No copy \(buffer.id)"
} else {
let newBuffer = buffer.copy()
newBuffer.value += 1
buffer = newBuffer
return "Copy \(buffer.id)"
}
}
}
var array = [COWStruct()]
array[0].buffer.value
array[0].buffer.id
array[0].change()
array[0].buffer.value
array[0].buffer.id
var dict = ["key": COWStruct()]
dict["key"]?.buffer.value
dict["key"]?.buffer.id
dict["key"]?.change()
dict["key"]?.buffer.value
dict["key"]?.buffer.id
// If the above `change()` was made on a copy, why has the original value changed ?
// Did the copied & modified struct replace the original struct in the dictionary ?
dict["key"]?.change() // Copy
is semantically equivalent to:
if var value = dict["key"] {
value.change() // Copy
dict["key"] = value
}
The value is pulled out of the dictionary, unwrapped into a temporary, mutated, and then placed back into the dictionary.
Because there's now two references to the underlying buffer (one from our local temporary value, and one from the COWStruct instance in the dictionary itself) – we're forcing a copy of the underlying Buffer instance, as it's no longer uniquely referenced.
So, why doesn't
array[0].change() // No Copy
do the same thing? Surely the element should be pulled out of the array, mutated and then stuck back in, replacing the previous value?
The difference is that unlike Dictionary's subscript which comprises of a getter and setter, Array's subscript comprises of a getter and a special accessor called mutableAddressWithPinnedNativeOwner.
What this special accessor does is return a pointer to the element in the array's underlying buffer, along with an owner object to ensure that the buffer isn't deallocated from under the caller. Such an accessor is called an addressor, as it deals with addresses.
Therefore when you say:
array[0].change()
you're actually mutating the actual element in the array directly, rather than a temporary.
Such an addressor cannot be directly applied to Dictionary's subscript because it returns an Optional, and the underlying value isn't stored as an optional. So it currently has to be unwrapped with a temporary, as we cannot return a pointer to the value in storage.
In Swift 3, you can avoid copying your COWStruct's underlying Buffer by removing the value from the dictionary before mutating the temporary:
if var value = dict["key"] {
dict["key"] = nil
value.change() // No Copy
dict["key"] = value
}
As now only the temporary has a view onto the underlying Buffer instance.
And, as #dfri points out in the comments, this can be reduced down to:
if var value = dict.removeValue(forKey: "key") {
value.change() // No Copy
dict["key"] = value
}
saving on a hashing operation.
Additionally, for convenience, you may want to consider making this into an extension method:
extension Dictionary {
mutating func withValue<R>(
forKey key: Key, mutations: (inout Value) throws -> R
) rethrows -> R? {
guard var value = removeValue(forKey: key) else { return nil }
defer {
updateValue(value, forKey: key)
}
return try mutations(&value)
}
}
// ...
dict.withValue(forKey: "key") {
$0.change() // No copy
}
In Swift 4, you should be able to use the values property of Dictionary in order to perform a direct mutation of the value:
if let index = dict.index(forKey: "key") {
dict.values[index].change()
}
As the values property now returns a special Dictionary.Values mutable collection that has a subscript with an addressor (see SE-0154 for more info on this change).
However, currently (with the version of Swift 4 that ships with Xcode 9 beta 5), this still makes a copy. This is due to the fact that both the Dictionary and Dictionary.Values instances have a view onto the underlying buffer – as the values computed property is just implemented with a getter and setter that passes around a reference to the dictionary's buffer.
So when calling the addressor, a copy of the dictionary's buffer is triggered, therefore leading to two views onto COWStruct's Buffer instance, therefore triggering a copy of it upon change() being called.
I have filed a bug over this here. (Edit: This has now been fixed on master with the unofficial introduction of generalised accessors using coroutines, so will be fixed in Swift 5 – see below for more info).
In Swift 4.1, Dictionary's subscript(_:default:) now uses an addressor, so we can efficiently mutate values so long as we supply a default value to use in the mutation.
For example:
dict["key", default: COWStruct()].change() // No copy
The default: parameter uses #autoclosure such that the default value isn't evaluated if it isn't needed (such as in this case where we know there's a value for the key).
Swift 5 and beyond
With the unofficial introduction of generalised accessors in Swift 5, two new underscored accessors have been introduced, _read and _modify which use coroutines in order to yield a value back to the caller. For _modify, this can be an arbitrary mutable expression.
The use of coroutines is exciting because it means that a _modify accessor can now perform logic both before and after the mutation. This allows them to be much more efficient when it comes to copy-on-write types, as they can for example deinitialise the value in storage while yielding a temporary mutable copy of the value that's uniquely referenced to the caller (and then reinitialising the value in storage upon control returning to the callee).
The standard library has already updated many previously inefficient APIs to make use of the new _modify accessor – this includes Dictionary's subscript(_:) which can now yield a uniquely referenced value to the caller (using the deinitialisation trick I mentioned above).
The upshot of these changes means that:
dict["key"]?.change() // No copy
will be able to perform an mutation of the value without having to make a copy in Swift 5 (you can even try this out for yourself with a master snapshot).

How to handle hash collisions for Dictionaries in Swift

TLDR
My custom structure implements the Hashable Protocol. However, when hash collisions occur while inserting keys in a Dictionary, they are not automatically handled. How do I overcome this problem?
Background
I had previously asked this question How to implement the Hashable Protocol in Swift for an Int array (a custom string struct). Later I added my own answer, which seemed to be working.
However, recently I have detected a subtle problem with hashValue collisions when using a Dictionary.
Most basic example
I have simplified the code down as far as I can to the following example.
Custom structure
struct MyStructure: Hashable {
var id: Int
init(id: Int) {
self.id = id
}
var hashValue: Int {
get {
// contrived to produce a hashValue collision for id=1 and id=2
if id == 1 {
return 2
}
return id
}
}
}
func ==(lhs: MyStructure, rhs: MyStructure) -> Bool {
return lhs.hashValue == rhs.hashValue
}
Note the global function to overload the equality operator (==) in order to conform to the Equatable Protocol, which is required by the Hashable Protocol.
Subtle Dictionary key problem
If I create a Dictionary with MyStructure as the key
var dictionary = [MyStructure : String]()
let ok = MyStructure(id: 0) // hashValue = 0
let collision1 = MyStructure(id: 1) // hashValue = 2
let collision2 = MyStructure(id: 2) // hashValue = 2
dictionary[ok] = "some text"
dictionary[collision1] = "other text"
dictionary[collision2] = "more text"
print(dictionary) // [MyStructure(id: 2): more text, MyStructure(id: 0): some text]
print(dictionary.count) // 2
the equal hash values cause the collision1 key to be overwritten by the collision2 key. There is no warning. If such a collision only happened once in a dictionary with 100 keys, then it could easily be missed. (It took me quite a while to notice this problem.)
Obvious problem with Dictionary literal
If I repeat this with a dictionary literal, though, the problem becomes much more obvious because a fatal error is thrown.
let ok = MyStructure(id: 0) // hashValue = 0
let collision1 = MyStructure(id: 1) // hashValue = 2
let collision2 = MyStructure(id: 2) // hashValue = 2
let dictionaryLiteral = [
ok : "some text",
collision1 : "other text",
collision2 : "more text"
]
// fatal error: Dictionary literal contains duplicate keys
Question
I was under the impression that it was not necessary for hashValue to always return a unique value. For example, Mattt Thompson says,
One of the most common misconceptions about implementing a custom hash
function comes from ... thinking that hash values must be distinct.
And the respected SO user #Gaffa says that one way to handle hash collisions is to
Consider hash codes to be non-unique, and use an equality comparer for
the actual data to determine uniqueness.
In my opinion, the question Do swift hashable protocol hash functions need to return unique values? has not been adequately answered at the time of this writing.
After reading the Swift Dictionary question How are hash collisions handled?, I assumed that Swift automatically handled hash collisions with Dictionary. But apparently that is not true if I am using a custom class or structure.
This comment makes me think the answer is in how the Equatable protocol is implemented, but I am not sure how I should change it.
func ==(lhs: MyStructure, rhs: MyStructure) -> Bool {
return lhs.hashValue == rhs.hashValue
}
Is this function called for every dictionary key lookup or only when there is a hash collision? (Update: see this question)
What should I do to determine uniqueness when (and only when) a hash collision occurs?
func ==(lhs: MyStructure, rhs: MyStructure) -> Bool {
return lhs.hashValue == rhs.hashValue
}
Note the global function to overload the equality operator (==) in order to conform to the Equatable Protocol, which is required by the Hashable Protocol.
Your problem is an incorrect equality implementation.
A hash table (such as a Swift Dictionary or Set) requires separate equality and hash implementations.
hash gets you close to the object you're looking for; equality gets you the exact object you're looking for.
Your code uses the same implementation for hash and equality, and this will guarantee a collision.
To fix the problem, implement equality to match exact object values (however your model defines equality). E.g.:
func ==(lhs: MyStructure, rhs: MyStructure) -> Bool {
return lhs.id == rhs.id
}
I think you have all the pieces of the puzzle you need -- you just need to put them together. You have a bunch of great sources.
Hash collisions are okay. If a hash collision occurs, objects will be checked for equality instead (only against the objects with matching hashes). For this reason, objects' Equatable conformance needs to be based on something other than hashValue, unless you are certain that hashes cannot collide.
This is the exact reason that objects that conform to Hashable must also conform to Equatable. Swift needs a more domain-specific comparison method for when hashing doesn't cut it.
In that same NSHipster article, you can see how Mattt implements isEqual: versus hash in his example Person class. Specifically, he has an isEqualToPerson: method that checks against other properties of a person (birthdate, full name) to determine equality.
- (BOOL)isEqualToPerson:(Person *)person {
if (!person) {
return NO;
}
BOOL haveEqualNames = (!self.name && !person.name) || [self.name isEqualToString:person.name];
BOOL haveEqualBirthdays = (!self.birthday && !person.birthday) || [self.birthday isEqualToDate:person.birthday];
return haveEqualNames && haveEqualBirthdays;
}
He does not use a hash value when checking for equality - he uses properties specific to his person class.
Likewise, Swift does not let you simply use a Hashable object as a dictionary key -- implicitly, by protocol inheritance -- keys must conform to Equatable as well. For standard library Swift types this has already been taken care of, but for your custom types and class, you must create your own == implementation. This is why Swift does not automatically handle dictionary collisions with custom types - you must implement Equatable yourself!
As a parting thought, Mattt also states that you can often just do an identity check to make sure your two objects are at different memory address, and thus different objects. In Swift, that would like like this:
if person1 === person2 {
// ...
}
There is no guarantee here that person1 and person2 have different properties, just that they occupy separate space in memory. Conversely, in the earlier isEqualToPerson: method, there is no guarantee that two people with the same names and birthdates are actually same people. Thus, you have to consider what makes sense for you specific object type. Again, another reason that Swift does not implement Equatable for you on custom types.
the equal hash values cause the collision1 key to be overwritten by
the collision2 key. There is no warning. If such a collision only
happened once in a dictionary with 100 keys, then it could easily be
missed.
Hash collision has nothing to do with it. (Hash collisions never affect the result, only the performance.) It is working exactly as documented.
Dictionary operations work on equality (==) of keys. Dictionaries do not contain duplicate keys (meaning keys that are equal). When you set a value with a key, it overwrites any entry containing an equal key. When you get an entry with a subscript, it finds a value with a key that is equal to, not necessarily the same as, the thing you gave. And so on.
collision1 and collision2 are equal (==), based on the way you defined the == operator. Therefore, setting an entry with key collision2 must overwrite any entry with key collision1.
P.S. The same exact thing applies with dictionaries in other languages. For example, in Cocoa, NSDictionary does not allow duplicate keys, i.e. keys that are isEqual:. In Java, Maps do not allow duplicate keys, i.e. keys that are .equals().
You can see my comments on this page's answers and this answer. I think all answers are still written in a VERY confusing way.
tl;dr 0) you don't need to write the implementation isEqual ie == between the hashValues. 1) Only provide/return hashValue. 2) just implement Equatable as you normally would
0) To conform to hashable you must have a computed value named hashValue and give it an appropriate value. Unlike equatable protocol, the comparison of hashValues is already there. You DON'T need to write:
func ==(lhs: MyStructure, rhs: MyStructure) -> Bool {
return lhs.hashValue == rhs.hashValue
// Snippet A
}
1) Then it uses the hashValue to check to see if for that hashValue's index (calculated by its modulo against the array's count) the key being looked exists or not. It looks within the array of key/value pairs of that index.
2) However as a fail safe ie in case there are matching hashes you fall back to the regular == func. (Logically you need it because of a fail safe. But you also you need it because Hashable protocol conforms to Equatable and therefore you must write an implementation for ==. Otherwise you would get a compiler error)
func == (lhs: MyStructure, rhs: MyStructure) -> Bool {
return lhs.id == rhs.id
//Snippet B
}
Conclusion:
You must include Snippet B, exclude Snippet A, while also having a hashValue property