Here's an example of two different dictionaries that nevertheless return the same hash code. Why?
https://gist.github.com/837861
(They aren't the same object)
Hashes aren't guaranteed to be distinct for distinct objects. In fact, hash collisions will happen. The only two properties the -hash method is supposed to guarantee are (both taken from the documentation):
If two objects are equal (as determined by the isEqual: method), they must have the same hash value.
If a mutable object is added to a collection that uses hash values to determine the object’s position in the collection, the value returned by the hash method of the object must not change while the object is in the collection.
If you look here, you can see that the hash implementation on dictionaries simply returns the count and is likely the reason why you're getting the same code:
https://stackoverflow.com/a/11984624/59198
Recently, in MATLAB version R2022b, they announced the introduction of dictionaries.
I was under the impression that dictionaries were already available, provided by containers.Map. Are dictionaries just a different name mapped to containers.Map? Or are there other differences? I was unable to find anything comparing them online.
From what I can gather, after reading this blog post and the comments under it, and the documentation (I haven’t yet had a chance to experiment with them, so feel free to correct me if I’m wrong):
dictionary is an actual primitive type, like double, cell or struct. containers.Map is a “custom class”; even if nowadays the code is built-in, the functionality can never be as integrated as for a primitive type. Consequently, dictionary is significantly faster.
dictionary uses normal value semantics. If you make a copy, you have two independent dictionaries (note MATLAB’s lazy copy mechanism). containers.Map is a handle class, meaning that all copies point to the same data, so modifying one copy modifies them all.
containers.Map can use char arrays (the old string format) or numbers as keys (string is implicitly converted to char when used as key). dictionary can use any type, as long as it overloads keyhash. This means you can use your own custom class objects as keys.
dictionary is vectorized: you can look up multiple values at once. With a containers.Map you can look up multiple values using the values function, but not with the normal lookup syntax.
dictionary has actual O(1) lookup. If I remember correctly, containers.Map doesn’t.*
containers.Map can store any array as value, dictionary stores only scalars. The scalar can be a cell, which can contain any array, but this leads to awkward semantics, since retrieving the value retrieves the cell, not its contents.
* No, it is also O(1), at least in R2022b.
While reading the book Advanced Swift, I got confused by this explanation in the chapter 'Hashable Requirement':
two instances that are equal (as defined by your == implementation) must have the same hash value. The reverse isn’t true: two instances with the same hash value don’t necessarily compare equally.
How should I understand the 'reverse' situation? Why don't two instances with the same hash value necessarily compare equal?
Think of the hash value as a quick, compact, non-unique identifier for a given object instance. The only hard condition is this: if two objects compare equal according to the == operator, then both instances must have the exact same hash value. That’s all there is to it ;)
In particular, hash values aren’t unique (and how could they be, given Int's limited range?), so we can’t safely assume that two instances with the same hash value will compare equal.
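A minimal Swift sketch of the 'reverse' case, assuming a hypothetical Point type (not from the book) whose hash deliberately uses only one of its two fields. Equal points always hash alike, but two unequal points can share a hash value too:

```swift
// Hypothetical type for illustration: == compares both fields,
// but hash(into:) feeds only `x` into the hasher.
struct Point: Hashable {
    let x: Int
    let y: Int

    static func == (lhs: Point, rhs: Point) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(x) // `y` is ignored, so points sharing an x collide
    }
}

let a = Point(x: 1, y: 2)
let b = Point(x: 1, y: 2)
let c = Point(x: 1, y: 3)

// Equal instances must hash alike.
print(a == b, a.hashValue == b.hashValue)
// Unequal instances may still share a hash value.
print(a == c, a.hashValue == c.hashValue)

// Set falls back to == when hashes collide,
// so both distinct points are kept.
let points: Set<Point> = [a, b, c]
print(points.count)
```

This is why a hash match is only ever a hint: Set and Dictionary use the hash to narrow the search and then confirm with ==.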
Does Swift have an ordered set type? And if not, what are my options if I want to use one?
The standard library's Set is unordered, as is made clear in the documentation:
Arrays are ordered collections of values. Sets are unordered collections of unique values. Dictionaries are unordered collections of key-value associations.
However, many data structures suitable for implementing ordered sets (and dictionaries) are known, in particular balanced binary trees such as Red-Black trees.
As an example of this, C++'s STL has ordered sets and maps, and allows range queries on them using lower and upper bounds.
I know that a set's members can be sorted into an array, but I am after a data structure with O(log(n)) insertion, removal and query.
Swift does not have a native ordered set type. If you use Foundation, you can use NSOrderedSet in Swift. If not, you have the opportunity to write your own ordered set data structure.
Update: Swift Package Manager includes an OrderedSet implementation that may be useful. It wraps both an array and a set and manages access to get ordered set behavior.
Update #2: Apple's Swift Collections repository contains an ordered set implementation.
On April 6th, 2021, a new Swift package was released: Swift Collections, which implements three more data structures (OrderedSet, OrderedDictionary, and Deque).
However, this package is in its pre-1.0 release state. As a result, it might not be stable.
Swift blog: Release of Swift Collections
For the time being there is no ordered set in Swift. Besides using NSOrderedSet, which is available on all Apple platforms, you can simply combine a Set with an Array to get basically the same effect. The Set is used to avoid duplicate entries; the Array is used to store the order. Before adding a new element to the Array, check whether it is already in the Set. When removing an element from the Array, also remove it from the Set. To check whether an element exists, ask the Set, because it's faster. To retrieve an element by index, use the Array. You can change the order of elements in the Array (e.g. re-sort it) without having to touch the Set at all. To iterate over all elements, use the Array, as that way the elements are iterated in order.
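The Set-plus-Array combination described above could be sketched roughly as follows. SimpleOrderedSet and its method names are hypothetical, chosen for illustration; this is not the API of NSOrderedSet or of Swift Collections:

```swift
// Hypothetical minimal ordered set: a Set for membership, an Array for order.
struct SimpleOrderedSet<Element: Hashable> {
    private var members = Set<Element>()
    private(set) var elements = [Element]()

    init() {}

    var count: Int { return elements.count }

    // Membership checks go to the Set.
    func contains(_ element: Element) -> Bool {
        return members.contains(element)
    }

    // Appends the element only if it is not already present.
    @discardableResult
    mutating func append(_ element: Element) -> Bool {
        guard members.insert(element).inserted else { return false }
        elements.append(element)
        return true
    }

    // Removes the element from both containers; the Array search is O(n).
    @discardableResult
    mutating func remove(_ element: Element) -> Bool {
        guard members.remove(element) != nil else { return false }
        if let index = elements.firstIndex(of: element) {
            elements.remove(at: index)
        }
        return true
    }

    // Index-based access goes to the Array.
    subscript(index: Int) -> Element {
        return elements[index]
    }
}

var tags = SimpleOrderedSet<String>()
tags.append("swift")
tags.append("hash")
tags.append("swift") // duplicate, ignored
tags.remove("hash")
tags.append("set")
print(tags.elements)
```

Note that this keeps insertion order rather than sorted order, and removal is O(n) because of the Array search, so it does not meet the O(log(n)) requirement from the question; a balanced tree would be needed for that.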
I am setting up a managed object from data I am getting off the web. Before I add this new object to the managedObjectContext, I want to check whether it's already in the database. Is there a way to compare two managed objects in one hit, or do I have to compare each attribute individually to work out whether they are identical or differ somewhere?
Simple Example:
Entity:Pet (Created but not inserted into database)
Attribute, Name: Brian
Attribute, Type: Cat
Attribute, Age: 12
Entity:Pet (Currently in database)
Attribute, Name: Brian
Attribute, Type: Cat
Attribute, Age: 7
In this example, can I compare [Brian, Cat, 12] with [Brian, Cat, 7] directly, or do I need to go through each attribute one by one to ascertain a full match?
Unique identifiers are often used to search for objects by only having to match the one field. As you note, matching on multiple fields could be annoying and inefficient, but it's perhaps not as bad as you think: you can construct an NSPredicate to quite easily match all the required fields on objects in Core Data.
Use of NSPredicate aside: suppose you just want to match one field. If you don't have a suitable unique identifier in the data as provided, you could derive one. The obvious way is to construct a hash code for everything you store, based on each field you want to match on. Then, when you wish to check whether an 'incoming' object is already in Core Data, compute the hash code for the new object and look for an object in Core Data with that same hash code. (Note: if you find an existing object with the same hash code, you might want to then compare all the fields to check that it really does represent the same object; there's a tiny chance it is a 'different' object, a.k.a. a hash collision.)
A very naive hash code implementation for an object X would be something like:
hashcode(X) = hashcode(X.name) + hashcode(X.type) + hashcode(X.age)
To see a more realistic example of writing a hashcode function, see the accepted answer here.
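As a rough sketch of the derived-key idea in Swift: the code below swaps the naive sum for a 64-bit FNV-1a hash over the fields, because a sum cannot tell fields apart and because Swift's built-in hashValue is seeded per process and so is not suitable for storing. PetRecord, stableHash and matchKey are hypothetical names for illustration, and PetRecord is a plain struct standing in for the managed object:

```swift
// 64-bit FNV-1a over the UTF-8 bytes of each field.
// Unlike hashValue, the result is the same on every launch.
func stableHash(_ fields: [String]) -> UInt64 {
    var hash: UInt64 = 0xcbf29ce484222325 // FNV offset basis
    let prime: UInt64 = 0x100000001b3     // FNV prime
    for field in fields {
        for byte in field.utf8 {
            hash ^= UInt64(byte)
            hash = hash &* prime // wrapping multiply
        }
        // Mix in a separator so ["ab", "c"] and ["a", "bc"] differ.
        hash ^= 0x1f
        hash = hash &* prime
    }
    return hash
}

// Hypothetical plain value mirroring the Pet entity's attributes.
struct PetRecord {
    let name: String
    let type: String
    let age: Int

    // Key to store alongside the object and to search for later.
    var matchKey: UInt64 {
        return stableHash([name, type, String(age)])
    }
}

let incoming = PetRecord(name: "Brian", type: "Cat", age: 12)
let stored = PetRecord(name: "Brian", type: "Cat", age: 7)
print(incoming.matchKey == stored.matchKey)
```

A matching key is still only a candidate match, so the field-by-field comparison mentioned above remains the final check.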
By the way, I'm assuming that you don't want to load all your objects from Core Data into memory at once. If, however, that is acceptable (suppose you have quite a limited number of items), an alternative is to implement isEqual: and hash on your class, and then use regular Foundation class methods like NSArray indexOfObject: (or, even better, NSDictionary objectForKey:) to locate objects of interest.
When adding data to a collection, which is the better practice, Dictionary or ArrayList? What is the performance impact of each, and why?
You should actually not use ArrayList at all, as you have the strongly typed List<T> to use.
Which you use depends on how you need to access the data. The List stores a sequential list of items, while a Dictionary stores items identified by a key. (You can still read the items from the Dictionary sequentially, but the order is not preserved.)
The performance is pretty much the same: both use arrays internally to store the actual data. When they reach their capacity, they allocate a new, larger array and copy the data to it. If you know how large the collection will get, you should specify the capacity when you create it, so that it doesn't have to resize itself.
They are not interchangeable classes; it's apples and oranges. If you intend to look up items in the collection by a key, use Dictionary. Otherwise, use ArrayList (or preferably List<T>).
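The question is about .NET, but the capacity advice carries over to other languages. Here is a minimal sketch in Swift, used only to keep the examples on this page in one language; expectedCount and the "item" keys are made up for illustration:

```swift
// Assumed size, known up front.
let expectedCount = 1_000

// Sequential collection with storage reserved in advance.
var list = [Int]()
list.reserveCapacity(expectedCount)

// Keyed collection with storage reserved in advance.
var lookup = [String: Int](minimumCapacity: expectedCount)

for i in 0..<expectedCount {
    list.append(i)          // no reallocation needed while filling
    lookup["item\(i)"] = i
}

// Access by position versus access by key.
let third = list[2]
let byKey = lookup["item2"]
print(third, byKey ?? -1)
```

Which container to pick still depends on the access pattern: by position for the list, by key for the dictionary.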