Does Swift have an ordered set type? And if not, what are my options if I want to use one?
The standard library's Set is unordered, as is made clear in the documentation:
Arrays are ordered collections of values. Sets are unordered collections of unique values. Dictionaries are unordered collections of key-value associations.
However, many data structures suitable for implementing ordered sets (and dictionaries) are known, in particular balanced binary trees such as Red-Black trees.
As an example of this, C++'s STL has ordered sets and maps, and allows range queries on them using lower and upper bounds.
I know that a set's members can be sorted into an array, but I am after a data structure with O(log(n)) insertion, removal and query.
Swift does not have a native ordered set type. If you use Foundation, you can use NSOrderedSet in Swift. If not, you have the opportunity to write your own ordered set data structure.
Update: Swift Package Manager includes an OrderedSet implementation that may be useful. It wraps both an array and a set and manages access to get ordered set behavior.
Update #2: Apple's Swift Collections repository contains an ordered set implementation.
On April 6th, 2021, a new Swift package was released: Swift Collections, which implements three more data structures (OrderedSet, OrderedDictionary, Deque).
However, at the time of writing, this package is in its pre-1.0 release state. As a result, it might not be stable.
Swift blog: Release of Swift Collections
For the time being, there is no ordered set in the Swift standard library. Besides using NSOrderedSet, which is available on all Apple platforms, you can simply combine a Set with an Array to get essentially the same effect. The Set is used to avoid duplicate entries; the Array is used to store the order. Before adding a new element to the Array, check whether it is already in the Set. When removing an element from the Array, also remove it from the Set. To check whether an element exists, ask the Set, since that is faster. To retrieve an element by index, use the Array. You can change the order of elements in the Array (e.g. re-sort it) without having to touch the Set at all. To iterate over all elements, use the Array, so that the elements are visited in order.
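The Set-plus-Array approach described above could be sketched roughly as follows. This is only a minimal illustration: the name SimpleOrderedSet and its methods are made up for this example and are not part of the standard library or any package.

```swift
// A minimal insertion-ordered set backed by an Array (order) and a Set (membership).
// The type and method names are illustrative, not a standard-library API.
struct SimpleOrderedSet<Element: Hashable> {
    private(set) var elements: [Element] = []   // preserves the order
    private var members: Set<Element> = []      // fast membership checks

    var count: Int { return elements.count }

    // Ask the Set: O(1) on average.
    func contains(_ element: Element) -> Bool {
        return members.contains(element)
    }

    // Ask the Array for positional access.
    subscript(index: Int) -> Element {
        return elements[index]
    }

    // Appends the element only if it is not already present.
    @discardableResult
    mutating func append(_ element: Element) -> Bool {
        guard members.insert(element).inserted else { return false }
        elements.append(element)
        return true
    }

    // Removes the element from both collections; the Array removal is O(n).
    @discardableResult
    mutating func remove(_ element: Element) -> Bool {
        guard members.remove(element) != nil else { return false }
        if let index = elements.firstIndex(of: element) {
            elements.remove(at: index)
        }
        return true
    }

    // Reorders the Array only; the Set does not need to change.
    mutating func sort(by areInIncreasingOrder: (Element, Element) -> Bool) {
        elements.sort(by: areInIncreasingOrder)
    }
}

var tags = SimpleOrderedSet<String>()
tags.append("swift")
tags.append("core-data")
tags.append("swift")      // ignored: already present
print(tags.elements)      // ["swift", "core-data"]
```

Note that this keeps insertion order and has O(n) removal, so it does not give the O(log(n)) sorted-set behaviour asked about in the question; a balanced tree would be needed for that.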
Recently, in MATLAB R2022b, MathWorks announced the introduction of dictionaries.
I was under the impression that dictionaries were already available, provided by containers.Map. Are dictionaries just a different name mapped to containers.Map? Or are there other differences? I was unable to find anything comparing them online.
From what I can gather, after reading this blog post and the comments under it, and the documentation (I haven’t yet had a chance to experiment with them, so feel free to correct me if I’m wrong):
dictionary is an actual primitive type, like double, cell or struct. containers.Map is a “custom class”: even if the code is built-in nowadays, the functionality can never be as integrated as for a primitive type. Consequently, dictionary is significantly faster.
dictionary uses normal value semantics. If you make a copy you have two independent dictionaries (note MATLAB’s lazy copy mechanism). containers.Map is a handle class, meaning that all copies point to the same data, modifying one copy modifies them all.
containers.Map can use char arrays (the old string format) or numbers as keys (string is implicitly converted to char when used as key). dictionary can use any type, as long as it overloads keyhash. This means you can use your own custom class objects as keys.
dictionary is vectorized: you can look up multiple values at once. With a containers.Map you can look up multiple values using the values function, but not with the normal lookup syntax.
dictionary has actual O(1) lookup. If I remember correctly, containers.Map doesn’t.*
containers.Map can store any array as value, dictionary stores only scalars. The scalar can be a cell, which can contain any array, but this leads to awkward semantics, since retrieving the value retrieves the cell, not its contents.
* No, it is also O(1), at least in R2022b.
I am setting up a managed object from data I am getting off the web. Before I add this new object to the managedObjectContext, I want to check whether it's already in the database. Is there a way to compare two managed objects in one hit, or do I have to compare each attribute individually to work out whether they are identical or differ somewhere?
Simple Example:
Entity:Pet (Created but not inserted into database)
Attribute, Name: Brian
Attribute, Type: Cat
Attribute, Age: 12
Entity:Pet (Currently in database)
Attribute, Name: Brian
Attribute, Type: Cat
Attribute, Age: 7
In this example can I compare [Brian, Cat, 12] with [Brian, Cat, 7] or do I need to go through each attribute one by one to ascertain a full match?
Unique identifiers are often used to search for objects by only having to match the one field. As you note, matching on multiple fields could be annoying and inefficient, but it's perhaps not as bad as you think: you can construct an NSPredicate to quite easily match all the required fields on objects in Core Data.
Use of NSPredicate aside: suppose you just want to match one field. If you don't have a suitable unique identifier in the data as provided, you could derive one. The obvious way is to construct a hash code for everything you store, based on each field you want to match on. Then, when you wish to check whether an 'incoming' object is already in Core Data, compute the hash code for the new object and look for an object in Core Data with that same hash code. (Note: if you find an existing object with the same hash code, you might want to then compare all the fields to check that it really does represent the same object -- there's a tiny chance it might be a 'different' object, a.k.a. a hash collision.)
A very naive hash code implementation for an object X would be something like:
hashcode(X) = hashcode(X.name) + hashcode(X.type) + hashcode(X.age)
To see a more realistic example of writing a hashcode function, see the accepted answer here.
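As a rough sketch of the derived-hash idea, here is one way a stable hash code could be computed in Swift. Everything here is an assumption made for illustration: PetRecord is a hypothetical plain struct rather than a Core Data class, and FNV-1a is just one simple deterministic hash, swapped in for the naive sum above. A deterministic function matters because Swift's built-in Hasher is seeded randomly per process, so its values should not be persisted.

```swift
// Hypothetical record mirroring the Pet example; not a Core Data class.
struct PetRecord {
    let name: String
    let type: String
    let age: Int
}

// 64-bit FNV-1a: a simple hash that gives the same result on every run.
func fnv1a<S: Sequence>(_ bytes: S) -> UInt64 where S.Element == UInt8 {
    var hash: UInt64 = 0xcbf29ce484222325      // FNV offset basis
    for byte in bytes {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3           // FNV prime, wrapping multiply
    }
    return hash
}

// Combines all fields that should take part in the match.
func stableHash(of pet: PetRecord) -> UInt64 {
    // Join with a separator unlikely to appear in the data, so that
    // ("ab", "c") and ("a", "bc") do not produce the same input.
    let key = [pet.name, pet.type, String(pet.age)].joined(separator: "\u{1F}")
    return fnv1a(key.utf8)
}

let incoming = PetRecord(name: "Brian", type: "Cat", age: 12)
let incomingHash = stableHash(of: incoming)
// incomingHash could now be compared against a stored hash attribute.
```

If the value is stored in a signed 64-bit attribute, Int64(bitPattern: incomingHash) converts it without losing bits. A match on the hash should still be followed by a field-by-field comparison, as noted above.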
By the way, I'm assuming that you don't want to load all your objects from Core Data into memory at once. If, however, that is acceptable (suppose you have quite a limited number of items), an alternative is to implement isEqual and hash on your class, and then use regular Foundation class methods like NSArray indexOfObject: (or, even better, NSDictionary objectForKey:) to locate objects of interest.
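For the in-memory variant, a Swift analogue might look like the sketch below. It uses Swift's Hashable on a hypothetical plain value type (Pet here is made up for the example) instead of overriding isEqual and hash on an NSObject subclass, so it illustrates the idea rather than the exact Foundation calls named above.

```swift
// Hypothetical in-memory model; the synthesized equality and hashing
// cover every stored property (name, type and age).
struct Pet: Hashable {
    let name: String
    let type: String
    let age: Int
}

// Stand-in for the objects already loaded from the store.
let stored: Set<Pet> = [
    Pet(name: "Brian", type: "Cat", age: 7),
    Pet(name: "Rex", type: "Dog", age: 3),
]

let incoming = Pet(name: "Brian", type: "Cat", age: 12)
let alreadyStored = stored.contains(incoming)
print(alreadyStored) // false, because the ages differ
```

As far as I know, Apple's documentation advises against overriding isEqual: and hash on NSManagedObject subclasses, which is one reason this sketch compares plain values copied out of the managed objects instead.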
I have a set of objects that the user can sort arbitrarily. I would like to make my client remember the sorting of the set of objects so that when the user visits the page again the ordering he/she chose will be preserved. However, the client-side framework should also be able to quickly look up the objects from whatever array/hashmap they are stored in based upon the ordering. What is the most efficient way of doing this?
The best way I have found for doing this is using an array that stores the IDs of the objects in the particular order I want. From there, I can look up the objects themselves by converting the array of objects to a hashmap keyed by ID using Underscore.js.
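The same pattern, an array of IDs for the order plus a hashmap for lookup, could be sketched like this. The sketch is in Swift rather than JavaScript with Underscore.js, and Item and its fields are hypothetical names used only for illustration.

```swift
// Hypothetical item type; `id` is assumed to be unique.
struct Item {
    let id: Int
    let title: String
}

let items = [
    Item(id: 1, title: "A"),
    Item(id: 2, title: "B"),
    Item(id: 3, title: "C"),
]

// Build the lookup table once: id -> item.
// This initializer traps if two items share an id.
let itemsByID = Dictionary(uniqueKeysWithValues: items.map { ($0.id, $0) })

// The user's chosen order, persisted as a plain array of IDs.
let savedOrder = [3, 1, 2]

// Rebuild the ordered list; IDs that no longer exist are skipped.
let ordered = savedOrder.compactMap { itemsByID[$0] }
print(ordered.map { $0.title }) // ["C", "A", "B"]
```

Only the array of IDs needs to be saved between visits; the lookup table can be rebuilt from the objects each time.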
I have an NSMutableArray which has about 18 objects. They are in a specific order that I want. I have to add these objects into an NSSet to be saved in Core Data.
But once I pull them out of the NSSet using [myObject.items allObjects], they are no longer in the order in which I added them. How can I keep the order that I want? I don't want to have to re-sort them.
Sets do not have sort order because sets have no concept of order. Arrays have a fixed order because the order of elements in an array is what defines an array. A set by contrast is defined by the uniqueness of each individual object in a set. An array can have many duplicates of the same element at different indexes but a set can never store the same object twice.
Core Data uses sets because it needs to know exactly which objects relate to one another. In most cases, any ordering e.g. alphabetical, numerical etc is needed only by the UI and plays no part in the actual modeling of the real-world objects, events or conditions the data model simulates. For example, if you had a model of Department<-->>Employee, all employees belonging to a particular department would comprise a set. You might need to sort employees by name, date of birth, hire date etc but that would be just for display and would have nothing to do with the relationship.
If you need to model any kind of arbitrary order, you need to add an attribute that holds a value you can sort on as needed. This is not overhead because the order becomes part of the real-world objects, events or conditions simulated by the model.
If you don't want to sort them you can add a property (e.g. int indexInList) to the objects which stands for the position in the list.
But sorting the list by a property of the objects would be very easy too, with NSMutableArray methods such as:
- (void)sortUsingSelector:(SEL)comparator
- (void)sortUsingDescriptors:(NSArray *)sortDescriptors
...
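To illustrate the index-property idea, here is a small Swift sketch. It uses a Swift Set in place of NSSet, and ListItem is a hypothetical type; indexInList is the property name suggested above.

```swift
// Hypothetical element type carrying its own position.
struct ListItem: Hashable {
    let name: String
    var indexInList: Int
}

let original = ["first", "second", "third"]

// Record each element's position before putting it into an unordered Set.
let stored = Set(original.enumerated().map {
    ListItem(name: $0.element, indexInList: $0.offset)
})

// Later: restore the order by sorting on the stored index.
let restored = stored
    .sorted { $0.indexInList < $1.indexInList }
    .map { $0.name }
print(restored) // ["first", "second", "third"]
```

The Set itself still has no order; the order is recovered only at the point where the elements are sorted on indexInList.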
When adding data to a collection, which is the better practice to use, and what is the performance impact of using Dictionary vs. ArrayList, and why?
You should actually not use ArrayList at all, as you have the strongly typed List<T> to use.
Which you use depends on how you need to access the data. The List stores a sequential list of items, while a Dictionary stores items identified by a key. (You can still read the items from the Dictionary sequentially, but the order is not preserved.)
The performance is pretty much the same, as both use arrays internally to store the actual data. When they reach their capacity, they allocate a new, larger array and copy the data to it. If you know how large the collection will get, you should specify the capacity when you create it, so that it doesn't have to resize itself.
They are not interchangeable classes. Apples and oranges. If you intend to look up items in the collection by a key, use Dictionary. Otherwise, use ArrayList (or preferably List<T>)