Preventing duplicate NSArrays (one sorted, one unsorted)

I have an unordered NSArray of objects. It's updated from the internet, so I can't order it once and guarantee its order later.
Each item corresponds to a table view cell, but the user can reorder these. There are 2 sections in the table view, although to begin with all the cells are in one section.
So I could create a duplicate NSArray, keep that one ordered, and save it to disk.
But this seems quite a waste, and what do I do when an update over the internet adds an object to, or removes one from, the unordered NSArray?
So in sum:
How do I have one unordered NSArray and one ordered without duplication (and waste of memory)?
And how do I deal with updates to the unordered array, when the user hasn't set a location for new objects/cells?

Your question is not very clear:
Arrays are an "ordered collection" by definition (apart from the odd languages with "associative arrays").
"Ordering" is not the same as "sorting". [1, 2, 3] is sorted; [1, 3, 2] is not sorted, but it is ordered (assuming the usual comparator).
"Preventing duplicate NSArrays" is an unhelpful subject.
So I'll have to guess at what you're trying to say:
You have a list of things that you download from the internet.
You occasionally update the list of things downloaded.
The user can reorder the list.
You want to be able to update the list of things but preserve the user's ordering.
Well, first you need some way of finding out which items in the two lists are "equivalent". For example, the first list is [Apple, Banana, Orange] and the user puts it in juice-preference order [Orange, Apple, Banana]. If the second list is [apple, banana, orange] (because you decided that everything should be lowercase, and things that needed to be capitalized could be done with -[NSString capitalizedString], or whatever), you need a way of determining that Apple = apple and constructing a new list [orange, apple, banana].
It's not clear why you think you have to save the original list — yes, it means that you can just say "Apple and apple are both at index 0, so they're the same", but this also means that you can never change the default order, and you can never remove an item (you can replace it with a placeholder, but meh).
There are two easy solutions:
Keep a list of indices (e.g. you'd store [2, 0, 1] because Orange is at index 2, etc.). This is a bit of a pain. If you want to use NSArray, the easiest way is to wrap things in NSNumber.
Don't care about the original ordering. Let's say you have a "user-ordered" list [Orange, Apple, Banana] and a new "server-ordered" list [apple, grape, orange]. Iterate over the user-ordered list, picking out "equivalent" items from the server-ordered list to get [orange, apple] and [grape]. Then do something sane with the "new" items, like sticking them on the end to get [orange, apple, grape]. (In this example, Banana has been removed; we handle this case by simply not including it in the new list.)
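The second approach can be sketched as follows. This is a minimal illustration in JavaScript rather than Objective-C, and it assumes two items are "equivalent" when their names match ignoring case. The names `sameItem` and `mergeOrder` are made up for this sketch.

```javascript
// Hypothetical equivalence test: names match ignoring case.
function sameItem(a, b) {
  return a.toLowerCase() === b.toLowerCase();
}

// Keep the user's order for items the server still has, drop items the
// server no longer has, and append items the user has never seen.
function mergeOrder(userOrdered, serverOrdered) {
  const kept = [];
  for (const old of userOrdered) {
    const match = serverOrdered.find((item) => sameItem(item, old));
    if (match !== undefined) kept.push(match);
  }
  const added = serverOrdered.filter(
    (item) => !userOrdered.some((old) => sameItem(item, old))
  );
  return kept.concat(added);
}

const merged = mergeOrder(
  ["Orange", "Apple", "Banana"],
  ["apple", "grape", "orange"]
);
console.log(merged); // [ 'orange', 'apple', 'grape' ]
```

The nested scans are quadratic, which is fine for a table's worth of rows. For much longer lists, indexing the server list by a normalized key first would avoid the repeated searches.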

If they're both referencing the same objects (and not copies), then the extra memory will be very minimal (NSArray overhead and pointers), and probably not worth worrying about.
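The same holds in most languages with reference semantics. A small JavaScript sketch (not Objective-C, and the item shape is invented for illustration) shows that a sorted copy is only a second list of references, so a change made through one array is visible through the other.

```javascript
const items = [{ name: "Orange" }, { name: "Apple" }, { name: "Banana" }];

// slice() copies the array of references, not the objects themselves.
const sorted = items.slice().sort((a, b) => {
  if (a.name < b.name) return -1;
  return a.name > b.name ? 1 : 0;
});

// The first sorted entry and the second unsorted entry are one object.
const sharesObjects = sorted[0] === items[1];

// A change made through one array shows up through the other.
items[1].name = "Green Apple";
console.log(sharesObjects, sorted[0].name);
```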

Related

NSFetchedResultsController predicate to eliminate duplicates of several properties

I am using an NSFetchedResultsController in my UITableViewController.
Is it possible to specify a predicate that will not retrieve items which have duplicate values in some number of fields that I specify?
For example, I want to search all results for items but if the itemName AND itemDescription AND itemQuantity are the same, I want only one of these items.

Option 1
When the page loads, do a single run through the data and keep a list of the objectIDs that are duplicates. For each duplicate object, set the row height of its cell to 0, so it is technically still there but you can't see it. This makes dealing with the NSFetchedResultsControllerDelegate calls easy because no indexPaths have changed.
Option 2
If the dataset is always selected in the same way and an object that is a duplicate is always a duplicate, you can set an 'isDuplicate' flag on the object and filter it out in the predicate. Or you can avoid storing the duplicate in the first place. If objects are displayed in different sets and in different ways, and should sometimes be displayed and sometimes not, this is not a good solution.
Option 3
If you are sorting by the same criteria that make an object a duplicate (that is, duplicates always appear right next to a non-duplicate) and you are NOT using sections, then you can use sectionNameKeyPath, which groups items together into sections. Group the duplicates and non-duplicates together and then display every section as a single row (use the first item in each section). The indexPaths of the fetchedResultsController will not match the indexPaths of the table view, so you have to be careful to convert them.
Option 4
Instead of accessing the objects through a fetchedResultsController, do a fetch and filter the resulting array, then use that array to display the objects. The downside is that you don't get updates when objects change. This can be especially problematic if objects are deleted, as accessing a managedObject whose entity was deleted can lead to a crash.
I recommend option 1
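The single pass in option 1 might look like the following. This is a language-neutral sketch in JavaScript, not Core Data code: the `id` field stands in for an objectID, and `findDuplicateIds` is a hypothetical helper. It keeps the first item seen for each (itemName, itemDescription, itemQuantity) combination and reports the ids of the rest.

```javascript
function findDuplicateIds(items) {
  const seen = new Set();
  const duplicateIds = [];
  for (const item of items) {
    // Serializing the three fields as an array gives an unambiguous key.
    const key = JSON.stringify([
      item.itemName,
      item.itemDescription,
      item.itemQuantity,
    ]);
    if (seen.has(key)) {
      duplicateIds.push(item.id);
    } else {
      seen.add(key);
    }
  }
  return duplicateIds;
}

const rows = [
  { id: 1, itemName: "Bolt", itemDescription: "M6", itemQuantity: 10 },
  { id: 2, itemName: "Bolt", itemDescription: "M6", itemQuantity: 10 },
  { id: 3, itemName: "Bolt", itemDescription: "M8", itemQuantity: 10 },
];
const duplicateIds = findDuplicateIds(rows);
console.log(duplicateIds); // [ 2 ]
```

The table view would then return a height of 0 for any row whose id is in that list.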

Geofire TableView - CircleQuery Users for leaderboard [duplicate]

I'm trying to figure out how to query with filter with Geofire.
Suppose I have restaurants with different categories, and I want to add that category to my query. How do I go about this?
One way I have now is querying the keys with Geofire, running a for loop through each key to get the restaurant, and inserting the appropriate restaurants into the array.
This seems so inefficient. Is there any other way to go about this?
Ideally I will have the filtered results, and only load each item when they're about to be shown.
Cheers!

Firebase queries can only filter by one condition. Geofire already does quite some "magic" to allow it to filter on both longitude and latitude. Adding another property to that equation might be possible, but is well beyond what Geofire handles by default. See GeoFire: How to add extra conditions within the query?
If you only ever want to access one category at a time, you can put the restaurants in a top-level node per category and point Geofire to one category.
/category1
  item1
    g: "pns0h0mf2u"
    l: [-53.435719, 140.808716]
  item2
    g: "u417k3dwub"
    l: [56.83069, 1.94822]
/category2
  item3
    g: "8m3rz3s480"
    l: [30.902225, -166.66809]
/items
  item1: ...
  item2: ...
  item3: ...
In the above example, we have two categories: category1 with 2 items and category2 with just 1 item. For each item, we see the data that Geofire uses: a geohash (g) and the latitude and longitude (l). We also keep a single list with the other properties of these 3 items.
But more commonly, you simply do the extra filtering in client-side code. If you're worried about the performance of that: measure it, share the code, JSON data and measurements.
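Client-side filtering of this kind can be as small as the sketch below. It makes no Geofire or Firebase calls: `nearbyKeys` stands in for the keys a geo query returned, `itemsByKey` stands in for item data already loaded on the client, and the `category` field is an assumption about how the items are stored.

```javascript
// Stand-in for the keys a Geofire query reported as nearby.
const nearbyKeys = ["item1", "item2", "item3"];

// Stand-in for the /items node, already loaded on the client.
const itemsByKey = {
  item1: { name: "Noodle Bar", category: "asian" },
  item2: { name: "Trattoria", category: "italian" },
  item3: { name: "Dumpling House", category: "asian" },
};

// Keep only the keys whose item exists and has the wanted category.
function filterByCategory(keys, items, category) {
  return keys.filter(
    (key) => items[key] !== undefined && items[key].category === category
  );
}

const asianNearby = filterByCategory(nearbyKeys, itemsByKey, "asian");
console.log(asianNearby); // [ 'item1', 'item3' ]
```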

This is an old question, but I've seen it in a few places on the web, so I thought I might share one trick I've used.
The Problem
If you have a large collection in your database, maybe containing hundreds of thousands of keys, for example, it might not be feasible to grab them all. If you're trying to filter results based on location in addition to other criteria, you're stuck with something like:
Execute the location query
Loop through each returned geofire key and grab the corresponding data in the database
Check each returned piece of data to see if it matches the other criteria
Unfortunately, that's a lot of network requests, which is quite slow.
More concretely, let's say we want to get all users within e.g. 100 miles of a particular location that are male and between ages 20 and 25. If there are 10,000 users within 100 miles, that means 10,000 network requests to grab the user data and compare their gender and age.
The Workaround:
You can store the data you need for your comparisons in the geofire key itself, separated by a delimiter. Then, you can just split the keys returned by the geofire query to get access to the data. You still have to filter through them, but it's much faster than sending hundreds or thousands of requests.
For instance, you could use the format:
UserID*gender*age, which might look something like facebook:1234567*male*24. The important points are
Separate data points by a delimiter
Use a valid character for the delimiter: "It can include any unicode characters except for . $ # [ ] / and ASCII control characters 0-31 and 127."
Use a character that is not going to be found elsewhere in your database - I used *, but that might not work for you. Do not use any characters from -0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz, since those are fair-game for keys generated by firebase's push()
Choose a consistent order for the data - in this case, UserID first, then gender, then age.
You can store up to 768 bytes of data in firebase keys, which goes a long way.
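Building and splitting such keys takes only a few lines. The sketch below is plain JavaScript with no Firebase calls. `returnedKeys` stands in for the keys a geofire query would hand back, and `buildKey` and `parseKey` are hypothetical helper names.

```javascript
const DELIMITER = "*";

// Fixed field order: user id, then gender, then age.
function buildKey(userId, gender, age) {
  return [userId, gender, String(age)].join(DELIMITER);
}

function parseKey(key) {
  const [userId, gender, age] = key.split(DELIMITER);
  return { userId, gender, age: Number(age) };
}

// Stand-in for the keys returned by a location query.
const returnedKeys = [
  buildKey("facebook:1234567", "male", 24),
  buildKey("facebook:7654321", "female", 22),
  buildKey("facebook:1111111", "male", 31),
];

// Filter on the embedded data without any further network requests.
const matches = returnedKeys
  .map(parseKey)
  .filter((u) => u.gender === "male" && u.age >= 20 && u.age <= 25);

console.log(matches.map((u) => u.userId)); // [ 'facebook:1234567' ]
```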
Hope this helps!

Most efficient way to store nested categories (or hierarchical data) in Mongo?

We have nested categories for several products (e.g., Sports -> Basketball -> Men's, Sports -> Tennis -> Women's ) and are using Mongo instead of MySQL.
We know how to store nested categories in a SQL database like MySQL, but would appreciate any advice on what to do for Mongo. The operation we need to optimize for is quickly finding all products in one category or subcategory, which could be nested several layers below a root category (e.g., all products in the Men's Basketball category or all products in the Women's Tennis category).
This Mongo doc suggests one approach, but it says it doesn't work well when operations are needed for subtrees, which we need (since categories can reach multiple levels).
Any suggestions on the best way to efficiently store and search nested categories of arbitrary depth?

The first thing you want to decide is exactly what kind of tree you will use.
The big thing to consider is your data and access patterns. You have already stated that 90% of all your work will be querying and by the sounds of it (e-commerce) updates will only be run by administrators, most likely rarely.
So you want a schema that gives you the power of querying quickly on child through a path, i.e.: Sports -> Basketball -> Men's, Sports -> Tennis -> Women's, and doesn't really need to truly scale to updates.
As you so rightly pointed out MongoDB does have a good documentation page for this: https://docs.mongodb.com/manual/applications/data-models-tree-structures/ whereby 10gen actually state different models and schema methods for trees and describes the main ups and downs of them.
The one that should catch the eye if you are looking to query easily is materialised paths: https://docs.mongodb.com/manual/tutorial/model-tree-structures-with-materialized-paths/
This is a very interesting method to build up trees since to query on the example you gave above into "Womens" in "Tennis" you could simply do a pre-fixed regex (which can use the index: http://docs.mongodb.org/manual/reference/operator/regex/ ) like so:
db.products.find({category: /^Sports,Tennis,Womens[,]/})
to find all products listed under a certain path of your tree.
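To see which paths an anchored prefix regex accepts, here is a small sketch in plain JavaScript that runs against an in-memory array instead of a MongoDB collection. The product data and the `underPath` helper are invented for illustration. The helper accepts the category itself as well as anything nested below it, which is a choice made for this sketch rather than the behaviour of the query above.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Match the category itself or any descendant path.
function underPath(products, path) {
  const pattern = new RegExp("^" + escapeRegExp(path) + "(,|$)");
  return products.filter((p) => pattern.test(p.category));
}

const products = [
  { name: "Racket", category: "Sports,Tennis,Womens" },
  { name: "Skirt", category: "Sports,Tennis,Womens,Clothing" },
  { name: "Hoop", category: "Sports,Basketball,Mens" },
  { name: "Ball", category: "Sports,Tennis" },
];

const found = underPath(products, "Sports,Tennis,Womens").map((p) => p.name);
console.log(found); // [ 'Racket', 'Skirt' ]
```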
Unfortunately this model is really bad at updating: if you move a category or change its name you have to update all of its products, and there could be thousands of products under one category.
A better method would be to house a cat_id on the product and then separate the categories into a separate collection with the schema:
{
    _id: ObjectId(),
    name: 'Women\'s',
    path: 'Sports,Tennis,Womens',
    normed_name: 'all_special_chars_and_spaces_and_case_senstive_letters_taken_out_like_this'
}
So now your queries only involve the categories collection which should make them much smaller and more performant. The exception to this is when you delete a category, the products will still need touching.
So an example of changing "Tennis" to "Badmin":
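// find() returns a cursor; update() does not, so forEach() must follow find().
// With paths stored without a trailing comma, this regex matches only the
// descendants of Tennis; the Tennis category itself needs the same rename.
db.categories.find({path: /^Sports,Tennis[,]/}).forEach(function(doc){
    doc.path = doc.path.replace(/,Tennis/, ",Badmin");
    db.categories.updateOne({_id: doc._id}, {$set: {path: doc.path}});
});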
Unfortunately MongoDB provides no in-query document reflection at the moment so you do have to pull them out client side which is a little annoying, however hopefully it shouldn't result in too many categories being brought back.
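The client-side string rewrite can be tried outside the database. The sketch below is plain JavaScript with a hypothetical `renamePrefix` helper. It compares whole path prefixes, so a sibling such as "TennisShoes" or a "Tennis" under a different parent is left alone.

```javascript
// Rewrite a path only when it is the renamed category or lies below it.
function renamePrefix(path, oldPrefix, newPrefix) {
  if (path === oldPrefix || path.startsWith(oldPrefix + ",")) {
    return newPrefix + path.slice(oldPrefix.length);
  }
  return path;
}

const paths = [
  "Sports,Tennis",
  "Sports,Tennis,Womens",
  "Sports,TennisShoes",
  "Leisure,Tennis",
];

const renamed = paths.map((p) =>
  renamePrefix(p, "Sports,Tennis", "Sports,Badmin")
);
console.log(renamed);
// [ 'Sports,Badmin', 'Sports,Badmin,Womens', 'Sports,TennisShoes', 'Leisure,Tennis' ]
```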
And this is basically how it works really. It is a bit of a pain to update but the power of being able to query instantly on any path using an index is more fitting for your scenario I believe.
Of course the added benefit is that this schema is compatible with nested set models: http://en.wikipedia.org/wiki/Nested_set_model which I have found time and time again are just awesome for e-commerce sites, for example, Tennis might be under both "Sports" and "Leisure" and you want multiple paths depending on where the user came from.
The schema for materialised paths easily supports this by just adding another path, that simple.
Hope it makes sense, quite a long one there.

If all categories are distinct then think of them as tags. The hierarchy isn't necessary to encode in the items because you don't need it when you query for items. The hierarchy is a presentational thing. Tag each item with all the categories in its path, so "Sport > Baseball > Shoes" could be saved as {..., categories: ["sport", "baseball", "shoes"], ...}. If you want all items in the "Sport" category, search for {categories: "sport"}; if you want just the shoes, search for {categories: "shoes"}.
This doesn't capture the hierarchy, but if you think about it that doesn't matter. If the categories are distinct, the hierarchy doesn't help you when you query for items. There will be no other "baseball", so when you search for that you will only get things below the "baseball" level in the hierarchy.
My suggestion relies on categories being distinct, and I guess they aren't in your current model. However, there's no reason why you can't make them distinct. You've probably chosen to use the strings you display on the page as category names in the database. If you instead use symbolic names like "sport" or "womens_shoes" and use a lookup table to find the string to display on the page, you can easily make sure that they are distinct, because they don't have anything to do with what is displayed on the page. (This will also save you hours of work if the name of a category ever changes, and it will make translating the site easier if you ever need to do that.) So if you have two "Shoes" in the hierarchy (for example "Tennis > Women's > Shoes" and "Tennis > Men's > Shoes"), you can just add a qualifier to make them distinct (for example "womens_shoes" and "mens_shoes", or "tennis_womens_shoes"). The symbolic names are arbitrary and can be anything; you could even use numbers and just use the next number in the sequence every time you add a category.
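A rough in-memory sketch of the tag idea, in plain JavaScript rather than a MongoDB query. The catalog data, the `displayNames` lookup table and the `inCategory` helper are all invented for illustration.

```javascript
// Symbolic category names are stored on the items; display strings live
// in a separate lookup table.
const displayNames = {
  sport: "Sport",
  tennis: "Tennis",
  womens_shoes: "Shoes",
  mens_shoes: "Shoes",
};

const catalog = [
  { name: "Court Pro W", categories: ["sport", "tennis", "womens_shoes"] },
  { name: "Court Pro M", categories: ["sport", "tennis", "mens_shoes"] },
  { name: "Racket", categories: ["sport", "tennis"] },
];

// An item is "in" a category if that tag appears anywhere in its list.
function inCategory(items, tag) {
  return items.filter((item) => item.categories.includes(tag));
}

const tennisItems = inCategory(catalog, "tennis").map((i) => i.name);
const womensShoes = inCategory(catalog, "womens_shoes").map((i) => i.name);
console.log(tennisItems, womensShoes, displayNames["womens_shoes"]);
```

Both shoe categories display as "Shoes", yet the symbolic names keep them distinct in queries.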

Alphabetically sectioned UITableView, NSSortPredicate Vs Array of arrays?

I have an array of simple objects. I wish to display these in a sectioned table view sorted alphabetically, the first section being "A", the second being "B", etc. The data source of this table, i.e. the simple array, may be updated frequently (say every ten minutes).
I'm trying to figure out if it's better to have a two-dimensional array, with each sub-array corresponding to a letter of the alphabet and populating a section, or to use predicates to get the objects for each section and sort them alphabetically.
I'm leaning towards the multi-dimensional array approach, as it might be less resource intensive than doing a search and sort predicate computation for each section.

Look at the UILocalizedIndexedCollation class and its associated sample code. It is designed especially for this purpose.
You may be interested in this doc too.
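For comparison, the array-of-arrays idea from the question can be sketched in a few lines. This is plain JavaScript, not UILocalizedIndexedCollation, and it ignores locale-specific collation entirely. `buildSections` and the item shape are assumptions made for this sketch.

```javascript
// Case-insensitive comparison by name.
function byName(a, b) {
  const x = a.name.toLowerCase();
  const y = b.name.toLowerCase();
  if (x < y) return -1;
  return x > y ? 1 : 0;
}

// Group items by upper-cased first letter, then sort the sections and
// the rows inside each section.
function buildSections(items) {
  const byLetter = new Map();
  for (const item of items) {
    const letter = item.name.charAt(0).toUpperCase();
    if (!byLetter.has(letter)) byLetter.set(letter, []);
    byLetter.get(letter).push(item);
  }
  return [...byLetter.keys()].sort().map((letter) => ({
    title: letter,
    rows: byLetter.get(letter).sort(byName),
  }));
}

const sections = buildSections([
  { name: "banana" },
  { name: "Avocado" },
  { name: "apple" },
  { name: "Cherry" },
]);
console.log(sections.map((s) => s.title)); // [ 'A', 'B', 'C' ]
```

Rebuilding the sections once per data update costs one grouping pass plus the sorts, after which each table view callback is a plain index lookup.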

NSMutableSet vs NSMutableArray

What's the difference?
In my context, I need to be able to dynamically add to and remove objects. The user clicks on rows of a table that check on and off and thus add or remove the referenced object from the list.
A wild guess is that array has indexed items while set has no indexes?

An NSSet/NSMutableSet doesn't keep items in any particular order. An NSArray/NSMutableArray does store the items in a particular order. If you're building a table view, you should definitely use an array as your data source of choice.

Also, NSMutableSet makes sure that all objects are unique.
NSMutableArray goes well with UITableView since elements have index, so you can return [array count] to get number of table rows, or [array objectAtIndex:rowNumber] to easily associate element with row.

Also, according to the documentation, testing for object membership is faster with NSSets.
You can use sets as an alternative to arrays when the order of
elements isn’t important and performance in testing whether an object
is contained in the set is a consideration—while arrays are ordered,
testing for membership is slower than with sets.

Well, there are 3 main differences. Sets 1) are unordered, 2) don't have an index, and 3) cannot have duplicate values, whereas arrays are ordered, indexed, and can store duplicate values.
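For the check-on, check-off case in the question, the two collections can work together: an array for the rows in display order and a set for the current selection. The sketch below uses JavaScript's built-in Set as a stand-in for NSMutableSet. One difference to keep in mind is that a JavaScript Set iterates in insertion order, which NSSet does not promise. The row ids and the `toggle` helper are invented for illustration.

```javascript
// The array keeps the rows in display order.
const rowIds = ["row-1", "row-3", "row-7"];

// The set holds the currently checked rows; membership tests are cheap
// and adding the same id twice has no effect.
const selected = new Set();

function toggle(set, id) {
  if (set.has(id)) {
    set.delete(id);
  } else {
    set.add(id);
  }
}

toggle(selected, "row-3"); // check
toggle(selected, "row-7"); // check
toggle(selected, "row-3"); // uncheck

const checkedInOrder = rowIds.filter((id) => selected.has(id));
console.log(checkedInOrder); // [ 'row-7' ]
```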
