What is the most practical solution to data management using SQLite on the iPhone?

I'm developing an iPhone application and am new to Objective-C as well as SQLite. That being said, I have been struggling with designing a practical data management solution that is worthy of existing. Any help would be greatly appreciated.
Here's the deal:
The majority of the data my application interacts with is stored in five tables in the local SQLite database. Each table has a corresponding class which handles initialization, hydration, dehydration, deletion, etc. for each object/row in the corresponding table. When the application loads, it populates five NSMutableArrays (one for each type of object). In addition to a primary key, each object instance always has an ID attribute available, regardless of hydration state. In most cases it is a UUID which I can then easily reference.
Until a few days ago, I would simply access the objects via these arrays by tracking down their UUIDs, and I would then hydrate/dehydrate them as needed. However, some of the objects also maintain their own arrays which reference other objects' UUIDs. In the event that I must track down one of these "child" objects via its UUID, it becomes a bit more difficult.
In order to avoid having to enumerate through one of the previously mentioned arrays to find a "parent" object's UUID, and then proceed to find the "child's" UUID, I added a DataController with a singleton instance to simplify the process.
I had hoped that the DataController could provide a single access point to the local database and make things easier, but I'm not so certain that is the case. Basically, what I did is create multiple NSMutableDictionaries. Whenever the DataController is initialized, it enumerates through each of the previously mentioned NSMutableArrays maintained in the Application Delegate and creates a key/value pair in the corresponding dictionary, using the given object as the value and its UUID as the key.
The DataController then exposes procedures that allow a client to call in with a desired object's UUID and retrieve a reference to the actual object. Whenever there is a request for an object, the DataController automatically hydrates it before returning it. I did this to take control of hydration out of the client's hands, to prevent dehydrating an object that is being referenced from multiple places.
I realize that in most cases I could just make a mutable copy of the object and, if necessary, replace the original object down the road, but I wanted to avoid that scenario if at all possible. I therefore added an additional dictionary to monitor which objects are hydrated at any given time, using the object's UUID as the key and a fluctuating count, representing the number of hydrations without an offsetting dehydration, as the value. My goal with this approach was to have the DataController automatically dehydrate any object once its "hydration retainment count" hit zero. However, this could easily lead to significant memory leaks, as it relies on the caller to later invoke a procedure that decreases the count. There are many cases where this is not obvious, or not even easily accomplished, and if just one calling object fails to do so properly, I end up with the exact opposite of the scenario I was trying to prevent in the first place. Ironic, huh?
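To make that concrete, here is a minimal Objective-C sketch of the checkout/checkin pair I described (objectsByUUID, hydrationCounts, and relinquishObjectForUUID: are my own names for the pieces above):

// Checkout: returns the object and bumps its hydration retainment count.
- (id)objectForUUID:(NSString *)uuid {
    id object = [self.objectsByUUID objectForKey:uuid];
    NSInteger count = [[self.hydrationCounts objectForKey:uuid] integerValue];
    if (count == 0) {
        [object hydrate]; // load the full row from SQLite
    }
    [self.hydrationCounts setObject:[NSNumber numberWithInteger:count + 1] forKey:uuid];
    return object;
}

// Checkin: the step callers keep forgetting, which is the whole problem.
- (void)relinquishObjectForUUID:(NSString *)uuid {
    NSInteger count = [[self.hydrationCounts objectForKey:uuid] integerValue];
    if (count <= 1) {
        [[self.objectsByUUID objectForKey:uuid] dehydrate]; // write back, drop row data
        [self.hydrationCounts removeObjectForKey:uuid];
    } else {
        [self.hydrationCounts setObject:[NSNumber numberWithInteger:count - 1] forKey:uuid];
    }
}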
Anyway, I'm thinking that if I proceed with this approach it will just end badly. I'm tempted to go back to the original plan, but doing so makes me cringe, and I'm sure there is a more elegant solution floating around out there. As I said before, any advice would be greatly appreciated. Thanks in advance.

I'd also be aware (as I'm sure you are) that CoreData is just around the corner, and make sure you make the right choice for the future.

Have you considered implementing this via the NSCoder interface? Not sure that it wouldn't be more trouble than it's worth, but if what you want is to extract all the data out into an in-memory object graph, and save it back later, that might be appropriate. If you're actually using SQL queries to limit the amount of in-memory data, then obviously, this wouldn't be the way to do it.
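For example, a minimal sketch of NSCoding conformance on one of your model classes (the uuid and childUUIDs property names are placeholders):

// Each model class encodes its fields and its child UUID list.
- (void)encodeWithCoder:(NSCoder *)coder {
    [coder encodeObject:self.uuid forKey:@"uuid"];
    [coder encodeObject:self.childUUIDs forKey:@"childUUIDs"];
}

- (id)initWithCoder:(NSCoder *)coder {
    if ((self = [super init])) {
        self.uuid = [coder decodeObjectForKey:@"uuid"];
        self.childUUIDs = [coder decodeObjectForKey:@"childUUIDs"];
    }
    return self;
}

// Persist the whole graph in one shot, and read it back on launch:
[NSKeyedArchiver archiveRootObject:allObjects toFile:archivePath];
NSArray *restored = [NSKeyedUnarchiver unarchiveObjectWithFile:archivePath];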

I decided to go with Core Data after all.

Related

Memory usage: global variables

I am writing an app that requires access to a set of information from all classes and methods.
My query is, what's the most efficient way of doing this, so as to minimise memory usage?
I have taught myself coding and have, of course, come across numerous different methods to solve this whilst trawling the internet. For example:
I could create a global variable such as var info = ..., which I'd place above a class definition, thus giving access from anywhere in the app.
Or I could create a singleton, GameData and have the data stored there, e.g. GameData.shared.info, similarly available from anywhere in the app.
Or I could load the information in the first ViewController and then pass it around as a parameter.
There are no doubt more methods that I haven't come across, but I wonder which is the most memory-efficient method, if indeed there is such a thing. In my case, I won't need access to a huge amount of data - no more than say sixty or seventy records each with half a dozen fields of text or numbers.
Many thanks
Using dependency injection, that is, passing the data as a parameter to any object that requires it, would be the more memory-conscious method, since the variable will stop existing "automatically" if you were to replace the whole hierarchy (as long as it's done correctly).
By using a singleton or a global variable, you would have to clear the value yourself.
If the value is not going to disappear during the lifetime of the application, then memory usage doesn't matter, but I would still advise against global variables and suggest using dependency injection.
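A minimal Swift sketch of the injection approach (the Record type and view controller are made up for illustration):

import UIKit

// Stand-in for "sixty or seventy records each with half a dozen fields".
struct Record {
    let name: String
    let score: Int
}

final class ResultsViewController: UIViewController {
    // The data is handed in by whoever creates the controller,
    // instead of being read from a global or a singleton.
    private let records: [Record]

    init(records: [Record]) {
        self.records = records
        super.init(nibName: nil, bundle: nil)
    }

    required init?(coder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }
}

// At the call site:
// let resultsVC = ResultsViewController(records: loadedRecords)
// navigationController?.pushViewController(resultsVC, animated: true)

When the last object holding the records is released, the data goes with it; nothing lingers for the app's lifetime the way a global or singleton would.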

MongoDB Collection Versioning

Are there any best practices or mechanisms we can use to version a collection, or the objects in a collection, in MongoDB?
The requirement for versioning a collection arises because the objects in the collection may gain new attributes going forward, while the already-added (i.e. old) objects will not have these attributes or values for them. So on retrieval, we need to make sure the code does not break when deserializing the different versions of the same object from the collection.
I can think of adding a version attribute to the objects explicitly, but are there any better built-in alternatives in MongoDB for handling this versioning of objects and/or collections?
Thanks,
Bathiya
I guess the best approach would be to update all objects in a batch process when you start using the new software on the server; otherwise you'll never know when an object will be updated, and you'll need to keep the old versions around forever.
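For example, a one-off migration in the mongo shell might look like this (the collection and attribute names are hypothetical):

// Stamp every pre-existing document with a version and a default
// value for the newly introduced attribute.
db.items.update(
    { schemaVersion: { $exists: false } },
    { $set: { schemaVersion: 2, newAttribute: null } },
    { multi: true }
)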
Another thing I've been doing, and it has worked so far, is to have a policy of only allowing new properties to be added to an object. This way, in the worst case, the DB won't have all the data, but that is fine with all the JSON serializers I know. It does mean you aren't allowed to delete or rename properties, or modify their type (from scalar value to object or array; from object to scalar or array; and so on).
Since I usually want to store additional information rather than less, this seems like a good solution without any real limitation for me.
If you need to change a type because a scalar value isn't enough anymore, you can still write some code which transforms every object that has the old value into the new form. If a bulk update from your side is able to perform the change, I'd still do it that way, but sometimes it needs user input.
For instance, if you used to save passwords as an md5 hash only, that was a scalar value. But then someone tells you they should be stored as a salted SHA-512 hash instead; now you need an object field for the password. You could call it password_sha512 and store the salt and the hashed password inside it.
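A sketch of that change (field names are made up; the re-hash itself has to happen in application code at login time, since only then is the plaintext available):

// Old document shape: { _id: ..., password: "<md5 hex>" }
// New shape, written the next time the user logs in:
db.users.update(
    { _id: userId },
    {
        $set: { password_sha512: { salt: theSalt, hash: theHash } },
        $unset: { password: "" }
    }
)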

iOS: using GCD with Core Data

At the heart of it, my app will ask the user for a bunch of numbers and store them via Core Data; my app is then responsible for showing the user the average of all these numbers.
So what I figure I should do is this: after the user inputs a new number, I fire up a new thread, fetch all the objects with an NSFetchRequest executed against my NSManagedObjectContext, do the proper calculations, and then update the UI on the main thread.
I'm aware that the rule for concurrency in Core Data is one thread per NSManagedObjectContext instance, so what I want to know is: do you think I can do what I just described without having my app explode 5 months down the line? I just don't think it's necessary to instantiate a whole new context just to do some measly calculations...
Based on what you have described, why not just store the numbers as they are entered into a CoreData model and also into an NSMutableArray? It seems as though you are storing these for future retrieval in case someone needs to look at (and maybe modify) a previous calculation. Under that scenario, there is no need to do a fetch after a current set of numbers is entered. Just use a mutable array and populate it with all the numbers for the current calculation. As a number is entered, save it to the model AND to the array. When the user is ready to see the average, do the math on the numbers in the already populated array. If the user wants to modify a previous calculation, retrieve those numbers into an array and work from there.
Bottom line is that you shouldn't need to work with multiple threads and merging contexts unless you are populating a model from a large data set (like the initial seeding of a phonebook, etc.). Modifying a context and calling save on it is a very fast operation for a change as small as you are describing.
I would say you may want to do some testing, especially with regard to the size of the data set. If it is pretty small, the SQLite calls are pretty fast, so you may get away with doing it on the main queue. But if it is going to take some time, then it would be wise to get it off the main thread.
Apple introduced the concept of parent and child managed object contexts in 2011 to make using contexts on different threads easier. You may want to check out the WWDC videos on Core Data.
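A sketch of that setup, assuming a main-queue context called mainContext and an entity Entry with a numeric value attribute (both names are placeholders for your model's):

// Private-queue child context: fetch and compute off the main thread.
NSManagedObjectContext *background =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
background.parentContext = mainContext;

[background performBlock:^{
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Entry"];
    NSError *error = nil;
    NSArray *entries = [background executeFetchRequest:request error:&error];
    double average = [[entries valueForKeyPath:@"@avg.value"] doubleValue];

    dispatch_async(dispatch_get_main_queue(), ^{
        // update the UI with the computed average
    });
}];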
You can use NSExpression with your fetch to get really high-performance functions like min, max, average, etc. There are examples on SO, and here is a good link:
http://useyourloaf.com/blog/2012/01/19/core-data-queries-using-expressions.html
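For example, computing the average in the store without loading the objects at all (the Entry entity and value attribute are placeholders):

NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Entry"];
request.resultType = NSDictionaryResultType;

NSExpressionDescription *averageDescription = [[NSExpressionDescription alloc] init];
averageDescription.name = @"averageValue";
averageDescription.expression =
    [NSExpression expressionForFunction:@"average:"
                              arguments:@[[NSExpression expressionForKeyPath:@"value"]]];
averageDescription.expressionResultType = NSDoubleAttributeType;
request.propertiesToFetch = @[averageDescription];

NSError *error = nil;
NSArray *results = [context executeFetchRequest:request error:&error];
NSNumber *average = [[results lastObject] objectForKey:@"averageValue"];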
Good luck!

CoreData or Individual Files (iOS)

I've got a little conundrum: would it be better to use direct file management, or a CoreData SQLite database?
Here's my scenario:
I have a bunch of 'user' objects, each with a list of 'post' objects. This is easily done in CoreData, and would be great; however, the 'post' objects are downloaded from a web server, and they each have a unique identifier. I don't want to end up with multiple 'post' objects with the same ID. I could solve this by caching CoreData responses in an NSDictionary, but that would not fit well with the design pattern of the application. As far as I am aware, when adding a new 'post' to my CoreData NSManagedObjectContext, I would have to look up the unique ID to check for its existence (fast), then add it if it does not exist (slow), or update the previous one if it does (fast), effectively replacing it. How would you guys handle this?
I've been trying to think of alternatives for a few days now, but no matter which way I look at it, CoreData is going to be slower than my alternative:
A file architecture inside the Caches/ directory of an iOS application could solve the problem. Something like this:
Users/
    {unique ID}.user
    {unique ID}.user
Posts/
    {unique ID}.post
    {unique ID}.post
Then, when retrieving a post or user object, I can check the files for the existence of the data and cache the file contents in an NSDictionary. If the ID exists in the dictionary, I retrieve it from there instead. Replacing previous 'user' and 'post' objects is as simple as overwriting the file and updating the cache.
My second alternative would clearly be faster; however, I would not be taking advantage of any of the efficiencies built into CoreData, and I would have to provide my own memory management scheme to clear my cached dictionaries when a memory warning occurs.
Is there some way of 'uniquing' in CoreData? That would solve my problem. Something similar to using a primary key in an ordinary SQLite database.
I'll start doing tests to verify speeds of both methods, but I thought I'd post this up here before starting in case anyone has any better solutions.
This exact question comes up a lot.
You can check whether a value exists in Core Data without reading in the entire object. Just set the fetch to retrieve only the specific property you want to test (the ID in this case) and have it return the results as dictionaries. Provide a predicate that looks for one or more IDs, and if the returned array has values, you know you have existing objects.
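A sketch of that check, using placeholder Post / postID names and an incomingIDs array of identifiers from the server:

NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Post"];
request.resultType = NSDictionaryResultType;
request.propertiesToFetch = @[@"postID"];
request.predicate = [NSPredicate predicateWithFormat:@"postID IN %@", incomingIDs];

NSError *error = nil;
NSArray *matches = [context executeFetchRequest:request error:&error];
NSSet *existingIDs = [NSSet setWithArray:[matches valueForKey:@"postID"]];
// Insert the incoming posts whose IDs are not in existingIDs;
// update the ones whose IDs are.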
It's very rare that you can end up with a custom system which is faster and more robust than Core Data. It's rarely worth even trying.
Remember as well that premature optimization is the root of all evil. All this work is predicated on the premise that the simplest Core Data implementation is too slow. Have you actually tested that it is too slow? If not, do so before you try more elaborate designs.
After testing, I've found that CoreData at least halves the time taken. The test I ran was as follows: I added 1000 posts to an empty CoreData object graph, and then retrieved 100 of those objects for updating. The time taken to add the objects was 0.069 s, and the time taken to retrieve them was 0.181 s, measured on an iPad 3G. Using files, adding the objects took 10 times longer, and retrieving them took 4 times longer.
My recommendation: Stick to using CoreData!

How to keep track of objects deleted from an ObservableCollection in CRUD scenarios?

In our multi-tier business application we have ObservableCollections of Self-Tracking Entities that are returned from service calls.
The idea is we want to be able to get entities, add, update and remove them from the collection client side, and then send these changes to the server side, where they will be persisted to the database.
Self-Tracking Entities, as their name might suggest, track their state themselves.
When a new STE is created, it has the Added state; when you modify a property, it sets the Modified state. It can also have a Deleted state, but this state is not set when the entity is removed from an ObservableCollection (obviously). If you want this behavior, you need to code it yourself.
In my current implementation, when an entity is removed from the ObservableCollection, I keep it in a shadow collection, so that when the ObservableCollection is sent back to the server, I can send the deleted items along, so Entity Framework knows to delete them.
Something along the lines of:
protected IDictionary<int, IList> DeletedCollections = new Dictionary<int, IList>();

protected void SubscribeDeletionHandler<TEntity>(ObservableCollection<TEntity> collection)
{
    var deletedEntities = new List<TEntity>();
    DeletedCollections[collection.GetHashCode()] = deletedEntities;

    collection.CollectionChanged += (o, a) =>
    {
        if (a.OldItems != null)
        {
            deletedEntities.AddRange(a.OldItems.Cast<TEntity>());
        }
    };
}
Now if the user decides to save his changes to the server, I can get the list of removed items, and send them along:
ObservableCollection<Customer> customers = MyServiceProxy.GetCustomers();
customers.RemoveAt(0);
MyServiceProxy.UpdateCustomers(customers);
At this point the UpdateCustomers method will check the shadow collection to see whether any items were removed, and send them along to the server side.
This approach works fine, until you start to think about the life-cycle of these shadow collections. Basically, when the ObservableCollection is garbage collected, there is no way of knowing that we need to remove the shadow collection from our dictionary.
I came up with a somewhat complicated solution that basically does manual memory management in this case: I keep a WeakReference to the ObservableCollection, and every few seconds I check whether the reference is dead, in which case I remove the shadow collection.
But this seems like a terrible solution... I hope the collective genius of StackOverflow can shed light on a better solution.
EDIT:
In the end I decided to go with subclassing the ObservableCollection. The service proxy code is generated so it was a relatively simple task to change it to return my derived type.
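For reference, a minimal sketch of such a subclass (the names are mine, not the generated proxy's; uses System.Collections.ObjectModel and System.Collections.Specialized):

public class TrackingObservableCollection<TEntity> : ObservableCollection<TEntity>
{
    private readonly List<TEntity> _deletedItems = new List<TEntity>();

    // The shadow list now lives and dies with the collection itself,
    // so there is no external dictionary to clean up.
    public IList<TEntity> DeletedItems { get { return _deletedItems; } }

    protected override void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
    {
        if (e.OldItems != null)
        {
            foreach (TEntity item in e.OldItems)
            {
                _deletedItems.Add(item);
            }
        }
        base.OnCollectionChanged(e);
    }
}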
Thanks for all the help!
Instead of rolling your own "weak reference + poll Is it Dead, Is it Alive" logic, you could use the HttpRuntime.Cache (available from all project types, not just web projects).
Add each shadow collection to the Cache, either with a generous timeout, or a delegate that can check if the original collection is still alive (or both).
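Something along these lines (the key scheme and timeout are placeholders):

// System.Web / System.Web.Caching; works outside web projects too.
HttpRuntime.Cache.Insert(
    "deleted-" + collectionKey,          // e.g. a Guid allocated per collection
    deletedEntities,
    null,                                // no cache dependency
    Cache.NoAbsoluteExpiration,
    TimeSpan.FromMinutes(30),            // generous sliding timeout
    CacheItemPriority.Normal,
    (key, value, reason) => { /* check liveness / log the eviction */ });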
It isn't dreadfully different from your own solution, but it does use tried and trusted .NET components.
Other than that, you're looking at extending ObservableCollection and using that new class instead (which I'd imagine is no small change), or changing/wrapping the UpdateCustomers method to remove the shadow collection from DeletedCollections.
Sorry I can't think of anything else, but hope this helps.
BW
If replacing ObservableCollection is a possibility (e.g. if you are using a common factory for all the collection instances), then you could subclass ObservableCollection and add a finalizer which cleans up the deleted items that belong to the collection.
Another alternative is to change the way you compute which items are deleted. You could maintain the original collection and give the client a shallow copy. When the collection comes back, you can compare the two to see which items are no longer present. If the collections are sorted, the comparison can be done in linear time in the size of the collection. If they're not sorted, the modified collection's values can be put in a hash table and used to look up each value in the original collection. If the entities have a natural ID, then using that as the key is a safe way of determining which items are not present in the returned collection, that is, which have been deleted. This also runs in linear time.
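A sketch of the ID-based diff (assuming the entity exposes a natural Id; uses System.Linq):

// 'original' is the snapshot you kept; 'returned' is what the client sent back.
var returnedIds = new HashSet<int>(returned.Select(c => c.Id));
var deletedItems = original.Where(c => !returnedIds.Contains(c.Id)).ToList();
// deletedItems now holds every entity removed on the client side.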
Otherwise, your original solution doesn't sound that bad. In Java, a WeakReference can register a callback that gets called when the reference is cleared. There is no similar feature in .NET, but using polling is a close approximation. I don't think this approach is so bad, and if it's working, then why change it?
As an aside, aren't you concerned about GetHashCode() returning the same value for distinct collections? Using a weak reference to the collection as the key might be more appropriate; then there is no chance of a collision.
I think you're on a good path; I'd consider refactoring in this situation. My experience is that in 99% of cases the garbage collector makes memory management awesome: almost no real work needed.
But in the 1% of cases, it takes someone to realize that they've got to up the ante and go "old school" by firming up their caching/memory management in those areas. Hats off to you for realizing you're in that situation and for trying to avoid the IDisposable/WeakReference tricks. I think you'll really help the next guy who works on your code.
As for getting to a solution, I think you've got a good grip on the situation:
- be clear when your objects need to be created
- be clear when your objects need to be destroyed
- be clear when your objects need to be pushed to the server
Good luck! Tell us how it goes :)