Optimal way to persist an object graph to flash on the iPhone - iphone

I have an object graph in Objective-C on the iPhone platform that I wish to persist to flash when closing the app. The graph has about 100k-200k objects and contains many loops (by design). I need to be able to read/write this graph as quickly as possible.
So far I have tried using NSCoder. This not only struggles with the loops but also takes an age and a significant amount of memory to persist the graph - possibly because an XML document is used under the covers. I have also used an SQLite database but stepping through that many rows also takes a significant amount of time.
I have considered using Core-Data but fear I will suffer the same issues as SQLite or NSCoder as I believe the backing stores to core-data will work in the same way.
So is there any other way I can handle the persistence of this object graph in a lightweight way - ideally I'd like something like Java's serialization? I've been thinking of trying Tokyo Cabinet or writing the memory occupied by bunch of C structs out to disk - but that's going to be a lot of rewrite work.

I would reccomend re-writing as c structs. I know it will be a pain, but not only will it be quick to write to disk but should perform much better.
Before anyone gets upset, I am not saying people should always use structs, but there are some situations where this is actually better for performance. Especially if you pre-allocate your memory in say 20k contiguous blocks at a time (with pointers into the block), rather than creating/allocating lots of little chunks within a repeated loop.
ie if your loop continually allocates objects, that is going to slow it down. If you have preallocated 1000 structs and just have an array of pointers (or a single pointer) then this is a large magnitude faster.
(I have had situations where even my desktop mac was too slow and did not have enough memory to cope with those millions of objects being created in a row)

Rather than rolling your own, I'd highly recommend taking another look at Core Data. Core Data was designed from the ground up for persisting object graphs. An NSCoder-based archive, like the one you describe, requires you to have the entire object graph in memory and all writes are atomic. Core Data brings objects in and out of memory as needed, and can only write the part of your graph that has changed to disk (via SQLite).
If you read the Core Data Programming Guide or their tutorial guide, you can see that they've put a lot of thought into performance optimizations. If you follow Apple's recommendations (which can seem counterintuitive, like their suggestion to denormalize your data structures at some points), you can squeeze a lot more performance out of your data model than you'd expect. I've seen benchmarks where Core Data handily beat hand-tuned SQLite for data access within databases of the size you're looking at.
On the iPhone, you also have some memory advantages when using controlling the batch size of fetches and a very nice helper class in NSFetchedResultsController.
It shouldn't take that long to build up a proof-of-principle Core Data implementation of your graph to compare it to your existing data storage methods.

Related

Core Data NSFetchedResultsController Performance Advantages Over NSArray?

Does using an NSFetchedResultsController provide any performance advantages on the iPhone over an NSArray?
I have between 4,000 and 8,000 records stored in core data and wanted to know if I should select one over the other. Is the NSFetchedResultsController just used to make code 'prettier'?
My concern is searching, and lag on keyboard button presses (as well as the issue of loading that many records into memory). Thanks!
Given your parameters, Core Data will be faster than an array especially if you make any changes to the data.
The disadvantage of an array in this case is that you have to load the entire array into memory in one go.
It might seem obvious that Core Data will be slower than more primitive methods but owing to fine tuned optimization plus the ease of integration with the rest of the API it is actually fairly hard to beat Core Data in real world apps with significant amounts of data.

Should I use Core Data or SQLite to represent a graph structure with weighted edges?

I am building an iPhone application that will be using a weighted graph (probably around 1000 nodes and 500-700 relationships, ball park estimate, could be quite wrong).
Core data would be wonderful since it's easy to work with and is optimized for the iPhone. At the same time, the graph nodes need to be weighted. I could add a level of indirection (edit/clarification: I would represent the weighted edges as relationships.) in core data but feel that might defeat the point of the increase in performance.
SQLite would enable me to do it without any extra indirection the problem becomes performance.
Another consideration is that I use core data to store the stuff and then store only the weights in sqlite in tables and build custom core data objects that when I fetch or create creates the necessary links in sqlite and then just use the sqlite for calculation (since the calculation would only be traversing and looking at weights, so I don't need anything but the final result/id of the object). I then would look up the actually object in core data.
Also, there will be a server that is taking in consideration all of the users graphs into a much larger graph. So it might be easier to use sqlite straight up since replicating the information on the beastly server would be easier :-)
Thanks for the help.
At first glance, Core Data is wonderful for your use case because its an object graph management framework (with persistence). It's lazy loading and caching features allow you to load a partial graph in memory, which is usually desired if you have a large graph.
Core Data doesn't seem to support weighted relationship out of box (I'm not an Core Data expert at all). If "a level of indirection" means Core Data objects representing relationships, I agree it will be a performance hit.
For performance and portability considerations, straight SQLite may be a better choice because you have complete control over the schemas and indices.
I vote for your idea of "store only the weights in sqlite". In my understanding, this allows you to take advantage of Core Data features while keeping performance critical tasks efficient.
The author of NetNewsWire has written a blog article explaining why he has switched away from Core Data (mostly because performance and flexibility reasons) which may be helpful for you. http://inessential.com/2010/02/26/on_switching_away_from_core_data

What's the best way to store static data in an iOS app?

I have in my app a considerable amount of data that it needs to access, but will never be changed by the app. Currently I'm using this data in other applications in JSON files and SQL databases, but neither seems very straightforward to use in iOS.
I don't want to use CoreData, which provides tons of unnecessary functionality and complexity.
Would it be a good idea store the data in PropertyList file and build an accessor class? Are there any simple ways to incorporate SQLite without going the CoreData route?
You can only use plist if the amount of data is relatively small. Plist are entirely loaded into memory so you can only really use them if you can sustain all the objects created by the plist in memory at once for as long as you need them.
Core Data has a learning curve but in use it is usually less complex than SQL. In most cases the "simpler" SQL leads to more coding because you end up having to duplicate much of the functionality of Core Data to shoehorn the procedural SQL into the object-oriented API. You have to manually manage the memory use of all the data by tracking retention. You've write a lot of SQL code every time you want data. I've updated several apps from SQL to Core Data and in all cases the Core Data implementation was smaller and cleaner than the SQL.
Neither is the memory or processor "overhead" any larger. Core Data is highly optimized. In most cases, off the shelf Core Data is more efficient than hand tuned SQL. One minor sub optimization in SQL usually destroys any theoretical advantage it might have.
Of course, if you're already highly skilled at managing SQL in C then you personally might get the app to market more quickly by using SQL. However, if you're wondering what you should plan to use in general on on Apple Platforms, Core Data is almost always the answer and you should take the time to learn it.
You can just use SQLite directly without the overhead of Core Data using the SQLite C API.
Here is a tutorial I found on your use-case - simply loading some data from an SQLite database. Hope this helps.
Depending on the type of your data, the size and how often it changes, you may desire to just keep things simple and use a property list. Otherwise, using SQLite (documented in Jergason's answer) would be where I'd go. Though let me say that if you have a relatively small (less than a couple hundred) set of basic types (arrays, dictionaries, numbers, strings) that don't change frequently, then a property list will be a better choice in my opinion.
As an example to that, in one of my games, I create the levels from a single property list per difficulty. Since there are only a handful of levels per difficulty (99) and a small set of parameters for each (number of elements in play, their initial positions, mass, etc) then it makes sense, and I avoid having to deal with SQLite directly or worse yet, setting up and maintaining CoreData.
What do you mean by "best"? What kind of data?
If it's a bunch of objects, then JSON or (binary) plist aren't terrible formats, since you'll want the whole thing loaded in memory to walk the object graph. Compare space efficiency and loading performance to pick which one to use.
If it's a bunch of binary blobs, then store the blobs in a big file, memory-map the file (NSDataReadingMapped a.k.a. NSMappedRead), and use indexes into the blobs. iOS frameworks use a mixture of these (e.g. there are a lot of .pngs, but also "other.artwork" which just contains raw image data).
You can also use NSKeyedArchiver and friends if your classes implement the NSCoding protocol, but there's some object graph management overhead and the plist format it produces isn't exactly nice to work with.

Decide which caching startegy to use?

I want to cache my loaded data so that I can reduce my application start time .
I know several strategies to store application data
i.e. core data, nsuserdefaults , archiving .
Now my scenario is that suppose that I have array of maximum 10 objects each object having 5 fields .
So I can not decide which strategy to store this array an later retrieving the same .
Thanks .
Never store cache data in NSUserDefaults; that is not what it is for.
Archiving is expensive and should not be used. It is also far more difficult to manage.
Core Data is almost always the right answer unless the data storage is trivial.
Update
Archiving, also known as serializing is one of the most expensive ways to write data to disk compared to other formats. The exact details are difficult to explain in an answer here but it boils down to an old design that does not perform nearly as well as the more modern persistence systems such as Core Data. Putting the two side by side you will see significant performance increases (due to internal threading, caching, database support on the backend, etc.) when using Core Data.
The fact that Core Data also handles your data model life cycle and structure is just gravy on top of that performance increase.

Obj-C circular buffer object, implementing one?

I've been developing for the iPhone for quite some time and I've been wondering if there's any array object that uses circular buffer in Obj-C? Like Java's Stack or List or Queue.
I've been tinkering with the NSMutableArray, testing it's limits... and it seems that after 50k simple objects inside the array - the application is significantly slowed down.
So, is there any better solution other than the NSMutableArray (which becomes very slow with huge amounts of data). If not, can anyone tell me about a way to create such an object (would that involve using chain (node) objects??).
Bottom line: Populating a UITableView from an SQLite DB directly would be smart? As it won't require memory from an array or anything, but just the queries. And SQLite is fast and not memory grinding.
Thank you very much for you time and attention,
~ Natanavra.
From what I've been thinking it seems that going for Quinn's class is the best option possibly.
I have another question - would it be faster or smarter to load everything straight from the SQLite DB instead of creating an object and pushing it into an array?
Thank you in advance,
~ Natanavra.
Apologies for tooting my own horn, but I implemented a C-based circular buffer in CHDataStructures. (Specifically, check out CHCircularBufferQueue and CHCircularBufferStack.) The project is open source and has benchmarks which demonstrate that a true circular buffer is quite fast when compared to NSMutableArray in the general case, but results will depend on your data and usage, as well as the fact that you're operating on a memory-constrained device (e.g. iPhone). Hope that helps!
If you're seeing performance issues, measure where your app is spending its time, don't just guess. Apple provides an excellent set of performance measurement tools.
It's trivial to have NSMutable array act like a stack, list, queue etc using the various insertObject:atIndex: and removeObjectAtIndex: methods. You can write your own subclasses if you want to hardwire the behavior.
I doubt the performance problems you are seeing are being caused by NSMutableArray especially if your point of reference is the much, much slower Java. The problem is most likely the iPhone itself. As noted previously, 50,000 objective-c objects is not a trivial amount of data in this context and the iPhone hardware may struggle to managed that much data.
If you need some kind of high performance array for bytes, you could use one of the core foundation arrays or roll your own in plain C and then wrap them in a custom class.
It sounds to me like you need to switch to core data so you don't have to keep all this in memory. Core data will efficiently fetch what you want only when you need it.
You can use STL classes in "Objective-C++" - which is a fancy name for Objective-C making use of C++ classes. Just name those source files that use C++ code with a ".mm" extension and you'll get the mixed runtime.
Objective-C objects are not really "simple," so 50,000 of them is going to be pretty demanding. Write your own in straight C or C++ if you want to avoid the bottlenecks and resource demands of the Objective-C runtime.
A rather lengthy and non-theoretical discussion of the overhead associated with convenience:
http://www.cocoabuilder.com/archive/cocoa/35145-nsarray-overhead-question.html#35128
And some simple math for simple people:
All it takes to make an object as opposed to a struct is a single pointer at the beginning.
Let's say that's true, and let's say we're running on a 32-bit system with 4 byte pointers.
4 bytes x 50,000 objects = 200000 bytes
That's nearly 200MB worth of extra memory that your data suddenly needs just because you used Objective-C. Now compound that with the fact that whatever NSArray you add those objects to is going to double that by keeping its own set of pointers to those objects and you've just chewed up 400MB of RAM just so you could use a couple of convenience wrappers.
Refresh my memory here... Are swap files on hard drives as fast as RAM? How much RAM is there in an iPhone? How many function calls and stack frames does it take to send an object a message? Why isn't IOKit written in Objective-C? How many of Apple's flagship applications that do a lot of DSP use AppKit? Anybody got a copy of otool they can check with? I'm seeing zero here.