Efficient storage of large amounts of data in iOS - iphone

I'm building an application which has a "record" feature which records user interaction over time. As time progresses, I fill an array in memory with "state" objects representing the current state of the user input. A typical recording will result in about 5k of these objects.
I then archive this data using NSKeyedArchiver's archiveRootObject:toFile:. This works fine, but the resulting file is very large (about 3.5 MB). My question is this:
Is there any inherent file-size overhead involved in archiving files? Would I be able to save this data using much less disk space if I were to use SQLite, or even roll my own file format? Or is the only way to reduce the disk size of the data going to be to reduce the bit depth of the numbers I'm storing?

If your concern is performance, Core Data gives you more granularity. You can lazy load and save by parts during app execution vs loading/saving the whole 3.5 MB object graph.
If your concern is file size, compare the binary plist format (which NSKeyedArchiver writes by default) with the SQLite file format. But more important than the overhead is how complex the translation between your object graph and the Core Data model would be.
You may also be interested in this comparison of speed and size for several serialization formats: https://github.com/eishay/jvm-serializers/wiki/ Not sure if everything there has a C, C++ or Objective-C implementation.
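To get a feel for the file-size point, here is a rough, language-neutral sketch in Python (standard library only) that writes the same 5,000 records once as a binary plist and once as fixed-width packed records. The record fields (t, x, y, pressure) are made up for illustration, and a real NSKeyedArchiver archive carries extra bookkeeping (class names, an object table) on top of a plain plist, so treat the numbers as a lower bound on the difference rather than a measurement of your app.

```python
import plistlib
import struct

# Hypothetical "state" record: a timestamp plus three input values.
def make_states(count):
    return [
        {"t": i * 0.016, "x": i * 0.5, "y": i * 0.25, "pressure": (i % 100) / 100.0}
        for i in range(count)
    ]

# One 64-bit double for the timestamp and three 32-bit floats,
# little-endian, no padding: 20 bytes per record.
RECORD = struct.Struct("<dfff")

def pack_states(states):
    return b"".join(
        RECORD.pack(s["t"], s["x"], s["y"], s["pressure"]) for s in states
    )

def unpack_states(blob):
    return [
        {"t": t, "x": x, "y": y, "pressure": p}
        for t, x, y, p in RECORD.iter_unpack(blob)
    ]

states = make_states(5000)
plist_bytes = plistlib.dumps(states, fmt=plistlib.FMT_BINARY)
packed_bytes = pack_states(states)

print("binary plist:  ", len(plist_bytes), "bytes")
print("packed records:", len(packed_bytes), "bytes")
```

Storing x, y and pressure as 32-bit floats is the "reduce the bit depth" option from the question; whether that precision is enough depends on your data.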

3.5 MB isn't a very large file. However, if your app has to load or save a 3.5 MB file all the time, then using Core Data is a lot smarter as this allows you to save only the data that has changed and retrieve only the parts that you're interested in -- not the whole thing every time.

If storage is the main concern, there would be little difference between SQLite and Core Data.
I had to store UIViewControllers with their state in an app. I ended up not saving the serialized objects; instead I saved only the most specific properties and wrote a class that read that data back and re-created the objects.
The property map was then stored in a CSV file (admittedly very difficult to manage, but very small) and then compressed.
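A minimal sketch of that idea, in Python for brevity: write only the chosen properties as CSV, then compress the result. The column names and the sample rows are hypothetical, not the ones from my app, and zlib stands in for whatever compressor you have available.

```python
import csv
import io
import zlib

# Hypothetical property map: one row per saved view controller.
FIELDS = ["class_name", "title", "scroll_offset", "selected_index"]

def encode_property_map(rows):
    """Return (compressed bytes, uncompressed size) for the given rows."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    raw = buffer.getvalue().encode("utf-8")
    return zlib.compress(raw, 9), len(raw)

def decode_property_map(blob):
    """Inverse of encode_property_map; every value comes back as a string."""
    text = zlib.decompress(blob).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))

rows = [
    {"class_name": "DetailViewController", "title": f"Item {i}",
     "scroll_offset": str(i * 10), "selected_index": str(i % 3)}
    for i in range(200)
]
compressed, raw_size = encode_property_map(rows)
restored = decode_property_map(compressed)
print(raw_size, "bytes of CSV ->", len(compressed), "bytes compressed")
```

CSV has no types, so numbers have to be converted back by the class that rebuilds the objects; that is a large part of why this approach is hard to manage.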

Related

iOS 5 Data Storage: Core Data, SQL or other options?

I am working on an application for the iPhone (iOS 5). What I have to do is create a map using binary data that I receive from a server. Some parts already work quite well:
I can connect to a server, send requests and receive binary data from it
I can interpret this data, create objects (polygons and paths) from it and draw them within a view
But now it comes to the hard part. The map that I create should be zoomable and movable. So I have to send new requests to the server and redraw the map. This also works nicely, but the data I already received now needs to be stored, because I should not request the same data from the server twice (e.g. if I zoom out and then back in).
Finally here is my question: What would be the best way to store my data? Until now I thought about using CoreData or SQLite. Are there even better solutions? And what data should I save - the binary data or my created objects?
I hope this was understandable and you can help me with at least one of my issues...
Core data is the only way to go.
Core Data is not a storage system; it is an object graph and persistence framework, which can use SQLite to store data.
If you use Core Data you can refactor your project and use NSManagedObject subclasses as your models.
Take a look at Core Data Programming Guide, The differences between Core Data and a Database
Edit:
From Core Data Performance
Core Data is a rich and sophisticated object graph management
framework capable of dealing with large volumes of data. The SQLite
store can scale to terabyte sized databases with billions of
rows/tables/columns. Unless your entities themselves have very large
attributes (although see “Large Data Objects (BLOBs)”) or large
numbers of properties, 10,000 objects is considered to be a fairly
small size for a data set.
It really depends on the size of your data objects and how you access them. If your objects are small, you could store them in Core Data. But, if your map data is coming as images from a bunch of URLs, I would use Core Data to store the mappings to the map image URLs and use NSURLConnection to manage the caching of your objects.
I recommend reading the Apple Core Data Programming Guide Large Data Objects (BLOBs), it discusses the size and number of objects. Some excerpts are below:
The exact definition of "small", "modest", and "large" is fluid and depends on an application's usage. A loose rule of thumb is that objects in the order of kilobytes in size are of a "modest" sized and those in the order of megabytes in size are "large" sized.
For small to modest sized BLOBs (and CLOBs), you should create a separate entity for the data and create a to-one relationship in place of the attribute.
It is better, however, if you are able to store BLOBs as resources on the filesystem, and to maintain links (such as URLs or paths) to those resources. You can then load a BLOB as and when necessary.

using coredata for storing / caching non standard data types

I'm researching the best ways to store types other than the standard ones (string, int16, etc.) on the iPhone.
What I will ultimately be doing is downloading an XML file and storing values such as date, title, name and media URL. I've just discovered the Core Data data model and I believe it would be a good candidate for storing such data, so I don't have to download the XML the next time the app starts.
What I'm unsure of is the limitations (if any) of what I can store in an entity. For example, one of the XML elements would hold a URL to a small piece of audio (less than 1 MB) and a URL to an image. Would it be appropriate to store the audio data and the image as attributes of an entity, or should the entity be kept to strings, ints, etc. and the non-standard types stored elsewhere?
I guess what I'm really asking is, is the data model suitable for caching?
Ultimately what I'm seeking is a solution for storing data on the device in a location that is not tied to any one view, a kind of atomic model with everything I need that I can just dip into no matter which view I'm in.
The data model is suitable for caching, but because you don't have an explicit control of the cache (you can fault a data object but it may remain in the memory), it's recommended to separate very large binary objects. Store them as resources on the filesystem, and manage their links (URLs or paths) in Core Data.
A file under 1 MB seems okay to handle in Core Data, but it also depends on how many of them your application uses.
Also if you do store large files in Core Data, you should use SQLite storage.
The above answer from MHC is good, but if you're storing large binary objects that don't need to be indexed (which can't be done in SQLite anyway), the recommended way is to store the actual data somewhere on the file system (say, in the Documents directory, NSDocumentDirectory), and store the path to the file inside the Core Data entity.
Core Data loads all parts of a fetched object into memory, which for a few instances of entities with binary data could quickly cause you to run out of memory on an iOS device.
If it's stored in the filesystem, you can lazy-load the data just when you need it.
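The pattern looks roughly like this. It is a language-neutral sketch in Python with plain SQLite standing in for the Core Data store; the MediaStore class, its table layout and the file naming are all assumptions made for illustration, not a Core Data API.

```python
import os
import sqlite3
import tempfile

class MediaStore:
    """Keeps small metadata in SQLite and the large payloads as plain files."""

    def __init__(self, directory):
        self.directory = directory
        self.db = sqlite3.connect(os.path.join(directory, "index.sqlite"))
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS media ("
            "id INTEGER PRIMARY KEY, title TEXT, byte_count INTEGER, path TEXT)"
        )

    def add(self, title, payload):
        cursor = self.db.execute(
            "INSERT INTO media (title, byte_count, path) VALUES (?, ?, '')",
            (title, len(payload)),
        )
        media_id = cursor.lastrowid
        # Store the path relative to the store directory, not an absolute one.
        relative_path = f"media-{media_id}.bin"
        with open(os.path.join(self.directory, relative_path), "wb") as handle:
            handle.write(payload)
        self.db.execute(
            "UPDATE media SET path = ? WHERE id = ?", (relative_path, media_id)
        )
        self.db.commit()
        return media_id

    def list_titles(self):
        # Touches only the metadata; no payload is read from disk.
        return [row[0] for row in self.db.execute("SELECT title FROM media ORDER BY id")]

    def load_payload(self, media_id):
        # Lazy load: the file is read only when someone asks for it.
        row = self.db.execute(
            "SELECT path FROM media WHERE id = ?", (media_id,)
        ).fetchone()
        if row is None:
            return None
        with open(os.path.join(self.directory, row[0]), "rb") as handle:
            return handle.read()

    def close(self):
        self.db.close()

with tempfile.TemporaryDirectory() as directory:
    store = MediaStore(directory)
    clip_id = store.add("Intro clip", b"\x00" * 500_000)
    titles = store.list_titles()
    payload = store.load_payload(clip_id)
    store.close()
```

Listing titles never touches the 500 KB payload, which is the whole point of keeping the data out of the fetched object.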

Storing large mutable arrays on iPhone

Okay, I can't seem to find a clear answer to this question of storage on the iPhone. My model class has several ivars and two very large (MB) mutable arrays of data that are collected from an external device and then analyzed. What I'm thinking is that you have data in the object (similar to a note or a music file) and you can save it to a permanent data "file", and then later open an old data "file" and view it (no editing of old data will be done). Alongside this I want another stored object that keeps track of a few key bits of information from each of the data files, and also has references to them (maybe the user could click a data point, and it would open the corresponding data file, if it still exists; it could be deleted by the user to save space).
I see tons of advice recommending all data storage for iPhone apps should use Core Data. The thing is, except for the one side "file", there are no relationships between objects. The objects could be thought of as notes or music files, they don't care about the existence of each other, and there is only one object in existence ("loaded") at a time (either in memory with data being added to it, and to be saved later, or loaded from storage being viewed).
What is the best way to manage this? Currently a device controller (handles the device communication) creates the model, and sends data to it (the model parses and analyzes the data). But should there be some controller that handles files (or Core Data managed objects, whatever), creates the object, saves its data, releases its memory and then loads a new one with stored data?
Any advice would be helpful, as the best storage examples I've found seem to be very relational (employee,boss,company) - which I can see would benefit from a database. But at the same time, manually keeping a list of files in a directory may be more work than some other method.
I can't find a citation in Apple's documentation, but I have read (and been told by Apple engineers) that "large" data objects are sometimes best stored outside of Core Data. The suggested model places BLOBs (Binary Large OBjects) in the file system, with Core Data objects referencing them (i.e., storing relative or absolute file paths).
So, assuming that your BLOBs are music data, your Core Data model might have an entity that holds metadata (e.g., size, time/duration, etc.) as well as a reference to the file that holds the actual data. Your metadata entity could also have relationships with other entities within your system. For example, you might store spectrograms for the music data and have those held by a separate entity.
I wrestled with this issue for data that was being sampled from various measurement sensors. Ultimately, I decided that my data sets were small enough (in most cases) to store with Core Data as NSData properties of a dedicated entity. The wrapping entity was 'dedicated' so as to avoid loading the data just to display metadata to the user.
Update
I found a line about BLOBs in the Core Data Programming Guide, at the end of the "Large Data Objects (BLOBs)" section:
It is better, however, if you are able
to store BLOBs as resources on the
filesystem, and to maintain links
(such as URLs or paths) to those
resources. You can then load a BLOB as
and when necessary.
I'd also recommend using Core Data. While Core Data does make it easy to handle relations, no one is preventing you from using Core Data to store unrelated information. There is no rule against creating models in Core Data that have no relation with each other whatsoever; just don't link them together.
Core Data will handle all the reading/writing to the database, which will save you the trouble of having to parse your own files. There is a bit of a learning curve when trying to use Core Data for the first time, but once you get it running, you'll be thankful it's there.
Unless you need any kind of database access (fast queries, frequent updates, etc.), a database involves too much overhead, in terms of both coding and performance. Database features are not exactly free, so I can think of a lot of scenarios where you will get better results with less effort if you go with one of the following:
multiple files with a separately stored index file;
a large file of your own binary format.
The choice depends on the number of items, the size of items, whether their sizes are identical, how often you need to modify the data and/or the index, etc.
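As a rough illustration of the first option (multiple files plus a separately stored index), here is a sketch in Python. The index layout, the file extension and the "peak" summary value are assumptions chosen for the example; the point is only that the summary can be read without opening any data file, and that a deleted data file is detected when you try to open it.

```python
import json
import os
import tempfile
from array import array

INDEX_NAME = "index.json"

def load_index(directory):
    path = os.path.join(directory, INDEX_NAME)
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)

def save_recording(directory, name, samples):
    """Write samples to their own file and append a summary entry to the index."""
    file_name = f"{name}.f32"
    with open(os.path.join(directory, file_name), "wb") as handle:
        array("f", samples).tofile(handle)  # raw 32-bit floats, native byte order
    index = load_index(directory)
    index.append({
        "name": name,
        "file": file_name,
        "count": len(samples),
        "peak": max(samples) if samples else None,
    })
    with open(os.path.join(directory, INDEX_NAME), "w", encoding="utf-8") as handle:
        json.dump(index, handle)

def load_recording(directory, entry):
    """Return the samples for an index entry, or None if the file was deleted."""
    path = os.path.join(directory, entry["file"])
    if not os.path.exists(path):
        return None
    samples = array("f")
    with open(path, "rb") as handle:
        samples.fromfile(handle, entry["count"])
    return samples.tolist()

with tempfile.TemporaryDirectory() as directory:
    save_recording(directory, "run-1", [1.0, 2.5, 0.5])
    save_recording(directory, "run-2", [4.0, 3.0])
    os.remove(os.path.join(directory, "run-2.f32"))  # simulate the user deleting it
    index = load_index(directory)
    first = load_recording(directory, index[0])
    second = load_recording(directory, index[1])
```

Rewriting the whole index on every save is fine for a handful of recordings; with thousands of entries or frequent updates, that is where a database starts to pay for itself.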

Store XML data in Core Data

Is there an easy way to store XML data in Core Data?
Currently, my app just pulls the values from the XML file directly. However, this isn't efficient for XML files which hold over 100 entries, so storing the data in Core Data would be the best option. The XML file is called/downloaded/parsed every time the app opens.
With Core Data, the XML data would be downloaded every 3600 seconds or so and used to refresh the data currently in Core Data, to reduce the loading time when opening the app.
Any ideas on how I can do this?
Having reviewed the developer documentation, it doesn't look very tasty.
I take it that you mean you have to download an XML file, parse it and then save the data encoded in the file? You have several options for saving such data.
If the data is relatively simple and static, e.g. a repeating list of items, then you might just want to use an NSArray or NSDictionary (or some nested combination) and write the resulting collection to disk as a plist using the collection's writeToFile:atomically: method. (NSSet has no such method, so convert a set to an array first.) Then when the data is needed you just use one of the initWithContentsOfFile: methods. The disadvantage of this system is that you have to read the entire file back into memory to use it. This system doesn't scale for very large data sets.
If the data is complex e.g. a bunch of separate but highly interrelated chunks of data, and moderately large, then Core Data would be better.
Of course, you always have the option of writing the downloaded file straight to disk as a string if you want.
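Whichever store you pick, the refresh logic from the question is the same: keep the parsed entries together with the time they were fetched, and only download again when that time is more than 3600 seconds old. Below is a language-neutral sketch in Python, with SQLite standing in for the store; the XML layout, table names and function names are invented for the example, and the download itself is left out.

```python
import sqlite3
import xml.etree.ElementTree as ET

REFRESH_INTERVAL = 3600  # seconds, as in the question

# Hypothetical feed layout; the real file will differ.
SAMPLE_XML = """<entries>
  <entry><title>First</title><date>2011-01-01</date></entry>
  <entry><title>Second</title><date>2011-01-02</date></entry>
</entries>"""

def open_cache(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS entries (title TEXT, date TEXT)")
    db.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value REAL)")
    return db

def needs_refresh(db, now):
    """True if nothing is cached yet or the cache is at least an hour old."""
    row = db.execute("SELECT value FROM meta WHERE key = 'fetched_at'").fetchone()
    return row is None or now - row[0] >= REFRESH_INTERVAL

def store_feed(db, xml_text, now):
    """Replace the cached entries with the contents of a freshly downloaded feed."""
    root = ET.fromstring(xml_text)
    rows = [(e.findtext("title"), e.findtext("date")) for e in root.iter("entry")]
    with db:  # one transaction: either everything is replaced or nothing is
        db.execute("DELETE FROM entries")
        db.executemany("INSERT INTO entries VALUES (?, ?)", rows)
        db.execute("INSERT OR REPLACE INTO meta VALUES ('fetched_at', ?)", (now,))

def cached_titles(db):
    return [row[0] for row in db.execute("SELECT title FROM entries ORDER BY rowid")]

db = open_cache()
stale_at_start = needs_refresh(db, now=1000.0)
store_feed(db, SAMPLE_XML, now=1000.0)
stale_after_10_min = needs_refresh(db, now=1600.0)
stale_after_an_hour = needs_refresh(db, now=4600.0)
titles = cached_titles(db)
```

On app launch you would show the cached entries immediately and start a download only if the refresh check says the cache is stale.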

XML and SQLite memory utilization and performance on the iPhone

How do the memory utilization and performance for XML or SQLite compare on the iPhone?
The initial data set for our application is 500 records with no more than 750 characters each.
How well would XML compare with SQLite for accessing, say, record 397 without going through the first 396? I know SQLite3 would have better methods for that, but how is the memory utilization?
When dealing with XML, you'll probably need to read the entire file into memory to parse it, as well as write out the entire file when you want to save. With SQLite and Core Data, you can query the database to extract only certain records, and can write only the records that have been changed or added. Additionally, Core Data makes it easy to do batched fetching.
These limited reads and writes can make your application much faster if it is using SQLite or Core Data for its data store, particularly if you take advantage of Core Data's batched fetching. As Graham says, specific numbers on performance can only be obtained by testing under your specific circumstances, but in general XML is significantly slower for all but the smallest data sets. Memory usage can also be much greater, due to the need to load and parse records you do not need at that instant.
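To make the "record 397" case concrete, here is a small sketch in Python (the SQL is the same from any language) that stores 500 made-up records both ways. The record contents and function names are placeholders; it shows the difference in access pattern only and measures nothing.

```python
import sqlite3
import xml.etree.ElementTree as ET

RECORD_COUNT = 500

def build_xml(count):
    root = ET.Element("records")
    for number in range(1, count + 1):
        record = ET.SubElement(root, "record", id=str(number))
        record.text = f"body of record {number}"
    return ET.tostring(root, encoding="unicode")

def find_in_xml(xml_text, wanted_id):
    # The whole document is parsed into memory before any record can be returned.
    root = ET.fromstring(xml_text)
    for record in root.iter("record"):
        if record.get("id") == str(wanted_id):
            return record.text
    return None

def build_database(count):
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT)")
    db.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [(n, f"body of record {n}") for n in range(1, count + 1)],
    )
    db.commit()
    return db

def find_in_database(db, wanted_id):
    # A primary-key lookup walks the table's B-tree instead of scanning every row.
    row = db.execute(
        "SELECT body FROM records WHERE id = ?", (wanted_id,)
    ).fetchone()
    return row[0] if row else None

xml_text = build_xml(RECORD_COUNT)
db = build_database(RECORD_COUNT)
from_xml = find_in_xml(xml_text, 397)
from_db = find_in_database(db, 397)
```

Both calls return the same record, but only the XML version had to build an in-memory tree of all 500 records to get there.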
To find out how the memory usage for your application fares, you need to measure your application :). The Instruments tool will help you.