Creating a JSON Store For iPhone

We have loads of apps where we fetch data from remote web services as JSON and then use a parser to translate that into a Core-Data model.
For one of our apps, I'm thinking we should do something different.
This app has read-only data, which is volatile and therefore not cached locally for very long. The JSON is deeply hierarchical with tons of nested "objects". Documents usually contain no more than 20 top-level items, but could be up to 100K.
I don't think I want to create a Core Data model with 100's of entities, and then use a mapper to import the JSON into it. It seems like such a song and dance. I think I just want to persist the JSON somewhere easy, and have the ability to query it. MongoDB would be fine, if it ran on iPhone.
Is there a JSON document store on the iPhone that supports querying?
Or, can I use some JSON parser to convert the data to some kind of persistent NSDictionary and query that using predicates?
Or perhaps use SQLite as a BLOB store with manually created indexes on the JSON structures?
Or, should I stop whining, and use Core Data? :)
Help appreciated.

When deciding what persistence to use, it's important to remember that Core Data is first and foremost an object graph management system. Its true function is to create the runtime model layer of apps built on the Model-View-Controller design pattern. Persistence is actually a secondary and even optional function of Core Data.
The major modeling/persistence concerns are the size of the data and the complexity of the data. So, the relative strengths and weaknesses of each type of persistence would break down like this:
      _________________________________
      |               |               |
    2 |               |               |
      |      SQL      |   Core Data   | 4
  s   |               |               |
  i   |_______________|_______________|
  z   |               |               |
  e   |               |               |
    1 |  Collection   |   Core Data   | 3
      |   plist/xml   |               |
      |               |               |
      ---------------------------------
                Complexity--->
To which we could add a third, lesser dimension: volatility, i.e. how often the data changes.
(1) If the size, complexity and volatility of the data are low, then using a collection (e.g. NSArray, NSDictionary, NSSet) or a serialized custom object would be the best option. Collections must be read entirely into memory, which limits their effective persistence size. They have no complexity management and all changes require rewriting the entire persistence file.
(2) If the size is very large but the complexity is low, then SQL or another database API can give superior performance. Think of an old-fashioned library index card system: each card is identical, the cards have no relationships between themselves and the cards have no behaviors. SQL or other procedural DBs are very good at processing large amounts of low-complexity information. If the data is simple, then SQL can handle even highly volatile data efficiently. If the UI is equally simple, then there is little overhead in integrating the UI into the object-oriented design of an iOS/MacOS app.
(3) As the data grows more complex, Core Data quickly becomes superior. The "managed" part of "managed objects" manages complexity in relationships and behaviors. With collections or SQL, you have to manually manage complexity and can find yourself quickly swamped. In fact, I have seen people trying to manage complex data with SQL who end up writing their own miniature Core Data stack. Needless to say, when you combine complexity with volatility, Core Data is even better because it handles the side effects of insertions and deletions automatically.
(Complexity of the interface is also a concern. SQL can handle a large, static singular table, but when you add in hierarchies of tables which can change on the fly, SQL becomes a nightmare. Core Data, NSFetchedResultsController and UITableViewController/delegates make it trivial.)
(4) With high complexity and high size, Core Data is clearly the superior choice. Core Data is highly optimized so that increases in graph size don't bog things down as much as they do with SQL. You also get highly intelligent caching.
Also, don't confuse, "I understand SQL thoroughly but not Core Data," with "Core Data has a high overhead." It really doesn't. Even when Core Data isn't the cheapest way to get data in and out of persistence, its integration with the rest of the API usually produces superior results when you factor in speed of development and reliability.
In this particular case, I can't tell from the description whether you are in case (2) or case (4). It depends on the internal complexity of the data AND the complexity of the UI. You say:
"I don't think I want to create a Core Data model with 100's of entities, and then use a mapper to import the JSON into it."
Do you mean actual abstract entities here or just managed objects? Remember, entities are to managed objects what classes are to instances. If the former, then yes, Core Data will be a lot of work up front; if the latter, then it won't be. You can build up very large, complex graphs with just two or three related entities.
Remember also that you can use configurations to put different entities into different stores even if they all share a single context at runtime. This lets you put temporary info into one store, use it like more persistent data and then delete the store when you are done with it.
Core Data gives you more options than might be apparent at first glance.
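To illustrate the entity-versus-instance point above: arbitrarily nested JSON can be held by a single generic node type, much as one entity with a to-many relationship to itself could hold it. What follows is only a language-neutral sketch in Python, not Core Data code; `Node`, `build`, `find` and `count` are hypothetical names introduced here, and a real model would probably want a few more specific entities.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Optional


@dataclass
class Node:
    """The single 'entity': every JSON value becomes one instance of it."""
    key: Optional[str]                 # name within the parent, None for the root
    value: Any = None                  # scalar payload, None for containers
    children: List["Node"] = field(default_factory=list)


def build(value: Any, key: Optional[str] = None) -> Node:
    """Recursively turn parsed JSON into a tree of Node instances."""
    if isinstance(value, dict):
        return Node(key, children=[build(v, k) for k, v in value.items()])
    if isinstance(value, list):
        # Array elements are keyed by their position.
        return Node(key, children=[build(v, str(i)) for i, v in enumerate(value)])
    return Node(key, value=value)


def find(node: Node, key: str) -> Iterator[Node]:
    """Yield every node with the given key, depth first."""
    if node.key == key:
        yield node
    for child in node.children:
        yield from find(child, key)


def count(node: Node) -> int:
    """Number of instances in the graph rooted at node."""
    return 1 + sum(count(child) for child in node.children)


doc = json.loads('{"store": {"name": "A", "items": [{"name": "x"}, {"name": "y"}]}}')
root = build(doc)
names = [n.value for n in find(root, "name")]
```

One type yields as many instances as the document has values, which is the sense in which a model with two or three entities can still describe a very large graph.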

I use SBJson to parse JSON to NSDictionaries then save them as .plist files using [dict writeToFile:saveFilePath atomically:YES]. Loading is just as simple: NSDictionary *dict = [NSDictionary dictionaryWithContentsOfFile:saveFilePath]. It's fast, efficient and easy. No need for a database.
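The same round trip (parse JSON, write a property list, read it back) can be sketched outside of Cocoa. The version below uses Python's standard `json` and `plistlib` modules purely as an illustration; `strip_nulls`, `save_json_as_plist`, `load_plist` and the sample payload are made up for this example. One thing it makes visible is that property lists have no representation for JSON null. As far as I know the same applies on iOS, where a parsed null arrives as NSNull and writeToFile:atomically: returns NO for a dictionary that contains one, so nulls are dropped before saving here.

```python
import json
import plistlib
import tempfile
from pathlib import Path


def strip_nulls(value):
    """Drop JSON nulls, which a property list cannot store."""
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value if v is not None]
    return value


def save_json_as_plist(json_text: str, path: Path) -> None:
    """Parse the JSON text and write the result as an XML property list."""
    data = strip_nulls(json.loads(json_text))
    with path.open("wb") as fh:
        plistlib.dump(data, fh)


def load_plist(path: Path):
    """Read the whole property list back into memory."""
    with path.open("rb") as fh:
        return plistlib.load(fh)


payload = '{"title": "Listing", "price": 250000, "tags": ["a", null], "agent": null}'
save_path = Path(tempfile.mkdtemp()) / "cache.plist"
save_json_as_plist(payload, save_path)
restored = load_plist(save_path)
```

Note that the whole file is read into memory on load, so this suits small, short-lived caches rather than large documents.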

JSON Framework is one. It'll turn your JSON into native NSDictionary and NSArray objects. I don't know anything about its performance on a large document like that, but lots of people use it and like it. It's not the only JSON library for iOS, but it's a popular one.
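Once the JSON is parsed, the "SQLite as a BLOB store with manually created indexes" idea from the question is also fairly small. The sketch below uses Python's standard `sqlite3` module only to show the table layout; on iOS the same SQL would go through the SQLite C API or a wrapper. The table name, the `kind` and `name` fields and the helper functions are all assumptions made for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL,   -- field copied out of the JSON by hand
        name TEXT NOT NULL,   -- another hand-picked field
        body TEXT NOT NULL    -- the untouched JSON text
    );
    -- The "manually created index": only the copied columns are searchable.
    CREATE INDEX idx_documents_kind_name ON documents (kind, name);
""")


def store(conn, raw_json):
    """Keep the raw JSON and copy the fields we want to query into columns."""
    doc = json.loads(raw_json)
    conn.execute(
        "INSERT INTO documents (kind, name, body) VALUES (?, ?, ?)",
        (doc["kind"], doc["name"], raw_json),
    )


def find_by_kind(conn, kind):
    """Query on the indexed column, then parse only the matching blobs."""
    rows = conn.execute(
        "SELECT body FROM documents WHERE kind = ? ORDER BY name", (kind,)
    )
    return [json.loads(body) for (body,) in rows]


for raw in (
    '{"kind": "fruit", "name": "lime", "detail": {"color": "green"}}',
    '{"kind": "book", "name": "atlas", "detail": {"pages": 300}}',
    '{"kind": "fruit", "name": "apple", "detail": {"color": "red"}}',
):
    store(conn, raw)

fruit = find_by_kind(conn, "fruit")
```

The trade-off is that anything not copied into a column can only be searched by parsing every blob, so this works best when the set of query fields is small and known in advance.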

Related

Can we use Core Data to store a real estate property?

I need to know what I should use to store the property.
I'm still confused. Should I use SQLite or Core Data?
I will have a lot of data, user data and real estate data. It will also appear in the map.
Core Data can use SQLite as its storage facility. It's not an actual database, but rather an object graph manager.
SQLite, on the other hand, is a database. It follows a different methodology, can be used efficiently for small chunks of data (I use it extensively), and is mostly used to implement a pure RDBMS (with primary/foreign keys, unions, and the like, all SQL powered).
For Core Data, you use graph relationships (which means objects connected to other objects by references).
The outcome is the same, but the programming is different, depending on the complexity you want to have (or have actually designed) in your app, so lay down your plans to see which one suits you. For SQLite I recommend the FMDB wrapper, since it's easier than doing 2 to 3 checks for every SQL statement.
Your app seems interesting, and since it's real estate based you might want to spice it up a little bit later with a small technology called "augmented reality" :)

Caching EAV data - XML or NoSQL / MongoDB?

I'm building a web app that relies heavily on the EAV pattern for storing data. This basically means that each attribute of an object has its own row in a massive database table. I'm using MySQL to store everything. This is a very simplified example of what I'm storing...
OBJECTS            ATTRIBUTES
objId | type       objId | attribute | value
=============      =========================
1     | fruit      1     | color     | green
2     | fruit      1     | shape     | round
3     | book       2     | color     | red
I know some people hate EAV, but I need to be able to add new object attributes arbitrarily without modifying the database schema, and it's working very well for me so far.
As I think anyone else finds when building a system using an EAV data structure, the weakness of this approach is the retrieval of multiple objects together with each object's attributes. At the moment my app only displays 10 objects at a time, so I just query my EAV table 10 times (once for each object) and it's still very fast. However, I'd like to remove this limitation and allow hundreds of objects to be fetched in one go. I also want to be able to query objects in a more flexible way than I'm doing currently.
Doing this with SQL joins would be hideous, so I'm considering caching the data. On average the database gets about 300 reads for every 1 write, so I think it's a good candidate for caching.
So far these are the options I've come up with...
XML database column: Every time a write is performed, update an XML text column in the objects table containing all the object's attributes. This would work for reading the data quickly, but querying XML data hidden in a database table is messy.
XML file: Every time a write is performed, write an XML file to disk which contains each object and its attributes. This has the benefit that I can then use XQuery to query the objects.
NoSQL (eg. MongoDB): Perhaps I should have built the system on a schemaless database like MongoDB. Re-writing the entire app to use MongoDB would be quite time consuming, but it struck me that I could use it as a cache. So for example, every time data is written to the EAV store, the equivalent object would be updated in MongoDB which would then be used for reads and queries.
Originally I thought an XML file would be the best approach, but I can see the file getting really big and unmanageable. At the moment I'm leaning towards using MongoDB. I know it seems crazy running two database servers for one app, but I think it could work in my case.
I'd love to hear your thoughts on this.
I see only two ways, both of which were mentioned in the comments.
First, you can really migrate to a document-oriented db like Mongo, which is a suitable alternative to EAV. Since there will be no JOINs or other such logic, it will be very fast and easy to scale. (So perhaps you'll be able to avoid using a cache at all.)
Second, you can use a dedicated caching tool like Redis, Mongo or Memcached to save every query result for some time.
But I want to turn our attention to the future of this system. What load and scaling are planned?
If you want to reduce system load, I think the best way is to migrate to a document-oriented db.
Or, if you want results immediately (cached data for reading), that can be achieved with a caching tool, possibly even at the network level (for example, nginx supports memcached out of the box).
So, as usual, you should find a balance between one-time and continuous costs.
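Whichever cache is chosen, the step both options share is turning EAV rows into one document per object. Below is a minimal sketch of that step in Python, assuming the two tables from the question; an in-memory SQLite database stands in for MySQL and a plain dict stands in for MongoDB, Redis or Memcached, and the function names are hypothetical. It fetches any number of objects with a single join and refreshes one cached document on every write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # stand-in for the MySQL EAV store
conn.executescript("""
    CREATE TABLE objects (objId INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE attributes (objId INTEGER, attribute TEXT, value TEXT);
    INSERT INTO objects VALUES (1, 'fruit'), (2, 'fruit'), (3, 'book');
    INSERT INTO attributes VALUES
        (1, 'color', 'green'), (1, 'shape', 'round'), (2, 'color', 'red');
""")


def load_documents(conn, obj_ids):
    """One query for all requested objects, pivoted into dicts keyed by objId."""
    obj_ids = list(obj_ids)
    if not obj_ids:
        return {}
    placeholders = ",".join("?" for _ in obj_ids)
    rows = conn.execute(
        "SELECT o.objId, o.type, a.attribute, a.value"
        " FROM objects o LEFT JOIN attributes a ON a.objId = o.objId"
        f" WHERE o.objId IN ({placeholders})",
        obj_ids,
    )
    docs = {}
    for obj_id, obj_type, attribute, value in rows:
        # Assumes no attribute is itself named "_id" or "type".
        doc = docs.setdefault(obj_id, {"_id": obj_id, "type": obj_type})
        if attribute is not None:    # objects without attributes still appear
            doc[attribute] = value
    return docs


def set_attribute(conn, cache, obj_id, attribute, value):
    """Write to the EAV store first, then rebuild that object's cached document."""
    conn.execute(
        "DELETE FROM attributes WHERE objId = ? AND attribute = ?",
        (obj_id, attribute),
    )
    conn.execute("INSERT INTO attributes VALUES (?, ?, ?)", (obj_id, attribute, value))
    cache.update(load_documents(conn, [obj_id]))


cache = load_documents(conn, [1, 2, 3])   # stand-in for the document cache
set_attribute(conn, cache, 3, "title", "Dune")
```

With roughly 300 reads per write, rebuilding one document on each write is cheap compared with pivoting rows on every read.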

What's the best way to store static data in an iOS app?

I have in my app a considerable amount of data that it needs to access, but will never be changed by the app. Currently I'm using this data in other applications in JSON files and SQL databases, but neither seems very straightforward to use in iOS.
I don't want to use CoreData, which provides tons of unnecessary functionality and complexity.
Would it be a good idea to store the data in a property list file and build an accessor class? Are there any simple ways to incorporate SQLite without going the CoreData route?
You can only use a plist if the amount of data is relatively small. Plists are entirely loaded into memory, so you can only really use them if you can sustain all the objects created by the plist in memory at once for as long as you need them.
Core Data has a learning curve but in use it is usually less complex than SQL. In most cases the "simpler" SQL leads to more coding because you end up having to duplicate much of the functionality of Core Data to shoehorn the procedural SQL into the object-oriented API. You have to manually manage the memory use of all the data by tracking retention. You have to write a lot of SQL code every time you want data. I've updated several apps from SQL to Core Data and in all cases the Core Data implementation was smaller and cleaner than the SQL.
Neither is the memory or processor "overhead" any larger. Core Data is highly optimized. In most cases, off the shelf Core Data is more efficient than hand tuned SQL. One minor sub optimization in SQL usually destroys any theoretical advantage it might have.
Of course, if you're already highly skilled at managing SQL in C then you personally might get the app to market more quickly by using SQL. However, if you're wondering what you should plan to use in general on Apple platforms, Core Data is almost always the answer and you should take the time to learn it.
You can just use SQLite directly without the overhead of Core Data using the SQLite C API.
Here is a tutorial I found on your use-case - simply loading some data from an SQLite database. Hope this helps.
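For data the app never changes, the database can be built ahead of time and opened read-only. The sketch below shows that pattern with Python's standard `sqlite3` module, only because it is short and runnable; with the SQLite C API the equivalent is opening the file with sqlite3_open_v2 and the SQLITE_OPEN_READONLY flag. The `items` table, its columns and the `item_named` helper are invented for the example.

```python
import sqlite3
import tempfile
from pathlib import Path

db_path = Path(tempfile.mkdtemp()) / "static.sqlite"

# Build step, done once ahead of time: create and fill the database file.
build = sqlite3.connect(str(db_path))
build.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, weight INTEGER)")
build.executemany(
    "INSERT INTO items (name, weight) VALUES (?, ?)",
    [("anvil", 50), ("feather", 1)],
)
build.commit()
build.close()

# App side: open the shipped file read-only via an SQLite URI filename.
ro = sqlite3.connect(db_path.as_uri() + "?mode=ro", uri=True)


def item_named(conn, name):
    """Small accessor that hides the SQL from the rest of the app."""
    row = conn.execute(
        "SELECT id, name, weight FROM items WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else {"id": row[0], "name": row[1], "weight": row[2]}


item = item_named(ro, "feather")

# Writes are refused on a read-only connection.
try:
    ro.execute("DELETE FROM items")
    write_rejected = False
except sqlite3.OperationalError:
    write_rejected = True
```

Unlike a plist, only the rows a query touches are read, so the size of the file matters much less.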
Depending on the type of your data, the size and how often it changes, you may desire to just keep things simple and use a property list. Otherwise, using SQLite (documented in Jergason's answer) would be where I'd go. Though let me say that if you have a relatively small (less than a couple hundred) set of basic types (arrays, dictionaries, numbers, strings) that don't change frequently, then a property list will be a better choice in my opinion.
As an example to that, in one of my games, I create the levels from a single property list per difficulty. Since there are only a handful of levels per difficulty (99) and a small set of parameters for each (number of elements in play, their initial positions, mass, etc) then it makes sense, and I avoid having to deal with SQLite directly or worse yet, setting up and maintaining CoreData.
What do you mean by "best"? What kind of data?
If it's a bunch of objects, then JSON or (binary) plist aren't terrible formats, since you'll want the whole thing loaded in memory to walk the object graph. Compare space efficiency and loading performance to pick which one to use.
If it's a bunch of binary blobs, then store the blobs in a big file, memory-map the file (NSDataReadingMapped a.k.a. NSMappedRead), and use indexes into the blobs. iOS frameworks use a mixture of these (e.g. there are a lot of .pngs, but also "other.artwork" which just contains raw image data).
You can also use NSKeyedArchiver and friends if your classes implement the NSCoding protocol, but there's some object graph management overhead and the plist format it produces isn't exactly nice to work with.
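The "big file of blobs plus indexes" layout mentioned above can be sketched in a few lines. This is a Python illustration using the standard `mmap` module, not the iOS API (there, the mapping would come from NSData with NSDataReadingMapped); the file name, the `pack` function and the `BlobFile` class are assumptions, and the index is kept in memory here although in practice it would be stored alongside the file.

```python
import mmap
import tempfile
from pathlib import Path


def pack(blobs, path):
    """Write all blobs back to back; return {name: (offset, length)}."""
    index = {}
    offset = 0
    with open(path, "wb") as fh:
        for name, data in blobs.items():
            fh.write(data)
            index[name] = (offset, len(data))
            offset += len(data)
    return index


class BlobFile:
    """Read-only view of a packed blob file through a memory map."""

    def __init__(self, path, index):
        self._fh = open(path, "rb")
        self._map = mmap.mmap(self._fh.fileno(), 0, access=mmap.ACCESS_READ)
        self._index = index

    def __getitem__(self, name):
        offset, length = self._index[name]
        # Only the pages backing this slice need to be read from disk.
        return self._map[offset:offset + length]

    def close(self):
        self._map.close()
        self._fh.close()


path = Path(tempfile.mkdtemp()) / "blobs.bin"
index = pack({"icon": b"\x89PNG-icon", "logo": b"\x89PNG-logo-bytes"}, path)
blobs = BlobFile(path, index)
logo = blobs["logo"]
blobs.close()
```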

What type of data storage should I use if I have a list of data that contains 100 objects and each object has its own data?

My plan is to display a list of items alphabetically in a table view that has about 100 items. Each item has an image, a list of times and a description that the tableview will drill down to. What I am struggling with is the correct way to store and load this data. Some have told me that a plist will be too data heavy and that core data is too new. Should I just create arrays?
You're not clear about what you intend to do with this data. Plists and Core Data are both persistence formats (on disk). Arrays are an in-memory format (and can also be slapped onto disk, I suppose, if that's what you want to do, but inventing your own binary disk format is only something you should consider very rarely, and certainly not in the case you probably have).
In memory, you can probably just use an array (NSArray) and have each element perhaps be an NSDictionary of the other properties relative to that entry. That sounds like the model of your MVC design, which you can then hook up to the table view.
As far as persisting this to disk, it depends on whether 100 items is a fixed amount, a ballpark, or a minimum, etc. Plists (see NSKeyedArchiver) are great for all the data except possibly the raw image data-- you might want to keep those "to the side" as separate image files with filenames in the plist.
I don't know much about Core Data, but it's not that new, and it's not untested, so if it does what you want without much hassle, go for it.
Serialize it into an Archive using NSCoding Protocol. See Guide.
I'd use an NSArray of business objects implementing NSCoding and then just archive them.
I usually default to Core Data unless I have a compelling reason not to. (Of course, I have learned Core Data so that makes it easy for me to do this.)
Core Data has the following advantages:
It has an editor in which you can create complex object graphs easily
It can generate custom classes for your data model objects.
The generated classes are easily modified.
Core Data manages the complexity of adding, deleting and saving objects.
Core Data makes persisting an object graph almost invisible.
NSFetchedResultsController is custom designed to provide data for tables.
100 objects is a small graph that Core Data can handle easily. It's a lot easier to use Core Data than it is to write custom coders to archive custom objects. For example, at present, I have an app with over a dozen major entities each with two or three relationships to other entities. Hand coding all that would be a nightmare.
Core Data has something of a steep learning curve, especially if you've never worked with object graphs before, but if you're planning on writing a lot of Apple platform software, learning it is well worth the time.

Cocoa Touch Data Persistence

I'm experimenting with Core Data, plist files, flat files and sqlite.
I can't seem to differentiate in terms of efficiency for small data sets.
In terms of the differences on the surface (i.e. the API), I know the difference.
But what I'm trying to get a feel for is which persistence model is best for which situation.
For small data sets, if you need read-write capability, you should go with NSUserDefaults: it gives you the power of key-value storage and retrieval without too much hassle.
If you need read-only access, plist files are a viable option, as it keeps the abstraction to the concept of key-value and offers an accessible API to work with.
Flat files would be recommended if you need a different model of persistence than key-value, otherwise it would mean just reinventing the wheel.
SQLite would fit the case where your data is organized in a strongly relational manner and, instead of key-value access, you'd prefer having the power of SQL to work directly with your data.
If, for your dataset, however small it may be, managing the low-level storage and retrieval would be an unnecessary inconvenience, then you could choose CoreData. With CoreData, code can retrieve and manipulate data on a purely object level without having to worry about the details of storage and retrieval, so you can focus on your domain logic rather than fitting it to the storage and data manipulation logic.