I have some basic questions about core data (which I am new to) and I would like some points of view on current standards and implementations.
Basically, I have an iPhone app (supporting iOS 3.0 and above) that gets a lot of data from web calls over HTTP. I'm looking at moving the results into local storage for fast retrieval the next time the user loads the same data (the data doesn't change, which is why I can rely on the cached version being accurate).
I just wanted to know a few things first:
Do people these days treat the managed objects that extend NSManagedObject as domain objects, or do you create separate classes strictly for storage and write helper methods to convert them into domain objects? I sometimes find keeping all persistence logic out of the domain to be a good thing.
What about cleanup? How does one typically delete all the data when the app closes, or perhaps expire data in the local storage? I certainly don't want to hold the data on the user's phone at all times.
Is there any type of atomicity with Core Data? My implementation will first check for data locally before hitting the web services. I would like to make sure that half a dataset is never committed to local storage, giving funny results.
I would like to run a fair few background threads to fetch data. Are there any things I need to consider when persisting objects on a background thread?
In relation to the above question, what is the best way to create a "background fetching" loop? In the app delegate? Per view, depending on the view? etc...?
I hope these are not too basic :)
Thanks for any help you can give.
Do people these days treat the managed objects that extend NSManagedObject as domain objects, or do you create separate classes strictly for storage and create helper methods to create them into domain objects? I sometimes find keeping all persistence logic out of the domain to be a good thing.
If you create totally independent domain objects, you have the cost of keeping them in sync with your Core Data model, and of keeping the translation between Core Data and these objects working. You also have duplicate objects in memory, which might be a concern depending on how many objects you have.
However the benefit side of using separate domain objects is that you are no longer wedded to a managed object context. One case where something like that can hurt you is if you maintain references to managed objects and then some background operation causes the main managed object context to remove objects - now if you access any property in the deleted managed object, you trigger a fault exception (even if you have explicitly had the object loaded with no faulted data).
One thing I have tried with moderate success is occasional very lightweight separate data objects for specific uses - what I did was to define a protocol to represent the data object accessors, with the same names as the core data accessors. Then I had both the core data object and a custom standalone data object implement this protocol, and had a mechanism to automatically copy properties from one to the other. So I didn't do every object as custom, and could treat objects either coming from the local store or standalone interchangeably.
I don't have a clear preference on this one yet but lean to using the managed objects directly, because of the lack of duplication. You can mitigate bad side effects by listening for changes or using the core data controller class.
One thing that helps to keep domain objects and data objects sort of the same yet not, is using mogenerator to generate data objects. It generates very nice object representations of the objects in your core data store, plus front-end objects you are meant to edit - adding custom accessors, or complex methods to. On changing the data store mogenerator regenerates the data object but leaves your custom code alone.
http://rentzsch.github.com/mogenerator/
What about clean up? How does one typically delete all the data when the app closes, or perhaps, expire data in the local storage? I certainly don't want to hold the data on the user's phone at all times.
The data is generally small enough that I just leave it there, with an expiration timestamp for use so that you know when the data is too old to use directly. There is a ton of value to keeping data around since users close and reopen applications so frequently, and with data already there you can present results instantly while still fetching content updates.
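To make the expiration-timestamp idea concrete, here is a minimal, language-neutral sketch in Python rather than Core Data code. The names `make_entry`, `is_fresh`, `purge_expired` and the one-day `MAX_AGE_SECONDS` are all assumptions for illustration; in a real app the timestamp would simply be a date attribute on the entity.

```python
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # hypothetical: treat entries older than a day as stale


def make_entry(payload, now=None):
    """Wrap a payload with the time it was fetched."""
    fetched_at = time.time() if now is None else now
    return {"payload": payload, "fetched_at": fetched_at}


def is_fresh(entry, now=None, max_age=MAX_AGE_SECONDS):
    """True if the entry is young enough to use without refetching."""
    current = time.time() if now is None else now
    return (current - entry["fetched_at"]) <= max_age


def purge_expired(cache, now=None, max_age=MAX_AGE_SECONDS):
    """Drop stale entries in place and return the keys that were removed."""
    stale = [key for key, entry in cache.items() if not is_fresh(entry, now, max_age)]
    for key in stale:
        del cache[key]
    return stale
```

A stale entry can still be shown instantly while a refresh is in flight; `purge_expired` is only needed if you really want old data gone.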
Is there any type of atomicity with Core Data? My implementation will first check for data locally before hitting the web services, I would like to make sure that there is never half a dataset being committed to the local storage and get funny results.
The atomicity comes in that you perform operations in a context and then tell the context to save. So true atomicity means avoiding other components issuing a save before you are ready, which generally means doing something in its own context and merging back into a master context.
I would like to run a fair few background threads to fetch data in the background, are there any things I would need to consider when persisting objects on a background thread?
Every background thread needs its own context. You should listen for the save notification and merge into the master context at that time.
You should strive mightily to avoid duplicate requests that might save to the same objects at nearly the same time, as this can sometimes cause Core Data errors on merge. Related to that: set a merge policy on your main context, since the default policy (NSErrorMergePolicy) makes a conflicting save fail with an error.
That also means that in doing modeling, go for as many separate objects as you possibly can rather than one large object that aggregates data from a lot of different sources.
For more information on saving and merging into other contexts, see this question:
CoreData and mergeChangesFromContextDidSaveNotification
In relation to the above question, what is the best way to create a "background fetching" loop? In the app delegate? Per view, depending on the view? etc...?
I like to do this from a separate singleton class (after all, the AppDelegate itself is a singleton...), that I can ask for the main managed object context in addition to a context specific to a thread.
This is also useful in that when starting a new Core Data project, you don't have to use the Core Data template and can just re-use this core data manager.
Related
Can people give me examples of why they would use coreData in an application?
I ask this because most apps are just clients to a central server where an API of some sort gives you the information you need.
In my case I'm writing a timesheet application for a web app which has an API, and I'm debating whether there is any value in replicating the data structure on my server in Core Data (SQLite),
e.g.
Project has many timesheets
employee has many timesheets
It seems to me that I can just connect to the API on every call for lists of projects or existing timesheets for example.
I realize that for some kind of offline mode you could store data locally in Core Data, but this creates way more problems, because you now have a big problem with syncing that data back to the web server when you get a connection again, e.g. the project selected for a timesheet no longer exists.
Can any experienced developer shed some light on their experience of when Core Data is the best-practice approach?
EDIT
I realise of course there is value in local persistence, but the key-value storage of user defaults seems to cover most applications I can think of.
You shouldn't think of CoreData simply as an SQLite database. It's not JUST an SQLite database. Sure, SQLite is an option, but there are other options as well, such as in-memory and, as of iOS5, a whole slew of custom data stores. The biggest benefit with CoreData is persistence, obviously. But even if you are using an in-memory data store, you get the benefits of a very well structured object graph, and all of the heavy lifting with regards to pulling information out of or putting information into the data store is handled by CoreData for you, without you necessarily needing to concern yourself with what is backing that data store.
Sure, today you don't care too much about persistence, so you could use an in-memory data store. What happens if tomorrow, or in a month, or a year, you decide to add a feature that would really benefit from persistence? With CoreData, you simply change or add a persistent data store, and all of your methods to get information out or in remain unchanged. The overhead for that sort of addition is minimal in comparison to if you were trying to access SQLite or some other data store directly. IMHO, that's the biggest benefit: abstraction. And, in essence, abstraction is one of the most powerful things behind OOP.
Granted, building the Data Model just for in-memory storage could be overkill for your app, depending on how involved the app is. But, just as a side note, you may want to consider what is faster: requesting information from your web service every time you want to perform some action, or requesting the information once, storing it in memory, and acting on that stored value for the remainder of the session. An in-memory data store wouldn't persist beyond that particular session.
Additionally, with CoreData you get a lot of other great features like saving, fetching, and undo-redo.
There are basically two kinds of apps. Those that provide you with local functionality (games, professional applications, navigation systems...) and those that grant access to a remote service.
Your app seems to be in the second category. If you access remote services, your users will want to access new or real-time data (you don't want to read 2 week old Facebook posts) but in some cases, local caching makes sense (e.g. reading your mails when you're on the train with unstable network).
I assume that the value of accessing cached entries when not connected to a network is pretty low for your customers (internal or external) compared to the importance of accessing real-time-data. So local storage might be not necessary at all.
If you don't have hundreds of entries in your timetable, "normal" serialization (NSCoding-protocol) might be enough. If you only access some "dashboard-data", you will be able to get along with simple request/response-caching (NSURLCache can do a lot of things...).
Core Data does make more sense if you have complex data structures which should be synchronized with a server. This adds a lot of synchronization logic to your project as well as complexity from Core Data integration (concurrency, thread-safety, in-app-conflicts...).
If you want to create a "client"-app with a server driven user experience, local storage is not necessary at all so my suggestion is: Keep it as simple as possible unless there is a real need for offline storage.
It's ideal for if you want to store data locally on the phone.
Seriously though, if you can't see a need for it for your timesheet app, then don't worry about it and don't use it.
Solving the sync problems that you would have with an "offline" mode would be detailed in your design of your app. For example - don't allow projects to be deleted. Why would you? Wouldn't you want to go back in time and look at previous data for particular projects? Instead just have a marker on the project to show it as inactive and a date/time that it was made inactive. If the data that is being synced from the device is for that project and is before the date/time that it was marked as inactive, then it's fine to sync. Otherwise display a message and the user will have to sort it.
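The "mark as inactive instead of deleting" rule above boils down to a small check at sync time. Below is a rough Python sketch of that check only; the dictionary shapes and the names `can_sync` and `partition_for_sync` are made up for illustration and are not part of any API mentioned here.

```python
from datetime import datetime


def can_sync(entry_time, inactive_since=None):
    """An entry syncs if its project is active, or the entry predates deactivation."""
    if inactive_since is None:
        return True
    return entry_time < inactive_since


def partition_for_sync(entries, projects):
    """Split offline entries into (accepted, rejected) lists.

    `projects` maps a project name to the datetime it was made inactive,
    or None while it is still active (an assumed representation).
    """
    accepted, rejected = [], []
    for entry in entries:
        inactive_since = projects[entry["project"]]
        if can_sync(entry["time"], inactive_since):
            accepted.append(entry)
        else:
            rejected.append(entry)  # show a message and let the user sort it out
    return accepted, rejected
```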
Whether you need to store data locally depends purely on your application's design: whether it does real work of its own or is a thin GUI client around your web service. Apart from "offline" mode, the other reason to cache server data on the client side might be to take traffic load off your server. Just think about what it means for your server to send the whole timesheet data to the client every time, rather than just the changes. Yes, it means more implementation on both sides, but in some cases it has serious advantages.
EDIT: example added
You have 1000 records per user in your timesheet application and one record is about 1 KB. In this case, every time a user starts your application, it has to fetch ~1 MB of data from your server. If you cache the data locally, the server can tell you that, let's say, two records were updated since your last update, so you only have to download 2 KB. Now scale this up to several tens of thousands of users and you will immediately notice the difference in server bandwidth and CPU usage.
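The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: it assumes 1 KB means 1000 bytes, and the `updated_at` field and `changed_since` helper are hypothetical stand-ins for however your server tracks modifications.

```python
RECORD_SIZE_BYTES = 1_000  # "about 1 KB per record", taken as 1000 bytes
RECORDS_PER_USER = 1_000


def full_download_bytes(record_count=RECORDS_PER_USER, record_size=RECORD_SIZE_BYTES):
    """Bytes transferred when the client refetches everything."""
    return record_count * record_size


def changed_since(records, last_sync):
    """Records whose 'updated_at' is newer than the client's last sync."""
    return [r for r in records if r["updated_at"] > last_sync]


def delta_download_bytes(records, last_sync, record_size=RECORD_SIZE_BYTES):
    """Bytes transferred when only changed records are sent."""
    return len(changed_since(records, last_sync)) * record_size
```

With two changed records out of 1000, the full fetch is 1,000,000 bytes and the delta is 2,000 bytes, a factor of 500 per app launch.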
We've got some data coming into our app. Sometimes it will be saved, so we've made an entity and a NSManagedObject subclass for it. Usually, though, the objects will be instantiated and never saved. I'm thinking of using another persistent store, with the NSInMemoryStoreType, as a staging area, then moving the ones we want to save into the sqlite store. Is that possible/sensible?
If it is, I'd like to clear out the staging area every so often. Is there a way to clear out just the objects assigned to the memory store?
You should read this lengthy blog post on temporary Core Data objects. It's very insightful.
http://www.cimgf.com/2011/08/08/transient-entities-and-core-data/
Can you not use the 'scratch pad' / Undo Management properties of core data?
http://developer.apple.com/library/mac/documentation/cocoa/conceptual/coredata/Articles/cdUsingMOs.html#//apple_ref/doc/uid/TP40001803-207821-TPXREF148
I come to iPhone programming from a web development paradigm and am having a bit of a problem understanding how to design my iPhone application.
The crux of my question is: How much data do you load into your model and when do you load it with data from the database?
In the web apps I've created, the objects on the server-side are filled by the database based off form values supplied by each request. Take the example of a simple list. You click a list value, the id for the list is sent to the server (query string), the server loads an object for just that list item, server-side code uses the object, and then destroys it before the page is returned to the user.
With iPhone apps (or I guess any app where objects persist), you could load all the list item objects into a singleton dictionary from the database before the user ever interacts with them. Then you never have to go back to the database when the user clicks on a link. You just load the object from the dictionary.
Alternatively, you could design it like a web app and just go back to the database each time and fill the object with the data requested.
Can you give me any guidance on when to use one way over the other? When do I load the data? I'm tempted to just load a bunch of data when the application starts up so that I never have to go back to the database. But this feels dirty.
For static data that isn't too large, loading it all at startup works.
In one of our products, we do this for simplicity on one of the tables (we don't expect more than a few thousand rows) and load the other table lazily (high-res images). This is a reasonable option if you don't have background threads also accessing the database.
Core Data does batched lazy loading (i.e. it will load a batch of result rows at once).
Sidenote: Writes using Core Data and an SQLite store seem exceptionally slow, to the extent that we moved processing to a background thread to avoid blocking the UI (and this is for not very much data at all) and gained some annoying concurrency issues as a result. Sigh.
I am writing an app for iOS that uses data provided by a web service. I am using core data for local storage and persistence of the data, so that some core set of the data is available to the user if the web is not reachable.
In building this app, I've been reading lots of posts about core data. While there seems to be lots out there on the mechanics of doing this, I've seen less on the general principles/patterns for this.
I am wondering if there are some good references out there for a recommended interaction model.
For example, the user will be able to create new objects in the app. Let's say the user creates a new employee object: the user will typically create it, update it and then save it. I've seen recommendations to send each of these steps to the server: when the user creates it, and when the user makes changes to the fields. If the user cancels at the end, a delete is sent to the server. A different recommendation for the same operation is to keep everything local, and only send the complete update to the server when the user saves.
This example aside, I am curious if there are some general recommendations/patterns on how to handle CRUD operations and ensure they are sync'd between the webserver and coredata.
Thanks much.
I think the best approach in the case you mention is to store data only locally until the point the user commits the adding of the new record. Sending every field edit to the server is somewhat excessive.
A general idiom of iPhone apps is that there isn't such a thing as "Save". The user generally will expect things to be committed at some sensible point, but it isn't presented to the user as saving per se.
So, for example, imagine you have a UI that lets the user edit some sort of record that will be saved to local core data and also be sent to the server. At the point the user exits the UI for creating a new record, they will perhaps hit a button called "Done" (N.B. not usually called "Save"). At the point they hit "Done", you'll want to kick off a core data write and also start a push to the remote server. The server push won't necessarily hog the UI or make them wait till it completes -- it's nicer to allow them to continue using the app -- but it is happening. If the update push to server failed, you might want to signal it to the user or do something appropriate.
A good question to ask yourself when planning the granularity of writes to core data and/or a remote server is: what would happen if the app crashed out, or the phone ran out of power, at any particular spots in the app? How much loss of data could possibly occur? Good apps lower the risk of data loss and can re-launch in a very similar state to what they were previously in after being exited for whatever reason.
Be prepared to tear your hair out quite a bit. I've been working on this, and the problem is that the Core Data samples are quite simple. The minute you move to a complex model and you try to use the NSFetchedResultsController and its delegate, you bump into all sorts of problems with using multiple contexts.
I use one context to populate data from the web service in a background "block", and a second for the table view to use - you'll most likely end up using a table view for a master list and a detail view.
Brush up on using blocks in Cocoa if you want to keep your app responsive whilst receiving or sending data to/from a server.
You might want to read about 'transactions' - which is basically the grouping of multiple actions/changes as a single atomic action/change. This helps avoid partial saves that might result in inconsistent data on server.
Ultimately, this is a very big topic - especially if server data is shared across multiple clients. At the simplest, you would want to decide on basic policies. Does last save win? Is there some notion of remotely held locks on objects in server data store? How is conflict resolved, when two clients are, say, editing the same property of the same object?
With respect to how things are done on the iPhone, I would agree with occulus that "Done" provides a natural point for persisting changes to server (in a separate thread).
I am trying to write a Core Data application for the iPhone that uses an external data source. I'm not really using Core Data to persist my objects but rather for the object life-cycle management. I have a pretty good idea on how to use Core Data for local data, but have run into a few issues with remote data. I'll just use Flickr's API as an example.
The first thing is that if I need say, a list of the recent photos, I need to grab them from an external data source. After I've retrieved the list, it seems like I should iterate and create managed objects for each photo. At this point, I can continue in my code and use the standard Core Data API to set up a fetch request and retrieve a subset of photos about, say, dogs.
But what if I then want to continue and retrieve a list of the user's photos? Since there's a possibility that these two data sets might intersect, do I have to perform a fetch request on the existing data, update what's already there, and then insert the new objects?
--
In the older pattern, I would simply have separate data structures for each of these data sets and access them appropriately: a recentPhotos set and a usersPhotos set. But since the general pattern of Core Data seems to be to use one managed object context, it seems (I could be wrong) that I have to merge my data with the main pool of data. But that seems like a lot of overhead just to grab a list of photos. Should I create a separate managed object context for the different set? Should Core Data even be used here?
I think that what I find appealing about Core Data is that before (for a web service) I would make a request for certain data and either filter it in the request or filter it in code and produce a list I would use. With Core Data, I can just get a list of objects, add them to my pool (updating old objects as necessary), and then query against it. One problem I can see with this approach, however, is that if objects are externally deleted, I can't know, since I'm keeping my old data.
Am I way off base here? Are there any patterns people follow for dealing with remote data and Core Data? :) I've found a few posts of people saying they've done it, and that it works for them, but little in the way of examples. Thanks.
You might try a combination of two things. This strategy will give you an interface where you get the results of a NSFetchRequest twice: Once synchronously, and once again when data has been loaded from the network.
1. Create your own subclass of NSFetchRequest that takes an additional block property to execute when the fetch is finished. This is for your asynchronous request to the network. Let's call it FLRFetchRequest.
2. Create a class to which you pass this request. Let's call it FLRPhotoManager. FLRPhotoManager has a method executeFetchRequest: which takes an instance of the FLRFetchRequest and...
   - Queues your network request based on the fetch request and passes along the retained fetch request to be processed again when the network request is finished.
   - Executes the fetch request against your CoreData cache and immediately returns the results.
Now when the network request finishes, update your core data cache with the network data, run the fetch request again against the cache, and this time, pull the block from the FLRFetchRequest and pass the results of this fetch request into the block, completing the second phase.
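The two-phase flow described above (cached results immediately, refreshed results later through the block) can be sketched without any Core Data at all. The Python below is only a model of the control flow: `PhotoManager` is a hypothetical stand-in for FLRPhotoManager, the cache is a plain list, a predicate function plays the role of the fetch request, and the "network finished" moment is simulated by calling `network_finished()` by hand instead of running on a background thread.

```python
class PhotoManager:
    """Hypothetical stand-in for FLRPhotoManager; the cache is a plain list."""

    def __init__(self, cache, fetch_remote):
        self.cache = cache                # list of dicts standing in for the local store
        self.fetch_remote = fetch_remote  # callable returning fresh dicts "from the network"
        self.pending = []                 # queued (predicate, on_refresh) pairs

    def execute(self, predicate, on_refresh):
        """Phase 1: queue the network work and return cached matches right away."""
        self.pending.append((predicate, on_refresh))
        return [p for p in self.cache if predicate(p)]

    def network_finished(self):
        """Phase 2: merge remote data into the cache, then re-run every queued fetch."""
        by_id = {p["id"]: p for p in self.cache}
        for photo in self.fetch_remote():
            by_id[photo["id"]] = photo    # update existing entries, insert new ones
        self.cache[:] = by_id.values()
        pending, self.pending = self.pending, []
        for predicate, on_refresh in pending:
            on_refresh([p for p in self.cache if predicate(p)])
```

The caller gets something to display straight away and is called back once with the refreshed list, which is the same contract as the block property on the fetch request.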
This is the best pattern I have come up with, but like you, I'm interested in others' opinions.
It seems to me that your first instincts are right: you should use fetchrequests to update your existing store. The approach I used for an importer was the following: get a list of all the files that are eligible for importing and store it somewhere. I'm assuming here that getting that list is fast and lightweight (just a name and an url or unique id), but that really importing something will take a bit more time and effort and the user may quit the program or want to do something else before all the importing is done.
Then, on a separate background thread (this is not as hard as it sounds thanks to NSRunLoop and NSTimer; search for "Core Data: Efficiently Importing Data"), get the first item of that list, get the object from Flickr or wherever, and search for it in the Core Data database (carefully read Apple's Predicate Programming Guide on setting up efficient, cached NSFetchRequests). If the remote object already lives in Core Data, update the information as necessary; if not, insert it. When that is done, remove the item from the to-be-imported list and move on to the next one.
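As a rough illustration of that import loop, here is the update-or-insert logic in Python, with a dict standing in for the Core Data store and a deque for the to-be-imported list. `import_pending` and `fetch_item` are invented names, and the real version would run on a background thread with its own context.

```python
from collections import deque


def import_pending(queue, fetch_item, store):
    """Drain the to-be-imported queue, updating or inserting one item at a time.

    An item is removed from the queue only after it has been written, so an
    interrupted run can resume from whatever is still queued.
    """
    inserted, updated = 0, 0
    while queue:
        item_id = queue[0]              # peek; do not remove yet
        remote = fetch_item(item_id)    # the slow part in a real importer
        if item_id in store:
            store[item_id].update(remote)   # already known: refresh its fields
            updated += 1
        else:
            store[item_id] = dict(remote)   # new: insert a copy
            inserted += 1
        queue.popleft()                 # done, move on to the next one
    return inserted, updated
```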
As for the problem of objects that have been deleted in the remote store, there are two solutions: periodic syncing or lazy, on-demand syncing. Does importing a photo from Flickr mean importing the original thing and all its metadata (I don't know what the policy is regarding ownership etc) or do you just want to import a thumbnail and some info?
If you store everything locally, you could just run a check every few days or weeks to see if everything in your local store is present remotely as well: if not, the user may decide to keep the photo anyway or delete it.
If you only store thumbnails or previews, then you will need to connect to Flickr each time the user wants to see the full picture. If it has been deleted, you can then inform the user and delete it locally as well, or mark it as not being accessible any more.
For a situation like this you could use Cocoa's archiving facilities to save the photo objects (and an index) to disk between sessions, and just overwrite it all every time the app calls home to Flickr.
But since you're already using Core Data, and like the features it provides, why not modify your data model to include a "source" or "callType" attribute? At the moment you're implicitly creating a bunch of objects with source "Flickr API", but you can just as easily treat the different API calls as unique sources and then store that explicitly.
To handle deletion, the simplest way would be to clear the data store each time it's refreshed. Otherwise you'd need to iterate over everything and only delete the photo objects with filenames that weren't included in the new results.
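For the second option, the pruning step might look roughly like this in Python. It assumes each stored photo carries the "source" attribute suggested above, so a refresh of one API call never deletes photos that came from another; the function name and the dict layout are hypothetical.

```python
def prune_missing(store, source, fresh_filenames):
    """Delete stored photos from `source` whose filename is absent from the new results.

    `store` maps filename -> photo dict with a "source" key (assumed layout).
    Returns the filenames that were removed.
    """
    keep = set(fresh_filenames)
    removed = [
        name for name, photo in store.items()
        if photo["source"] == source and name not in keep
    ]
    for name in removed:
        del store[name]
    return removed
```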
I'm planning to do something similar to this myself so I hope this helps.
PS: If you're not storing the photo objects between sessions at all, you could just use two different contexts and query them separately. As long as they're never saved, and the central store doesn't have anything in it already, it would work just like you describe.