Core Data: fetch background and using objectWithID on main thread, performance benefit?

Core Data: fetch background and using objectWithID on main thread, performance benefit? - iphone

I read Core Data Programming Guide recently and Apple suggest us to do so
You fetch in one managed object context on a background thread, and pass the object IDs of the fetched
objects to another thread. In the second thread (typically the application's main thread, so that you can
then display the results), you use the second context to fault in objects with those object IDs (you use
objectWithID: to instantiate the object). (This technique is only useful if you are using an SQLite store, > since data from binary and XML stores is read into memory immediately on open.)
To my understanding, fetch on background context will not register the managed object in main thread context, so the managed object returned from objectWithID is most likely a fault. When we using it on main thread, we will trigger a new round of trip to the SQLite Store. So the UI maybe blocked.
Did I miss anything ? Are there a way to avoid I/O overhead on main thread ?

There's not much overheard when you do a fetch in the background and then use the objectID to do a fetch on the main thread. First, the record will be on the CoreData cache, which makes the same fetch on the main thread faster, and second, fetching with an objectID is much much much faster than fetching with your average fetch request. What you usually do is that you would create a background fetch request, find the objectID of the objects you are looking for, and move those objectIDs to the main thread. Of course, for the background thread, you have to use a different NSManagedObjectContext instance than the main thread one.
I would recommend you to check the WWDC 2010 video "Mastering Core Data". It goes into core data and multithreading, explaining the performance of caching and fetching on the background/main thread.

Related

Core Data - sharing NSManagedObjects among multiple threads

I suffered all the consequences of using a single MOC in multiple threads - my app crashes at random points because the MOC is created in the main thread and I also use it to fill the DB in another thread.
Though the code is synchronized (#synchronize) using a global singleton the app crashes.
I read that using a separate MOC for each thread will make things ok but I also read that it is considered also a bad approach to share NSManagedObjects across threads.
My use case is the following:
1)I load and parse XML from a server and during the parsing I insert each new NSManagedObject in the database. This all happens in a separate thread.
2)From the main thread the user interacts with the UI which reads data from the database.
In both threads I use NSManagedObjects. How would you suggest me to fix this? I failed multiple times already.
Most often the app creashed with error suggesting that I am modifying a collection while enumerating it which is not true as the code is synchronized and while I am iterating it no modifying happens and vice versa - while I modify it I don't iterate and I save once I am done.

Use one NSManagedObjectContext per thread. If you communicate between threads, pass the NSManagedObjectID, which is thread safe, and fetch the object again from you thread context. In my apps I sometimes even use one context per controller.
To manage the different contexts, register an Observer for the NSManagedObjectContextDidChangeNotification. Within this notification handling, you pass the notification to each of your contexts via the mergeChangesFromContextDidSaveNotification: method. This method is thread save and makes the context update its state.
After this you have to refresh your views. If you have a table view based application, have a look at NSFetchedResultsController. This helps you update the table automatically with appropriate animations. If you don't use table views, you have to implement the UI update yourself.

If you are only supporting iOS 5 and above you don't need to deal with NSManagedObjectID and merging contexts anymore. You can use the new concurrency types of NSManagedObjectContext instead. Then do your operations within managedObjectContext:performBlock and they will be merged automatically.
See the answer from svena here for more information:
Core Data and Concurrency using NSOperationQueues

iphone - Should I use NSOperationQueue and NSOperation instead of NSThread?

I am facing a design problem of my app.
Basically, the followings are what I am going to do in my app.
A single task is like this:
Read custom object from the underlying CoreData databse
Download a json from a url
Parse the json to update the custom object or create a new one (parsing may take 1 - 3 secs, big data)
Analyse the custom object (some calculations will be involved, may take 1 - 5 sec)
Save the custom object into CoreData database.
There may be a number of tasks being executed concurrently.
The steps within one task obviously are ordered (i.e., without step 2 downloading the json, step 3 cannot continue), but they also can be discrete. I mean, for example, task2's step 4 can be executed before task1's step 3 (if maybe task2's downloading is faster than task1's)
Tasks have priorities. User can start a task with higher priority so all the task's steps will be tried to be executed before all others.
I hope the UI can be responsive as much as possible.
So I was going to creating a NSThread with lowest priority.
I put a custom priority event queue in that thread. Every step of a task becomes an event (work unit). So, for example, step 1 downloading a json becomes an event. After downloading, the event generates another event for step 3 and be put into the queue. every event has its own priority set.
Now I see this article: Concurrency and Application Design. Apple suggests that we Move Away from Threads and use GCD or NSOperation.
I find that NSOperation match my draft design very much. But I have following questions:
In consideration of iPhone/iPad cpu cores, should I just use one NSOperationQueue or create multiple ones?
Will the NSOperationQueue or NSOperation be executed with lowest thread priority? Will the execution affect the UI response (I care because the steps involve computations)?
Can I generate a NSOpeartion from another one and put it to the queue? I don't see a queue property in NSOperation, how do I know the queue?
How do I cooperate NSOperationQueue with CoreData? Each time I access the CoreData, should I create a new context? Will that be expensive?
Each step of a task become a NSOperation, is this design correct?
Thanks

In consideration of iPhone/iPad cpu cores, should I just use one NSOperationQueue or create multiple ones?
Two (CPU, Network+I/O) or Three (CPU, Network, I/O) serial queues should work well for most cases, to keep the app responsive and your programs streaming work by what they are bound to. Of course, you may find another combination/formula works for your particular distribution of work.
Will the NSOperationQueue or NSOperation be executed with lowest thread priority? Will the execution affect the UI response (I care because the steps involve computations)?
Not by default. see -[NSOperation setThreadPriority:] if you want to reduce the priority.
Can I generate a NSOpeartion from another one and put it to the queue? I don't see a queue property in NSOperation, how do I know the queue?
Sure. If you use the serial approach I outlined, locating the correct queue is easy enough -- or you could use an ivar.
How do I cooperate NSOperationQueue with CoreData? Each time I access the CoreData, should I create a new context? Will that be expensive?
(no comment)
Each step of a task become a NSOperation, is this design correct?
Yes - dividing your queues to the resource it is bound to is a good idea.

By the looks, NSOperationQueue is what you're after. You can set the number of concurrent operations to be run at the same time. If using multiple NSOperation, they will all run at the same time ... unless you handle a queue on your own, which will be the same as using NSOperationQueue
Thread priority ... I'm not sure what you mean, but in iOS, the UI drawing, events and user interaction are all run on the main thread. If you are running things on the background thread, the interface will still be responsive, no matter how complicated or cpu-heavy operations you are running
Generating and handling of operations you should do it on the main thread, as it won't take any time, you just run them in a background thread so that your main thread doesn't get locked
CoreData, I haven't worked much with it specifically, but so far every Core~ I've worked with it works perfectly on background threads, so it shouldn't be a problem
As far as design goes, it's just a point of view ... As for me, I would've gone with having one NSOperation per task, and have it handle all the steps. Maybe write callbacks whenever a step is finished if you want to give some feedback or continue with another download or something

The affection of computation when multithreading is not going to be different just because you are using NSThread instead of NSOperation. However keep in mind that must current iOS devices are using dual core processors.
Some of the questions you have are not very specific. You may or may not want to use multiple NSOperationQueue. It all depends on how you want to approach it. if you have different NSOperation subclasses, or different NSBlockOperations, you can manage order of execution by using priorities, or you might want to have different queues for different types of operations (especially when working with serial queues). I personally prefer to use 1 operation queue when dealing with the same type of operation, and have a different operation queue when the operations are not related/dependable. This gives me the flexibility to cancel and stop the operations within a queue based on something happening (network dropping, app going to the background).
I have never found a good reason to add an operation based on something happening during the execution of a current operation. Should you need to do so, you can use NSOperationQueue's class method, currentQueue, which will give you the operation queue in which the current operation is operating.
If you are doing core data work using NSOperation, i would recommend to create a context for each particular operation. Make sure to initialize the context inside the main method, since this is where you are on the right thread of the NSOperation background execution.
You do not necessarily need to have one NSOperation object for each task. You can download the data and parse it inside the NSOperation. You can also do the data download abstractly and do the data manipulation of the content downloaded using the completion block property of NSOperation. This will allow you to use the same object to get the data, but have different data manipulation.
My recommendation would be to read the documentation for NSOperation, NSBlockOperation and NSOperationQueue. Check your current design to see how you can adapt these classes with your current project. I strongly suggest you to go the route of the NSOperation family instead of the NSThread family.
Good luck.

Just to add to #justin's answer
How do I cooperate NSOperationQueue with CoreData? Each time I access
the CoreData, should I create a new context? Will that be expensive?
You should be really careful when using NSOperation with Core Data.
What you always have to remember here is that if you want to run CoreData operations on a separate thread you have to create a new NSManagedObjectContext for that thread, and share the main's Managed Object Context persistant store coordinator (the "main" MOC is the one in the app delegate).
Also, it's very important that the new Managed Object Context for that thread is create from that thread.
So if you plan to use Core Data with NSOperation make sure you initialize the new MOC in NSOperation's main method instead of init.
Here's a really good article about Core Data and threading

Use GCD - its a much better framework than NS*
Keep all your CoreData access on one queue and dispatch_async at the end of your routines to save back to your CoreData database.
If you have a developer account, check this WWDC video out: https://developer.apple.com/videos/wwdc/2012/?id=712

Using fetch requests on background threads with Core Data

I want to use one ManagedObjectContext for the main thread and another, separate one for the background thread using NSOperation, as Apple suggests. And, each ManagedObjectContext shares the same persistent store.
Fetching could happen on the main thread because I use Core Data to populate a table view.
In the background, I need to access an NSManagedObject property that stores the name of an image. Then, the background thread will create and cache these images, which is the main reason for having the background thread.
Given such, is there any danger (like locking) if both threads attempt to access the persistent store because both could be fetching data from it the same time?

Each thread requires its own managed object context, but all threads need to share a single persistent store coordinator - that will take care of potential issues you're describing. See additional information in the Core Data Concurrency Programming Guide.

Two different MOCs can access the same PSC at the same time for reads.
However for writes, you need to lock and unlock your persistent store coordinator if there is a chance of concurrent writes.

As long as each thread uses its own NSManagedObjectContext, it's perfectly safe to have them share an NSPersistentStoreCoordinator. NSManagedObjectContext will deal with doing all the appropriate locking of the persistent store as needed. However, you do have to be careful not to share NSManagedObjects between threads.

iPhone programming - Background saving with core data

I am trying to save data into core data in the background thread, as it takes quite some time to save.
I did:
[self performSelectorInBackGround:#selector(insertRecord:) withObject:data];
When all works fine until the line in insertRecord Method hits contextsave:&error. Program received signal : "SIGABRT"
Am I doing anything wrong? it works ok when its in main thread, I just move the codes to another method and run it in background and it doesn't work anymore.

According to the "Concurrency with Core Data" section of Core Data Programming Guide:
The pattern recommended for concurrent programming with Core Data is
thread confinement: each thread must have its own entirely private
managed object context.
and
Using thread confinement, you should not pass managed objects or
managed object contexts between threads.
It looks like you're passing a managed object to the background thread, which is forbidden. I don't know if you're also trying to share your managed object context between threads.
That document describes a couple of workarounds for passing managed objects to other threads. You'll need to implement one of them.

The problem here is that managed object contexts aren't thread-safe. If your -insertRecord: method uses the main thread's managed object context, you're asking for trouble.
The blog Cocoa Is My Girlfriend has an article, Core Data and Threads, Without the Headache on this very topic and suggests some strategies for saving in the background. The basic idea is to make changes on a context that belongs to a background thread, and then merge the changes into the main thread's context. That gives you an up-to-date context that you can save in the background while still keeping the main thread's context current.

How to efficiently save changes made in UI/main thread with Core Data?

So, there have been several posts here about importing and saving data from an external data source into Core Data. Apple documents a reasonable pattern for this: "import and save on background thread, merge saved objects to main thread." All fine and good.
I have a related but different problem: the user is modifying data in the UI and main thread, and thus modifies state of some objects in the managed object context (MOC). I would like to save these changes from time to time. What is a good way to do that?
Now, you could say that I could do the same: create a background thread with its own MOC and pass the changed objectID-s there. The catch-22 for me with this is that an object's ID changes when it is saved, and I cannot guarantee the order of things happening. I may end up passing a different objectID into the background thread for the same object, based on whether the object has been previously saved or not, and I don't know if Core Data can resolve this and see that different objectID-s are pointing to the same object and not create duplicates for me. (I could test this, but I'm lazywebbing with this question first.)
One thought I had: I could always do MOC saves on a background thread, and queue them up with operationqueue, so that there is always only one save in progress. I would not create a new MOC, I would just use the same MOC as in main thread. Now, this is not thread safe and when someone modifies the MOC in main thread while it is being saved in background thread, the results will probably be catastrophic. But, minus the thread safety, you can see what kind of solution I'd wish for.
To be clear, the problem I need to fix is that if I just do the save in main thread, it blocks the UI for an unacceptably long period of time, I want to move the save to background thread.
So, questions:
what about the reasoning of an object ID changing during saving, and Core Data being able to resolve them to the same object? Would this be the right way of addressing this problem?
any other good ways of doing this?

Simply put, you can't move the save to the background thread. Changes are relative to the NSManagedObjectContext and therefore are invisible to a NSManagedObjectContext on another thread.
I would suggest profiling your saves to find out why they are taking so long. Perhaps make them more frequently or find out what else might be causing the performance issue.
You are using a SQLite store right?
UPDATE
If you are using Binary it is definitely going to be an issue as I believe I mentioned to you before. Binary must be loaded 100% into memory and therefore must also be written 100% out to disk.

It may not solve all of your troubles, but there is a method -[NSManagedObjectContext obtainPermanentIDsForObjects:error:] that you can use to obtain the permanent ID for a managed object prior to saving it to the store. So that should be helpful in any synchronization you end up doing.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse