I am facing a design problem of my app.
Basically, the followings are what I am going to do in my app.
A single task is like this:
Read custom object from the underlying CoreData databse
Download a json from a url
Parse the json to update the custom object or create a new one (parsing may take 1 - 3 secs, big data)
Analyse the custom object (some calculations will be involved, may take 1 - 5 sec)
Save the custom object into CoreData database.
There may be a number of tasks being executed concurrently.
The steps within one task obviously are ordered (i.e., without step 2 downloading the json, step 3 cannot continue), but they also can be discrete. I mean, for example, task2's step 4 can be executed before task1's step 3 (if maybe task2's downloading is faster than task1's)
Tasks have priorities. User can start a task with higher priority so all the task's steps will be tried to be executed before all others.
I hope the UI can be responsive as much as possible.
So I was going to creating a NSThread with lowest priority.
I put a custom priority event queue in that thread. Every step of a task becomes an event (work unit). So, for example, step 1 downloading a json becomes an event. After downloading, the event generates another event for step 3 and be put into the queue. every event has its own priority set.
Now I see this article: Concurrency and Application Design. Apple suggests that we Move Away from Threads and use GCD or NSOperation.
I find that NSOperation match my draft design very much. But I have following questions:
In consideration of iPhone/iPad cpu cores, should I just use one NSOperationQueue or create multiple ones?
Will the NSOperationQueue or NSOperation be executed with lowest thread priority? Will the execution affect the UI response (I care because the steps involve computations)?
Can I generate a NSOpeartion from another one and put it to the queue? I don't see a queue property in NSOperation, how do I know the queue?
How do I cooperate NSOperationQueue with CoreData? Each time I access the CoreData, should I create a new context? Will that be expensive?
Each step of a task become a NSOperation, is this design correct?
Thanks
In consideration of iPhone/iPad cpu cores, should I just use one NSOperationQueue or create multiple ones?
Two (CPU, Network+I/O) or Three (CPU, Network, I/O) serial queues should work well for most cases, to keep the app responsive and your programs streaming work by what they are bound to. Of course, you may find another combination/formula works for your particular distribution of work.
Will the NSOperationQueue or NSOperation be executed with lowest thread priority? Will the execution affect the UI response (I care because the steps involve computations)?
Not by default. see -[NSOperation setThreadPriority:] if you want to reduce the priority.
Can I generate a NSOpeartion from another one and put it to the queue? I don't see a queue property in NSOperation, how do I know the queue?
Sure. If you use the serial approach I outlined, locating the correct queue is easy enough -- or you could use an ivar.
How do I cooperate NSOperationQueue with CoreData? Each time I access the CoreData, should I create a new context? Will that be expensive?
(no comment)
Each step of a task become a NSOperation, is this design correct?
Yes - dividing your queues to the resource it is bound to is a good idea.
By the looks, NSOperationQueue is what you're after. You can set the number of concurrent operations to be run at the same time. If using multiple NSOperation, they will all run at the same time ... unless you handle a queue on your own, which will be the same as using NSOperationQueue
Thread priority ... I'm not sure what you mean, but in iOS, the UI drawing, events and user interaction are all run on the main thread. If you are running things on the background thread, the interface will still be responsive, no matter how complicated or cpu-heavy operations you are running
Generating and handling of operations you should do it on the main thread, as it won't take any time, you just run them in a background thread so that your main thread doesn't get locked
CoreData, I haven't worked much with it specifically, but so far every Core~ I've worked with it works perfectly on background threads, so it shouldn't be a problem
As far as design goes, it's just a point of view ... As for me, I would've gone with having one NSOperation per task, and have it handle all the steps. Maybe write callbacks whenever a step is finished if you want to give some feedback or continue with another download or something
The affection of computation when multithreading is not going to be different just because you are using NSThread instead of NSOperation. However keep in mind that must current iOS devices are using dual core processors.
Some of the questions you have are not very specific. You may or may not want to use multiple NSOperationQueue. It all depends on how you want to approach it. if you have different NSOperation subclasses, or different NSBlockOperations, you can manage order of execution by using priorities, or you might want to have different queues for different types of operations (especially when working with serial queues). I personally prefer to use 1 operation queue when dealing with the same type of operation, and have a different operation queue when the operations are not related/dependable. This gives me the flexibility to cancel and stop the operations within a queue based on something happening (network dropping, app going to the background).
I have never found a good reason to add an operation based on something happening during the execution of a current operation. Should you need to do so, you can use NSOperationQueue's class method, currentQueue, which will give you the operation queue in which the current operation is operating.
If you are doing core data work using NSOperation, i would recommend to create a context for each particular operation. Make sure to initialize the context inside the main method, since this is where you are on the right thread of the NSOperation background execution.
You do not necessarily need to have one NSOperation object for each task. You can download the data and parse it inside the NSOperation. You can also do the data download abstractly and do the data manipulation of the content downloaded using the completion block property of NSOperation. This will allow you to use the same object to get the data, but have different data manipulation.
My recommendation would be to read the documentation for NSOperation, NSBlockOperation and NSOperationQueue. Check your current design to see how you can adapt these classes with your current project. I strongly suggest you to go the route of the NSOperation family instead of the NSThread family.
Good luck.
Just to add to #justin's answer
How do I cooperate NSOperationQueue with CoreData? Each time I access
the CoreData, should I create a new context? Will that be expensive?
You should be really careful when using NSOperation with Core Data.
What you always have to remember here is that if you want to run CoreData operations on a separate thread you have to create a new NSManagedObjectContext for that thread, and share the main's Managed Object Context persistant store coordinator (the "main" MOC is the one in the app delegate).
Also, it's very important that the new Managed Object Context for that thread is create from that thread.
So if you plan to use Core Data with NSOperation make sure you initialize the new MOC in NSOperation's main method instead of init.
Here's a really good article about Core Data and threading
Use GCD - its a much better framework than NS*
Keep all your CoreData access on one queue and dispatch_async at the end of your routines to save back to your CoreData database.
If you have a developer account, check this WWDC video out: https://developer.apple.com/videos/wwdc/2012/?id=712
So I know that NSManagedObjects are not thread safe and managedObjectIDs are, and we need a separate managedObjectContext per thread. But recently I had an issue when I was doing some core data changes in the background (had a separate runloop thread for this) and performSelectorOnThread: method sometimes was simply not invoked on this runloop thread. It turned out that the reason was that I was doing
[someObject.managedObjectContext save:&error]
on this runloop thread and "someObject" was created on the main thread. But it would only "hung" runloop thread once in a while. So the question is what really happens if you try to save context in a different thread. I'm just looking for a deeper understanding, thanks.
From https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/CoreDataFramework/Classes/NSManagedObjectContext_Class/NSManagedObjectContext.html :
Core Data uses thread (or serialized queue) confinement to protect
managed objects and managed object contexts (see “Concurrency with
Core Data”). A consequence of this is that a context assumes the
default owner is the thread or queue that allocated it—this is
determined by the thread that calls its init method. You should not,
therefore, initialize a context on one thread then pass it to a
different thread. Instead, you should pass a reference to a persistent
store coordinator and have the receiving thread/queue create a new
context derived from that.
You will crash. Maybe it will work sometimes and you won't see a crash while debugging, but you should never do this. Object contexts and the managed objects in them should only be used on the thread on which they were created. Apple's documentation is very clear about this and gives many examples of how to handle situations where you may have long running operations (slow fetches or async saves). You should read the docs that talk about threading with Core Data for more info.
Is there a way to save my NSManagedObjectContext in the background, off of the main thread? The save slows down how the app is performing, as it routinely takes about 2 seconds.
Yes, there is.
Apple recommends using one context per thread to achieve that.
You can also use GCD for that, but you need to make sure that queues do not share context and you will also need to pass object ID, not the objects themselves between the queues or threads.
See this blog entry for detailed instructions: http://www.cimgf.com/2011/05/04/core-data-and-threads-without-the-headache/
I want to make a class which as long as an instance of it is alive, keeps a thread (worker) going and when someone calls a method on it - performTaskWithData:(NSData*)data - then it should process this data in its worker thread.
If additional data is sent while an operation is taking place, then this new data/operation should be queued until the previous processing is done.
I need each instance of this helper class to hold one single worker thread (i.e. the same thread should handle all the processing).
How should I go about doing this?
NSRunLoop? Synchronize access to data block being passed?
Starting in iOS4, Grand Central Dispatch provides by far the simplest and most powerful interface to multithreaded programming.
If you're a registered developer, go watch some of the WWDC videos from 2010 about it. It's intimidating at first, but it's actually really simple and good.
You can do this directly with NSThreads and run loops. However, I would consider using NSOperationQueues, one per instance of your class and set the maximum concurrency of the queue to 1. Your performTaskWithData: would simply add a new instance of a subclass of NSOperation to the queue and that's it.
I have some simple doubts about NSOperation and GCD that I have not found answer to on the documentation.
The firs question is related to memory management:
I want to know if I need to create an Autorealease pool for the methods I wil add to the NSOperationQueue; similarly to when you run a method on different thread without NSOperations.
The next question is whether NSOperation takes care of GCD or if this needs to be done manually?
Thank you for your help!
I just saw your question here and there is a post on the apple dev forums you might be interested in. According to one of the apple guys on this thread so long as you run your NSOperation through an NSOperationQueue you do not need to create your own autorelease pool as the NSOperationQueue does it for you.
Also the docs for NSOperationQueue apparently need to be updated/corrected. On devices running iOS 4 or later NSOperationQueue does use GCD despite what the class reference documents say.
According to the documentation, you should create an NSAutoreleasePool in the main method of your NSOperation. The documentation for NSInvocationOperation and NSBlockOperation doesn't specify whether they create an autorelease pool for you, so to be safe it would be best to create one when using those classes too.
The NSOperationQueue handles queuing and executing the operations, so you shouldn't have to mess with GCD yourself for tasks related to the operation queue.