Multithreaded use of Core Data (NSOperationQueue and NSManagedObjectContext)

Multithreaded use of Core Data (NSOperationQueue and NSManagedObjectContext) - iphone

In Apple's Core Data documentation for Concurrency with Core Data, they list the preferred method for thread safety as using a seperate NSManagedObjectContext per thread, with a shared NSPersistentStoreCoordinator.
If I have a number of NSOperations running one after the other on an NSOperationQueue, will there be a large overhead creating the context with each task?
With the NSOperationQueue having a max concurrent operation count of 1, many of my operations will be using the same thread. Can I use the thread dictionary to create one NSManagedObjectContext per thread? If I do so, will I have problems cleaning up my contexts later?
What’s the correct way to use Core Data in this instance?

The correct way to use Core Data in this case is to create a separate NSManagedObjectContext per operation or to have a single context which you lock (via -[NSManagedObjectContext lock] before use and -[NSManagedObjectContext unlock] after use). The locked approach might make sense if the operations are serial and there are no other threads using the context.
Which approach to use is an empirical question that can't be asnwered without data. There are too many variables to have a general rule. Hard numbers from performance testing are the only way to make an informed decision.

Operations started using NSOperationQueue using a maximum concurrent operation count of 1 will not run all operations on the same thread. The operations will be executed one after the other, but a new thread will be created every time.
So creating objects in the thread dictionary will be of little use.

While this question is old, it's actually at the top of Google's search results on 'NSMangedObjectContext threading', so, I'll just drop in a new answer.
The new 'preferred' method is to use the initWithConcurrencyType: and tell the MOC whether it's a main thread MOC or a secondary thread moc. You can then use the new performBlock: and performBlockAndWait: methods on it and the MOC will take care of serializing operations on it's 'native' thread.
The issue then becomes how do you intelligently handle merging the data between the various MOCs your application may spawn, along with a thousand other details that make life 'fun' as a programmer.

Related

sqlite3 multithreading in objective c

I am running a background thread in my application with dispatch_async and some times my main thread and background thread both access the database same time and this was giving me a database error and I tried to solve it by using sqlite3_threadsafe() which was always returning 2 i.e, i cannot use the same database connection in two threads and I want it to return 1 can anyone help me in this regard

I think you're pursuing the wrong approach. SQLite isn't reliably threadsafe no matter how you compile it — see the FAQ. Even if SQLite is compiled for thread safety, the same database may not be used from multiple threads if any transactions are pending or any statements remain unfinalised.
The recommendations above to use Grand Central Dispatch to funnel all SQLite accesses into a serial dispatch queue are the right way to proceed in my opinion, though I'd be more likely to recommend that you create your own queue rather than use the main queue for the simple reason that you can reliably dispatch_sync when you want to wait on a result.

You can add all of your access statements to, therefore you are always on the main thread when access the sqlite store.
dispatch_async(dispatch_get_main_queue(), ^ {
//code goes here
});

iOS5 NSManagedObjectContext Concurrency types and how are they used?

Literature seems a bit sparse at the moment about the new NSManagedObjectContext concurrency types. Aside from the WWDC 2011 vids and some other info I picked up along the way, I'm still having a hard time grasping how each concurrency type is used. Below is how I'm interpreting each type. Please correct me if I'm understanding anything incorrectly.
NSConfinementConcurrencyType
This type has been the norm over the last few years. MOC's are shielded from each thread. So if thread A MOC wants to merge data from Thread B MOC via a save message, thread A would need to subscribe to thread B's MOC save notification.
NSPrivateQueueConcurrencyType
Each MOC tree (parent & children MOCs) share the same queue no matter what thread each is on. So whenever a save message from any of these contexts is sent, it is put in a private cue specifically made only for this MOC tree.
NSMainQueueConcurrencyType
Still confused by this one. From what I gather it's the like NSPrivateQueueConcurrencyType, only the private queue is run on the main thread. I read that this is beneficial for UI communications with the MOC, but why? Why would I choose this over NSPrivateQueueConcurrencyType? I'm assuming that since the NSMainQueueConcurrencyType is executed in the main thread, does this not allow for background processes? Isn't this the same as not using threads?

The queue concurrency types help you to manage mutlithreaded core data:
For both types, the actions only happen on the correct queue when you do them using one of the performBlock methods. i.e.
[context performBlock:^{
dataObject.title = #"Title";
[context save:nil]; // Do actual error handling here
}];
The private queue concurrency type does all it's work in a background thread. Great for processing or disk io.
The main queue type just does all it's actions on a UIThread. That's neccesary for when you need to
do things like bind a NSFetchedResultsController up to it, or any other ui related tasks that need to be interwoven with processing that context's objects.
The real fun comes when you combine them. Imagine having a parent context that does all io on a background thread that is a private queue context, and then you do all your ui work against a child context of the main queue type. Thats essentially what UIManagedDocument does. It lets you keep you UI queue free from the busywork that has to be done to manage data.

I think the answers are in the note :
Core Data Release Notes for Mac OS X Lion
http://developer.apple.com/library/mac/#releasenotes/DataManagement/RN-CoreData/_index.html
For NSPrivateQueueConcurrencyType, I think you are not right.
A child context created with this concurrency type will have its own queue.
The parent/child context is not entirely related to threading.
The parent/child seems to simplify communication between contexts.
I understand that you just have to save changes in the child contexts to bring them back in the parent context (I have not tested it yet).
Usually parent/child context pattern are related to main queue/background queue pattern but it is not mandatory.
[EDIT] It seems that access to the store (Save and Load) are done via the main context (in the main queue). So it is not a good solution to perform background fetches as the query behind executeFetchRequest will always be performed in the main queue.
For NSMainQueueConcurrencyType, it is the same as NSPrivateQueueConcurrencyType, but as it is related to main queue, I understand that you perform operation with the context without necesseraly using performBlock ; if you are in the context of the main queue, in View controller delegate code for example
(viewDidLoad, etc).

midas06 wrote:
Imagine having a parent context that does all io on a background
thread that is a private queue context, and then you do all your ui
work against a child context of the main queue type.
I understood it to be the other way around: you put the parent context on the main thread using NSMainQueueConcurrencyType and the child context on the background thread using NSPrivateQueyeConcurrencyType. Am I wrong?

Faster way than -forwardInvocation: to perform messages on a specific thread

To improve responsiveness, some synchronous methods that used FMDB to execute SQLite queries on the main thread were rewritten to be asynchronous, and run in the background via -performSelectorInBackground:withObject:. SQLite not being thread-safe, however, each of these methods would ultimately call -[FMDatabase open], decreasing overall performance.
So, I wrote a proxy for the FMDB classes, which overrode -forwardInvocation: to perform -[NSInvocation invokeWithTarget:] on one specific thread, via -performSelector:onThread:withObject:waitUntilDone:. This solved the problem of too many calls to -[FMDatabase open], but -forwardInvocation: itself is rather expensive.
Is there a good way to solve this performance issue without rewriting all of the code that calls FMDB methods?

You've found the problem: don't call -performSelectorInBackground:withObject:! There's no guarantee what it'll do, but it probably won't do the right thing.
If what you want is a single "database thread" for background ops, then there are several options:
Create a new database thread and run loop and use -performSelector:onThread:... instead.
Create an NSOperationQueue with maxConcurrentOperationCount=1 and use NSOperation (NSInvocationOperation, perhaps?), or a serial dispatch queue. This isn't quite right: the operations/blocks are not guaranteed to be executed on the same thread, which might break sqlite (IIRC you can only move a DB handle between threads after freeing all statements)
Use NSOperationQueue, but save a thread-local reference to the database in [[NSThread currentThread] threadDictionary]. This is a bit messy, since you have little control over when the database disappears. It also might violate the NSOperation contract (you're supposed to return the thread to its original state when the operation finishes).

Strategy for reading and writing to threaded data structure

I have a class that contains an NSDictionary and periodically, I have a thread writing data into this NSDictionary. Then at other times, I have another view controller reading data out of the class's NSDictionary.
What's the best objective-c way to make the data in this class thread-safe so that if you were to ask data for 'read', you're getting the correct version aka, the last written version and not the one that maybe getting written to currently?

As Carl mentioned, #synchronized is one option.
If you are targeting iOS 4.0+, another one is using a Grand Central Dispatch queue to regulate access to a shared data structure from multiple threads/queues. The WWDC 2010 Session 211 video has a good explanation of this technique.
In a nutshell: you create a custom GCD queue (dispatch_queue_create()) whose single responsibility is to regulate access to the shared data structure. All code that accesses the shared structure then must do so from inside this queue. Because queues only execute one block of code at a time, no two threads can access the data structure at the same time.

You are looking for #synchronize, I think.

Should an NSLock instance be "global"?

Should I make a single NSLock instance in the application delegate, to be used by all classes? Or is it advisable to have each class instantiate its own NSLock instance as needed?
Would the locking work in the second case, if I, for example, had access to a managed object context that is spread across two view controllers?

If multiple objects access your object only to read its contents, then you do not need a lock at all. If at least one of the objects accesses your object to write/update its contents, then it does not matter if the other objects access your object to read or write/update it: in this case you need a lock.
Now, in order to correctly protect your object (in a critical section of code where multiple objects may access it), you must use the SAME LOCK INSTANCE which must then be shared by ALL of the possible objects accessing the object you are willing to protect.
If your application need to protect an object that may be accessed simultaneously by the majority of the classes, then having a single lock instance is fine. If you want better performances (especially if the number of simultaneous accesses to your object is high), then you can have multiple locks. Each lock will be responsible for allowing/denying access to a specific attribute/field of your object. This way, several objects may access your object changing a different attribute/field simultaneously. You are basically incrementing the number of concurrent operations on your object. However, each lock MUST STILL be shared among the other objects that will access the object you are protecting.
Having a lock instance for each controller simply does NOT work; this will NOT protect your object from concurrent accesses from other objects in different threads. NSLock is implemented using POSIX pthread mutexes, so it must be used in exactly the same way. This is also clearly stated in the NSLock documentation:
Warning: The NSLock class uses POSIX threads to implement its locking behavior. When sending an unlock message to an NSLock object, you must be sure that message is sent from the same thread that sent the initial lock message. Unlocking a lock from a different thread can result in undefined behavior.
So, in order to preserve the critical section semantics, it is the same thread that acquired the lock that is responsible for releasing it when done. Note also that the locking mechanism is intended for fast operations only, i.e. you should acquire a lock only for a short period of time before releasing it. If you need to wait for an unpredictable amount of time, then you need a different synchronization mechanism, namely a condition variable which is available through the NSCondition class.
Hope this helps.

You should not use locks with Core Data. That documentation is probably out of date. Ideally you should have one context per thread and let the context handle the locking of its underlying NSPersistentStoreCoordinator. This is considered the only safe way to use Core Data in a multi-threaded application currently.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse