CoreData - Fetch NSManagedObject with perform and background thread - swift

I'm developing an SDK that uses a single NSManagedObjectContext of type privateQueueConcurrencyType.
To fetch objects, I use perform() and then pass the results to a closure.
I call this method from a background thread and also use the result on a background thread (which might be different from the one that called it).
I know that passing objects between threads is a no-go, but I'm not satisfied with the way I handle it today.
The way I handle it is that every NSManagedObject is mapped to a "normal" Swift object, and then I use the Swift object.
For example:
For each NSManagedObject in the results, I create a new object (which is not an NSManagedObject) and then I use these objects.
I would like to use the NSManagedObjects instead of creating new ones that hold similar data.
What's the best approach to do it?
Can I still use the NSManagedObject?
func getRecordsByPredicate<T: NSManagedObject>(type: T.Type, predicate: NSPredicate, success: @escaping (_ record: [T]?) -> Void, failure: @escaping () -> Void) {
    self.context.perform {
        let fetchRequest = NSFetchRequest<NSFetchRequestResult>(entityName: String(describing: type.self))
        fetchRequest.includesPropertyValues = false
        fetchRequest.predicate = predicate
        do {
            // Inside an escaping closure, the context must be referenced through self.
            let results = try self.context.fetch(fetchRequest)
            success(results as? [T])
        } catch {
            print(error.localizedDescription)
            failure()
        }
    }
}

Providing an API with CoreData involved is difficult, at best.
You cannot expose only the managed object in the API, since managed objects are tied to a specific thread or dispatch queue that is private to your library. You would have to require the client to pass a managed object context as well, which defines the execution context where the client will use the managed object.
If your internal MOC and the client's MOC are not the same, the API inevitably becomes asynchronous, or it will block a thread.
You could require that this API be used on the main thread only, with your library taking care to use the same MOC. This of course has a couple of drawbacks; possibly making the API asynchronous is only one of them.
Since you also cannot force developers to read your documentation, the first developer using your API will likely not call it from the main thread. ;)
Another alternative would be to let the client pass a closure to the API, which your library then calls on the right execution context. This also makes the API asynchronous, and it also requires the developer to have a deep understanding of CoreData, since she receives CoreData managed objects.
Your first approach using "Swift" values is probably the best way to handle this. Make CoreData an "implementation detail" of your library and save the developers the hassles involved in using CoreData.
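As a minimal sketch of the "Swift values" approach, the following assumes a hypothetical Person entity with a single name attribute, built in code against an in-memory store so the example is self-contained. PersonValue, PersonStore and makeContainer are illustrative names, not part of the original SDK. The attributes are copied out while still on the context's queue, so no managed object ever reaches the client.

```swift
import CoreData

// Hypothetical value type that the SDK hands to clients instead of NSManagedObject.
struct PersonValue: Equatable {
    let name: String
}

// Builds a one-entity model in code so the sketch needs no .xcdatamodeld file.
func makeContainer() -> NSPersistentContainer {
    let attribute = NSAttributeDescription()
    attribute.name = "name"
    attribute.attributeType = .stringAttributeType
    let entity = NSEntityDescription()
    entity.name = "Person"
    entity.properties = [attribute]
    let model = NSManagedObjectModel()
    model.entities = [entity]
    let container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
    let store = NSPersistentStoreDescription()
    store.type = NSInMemoryStoreType
    container.persistentStoreDescriptions = [store]
    container.loadPersistentStores { _, error in precondition(error == nil) }
    return container
}

final class PersonStore {
    // The only context, private to the SDK.
    private let context: NSManagedObjectContext

    init(container: NSPersistentContainer) {
        context = container.newBackgroundContext()
    }

    // Synchronous insert, present only to give the sketch some data.
    func insert(name: String) {
        context.performAndWait {
            let person = NSEntityDescription.insertNewObject(forEntityName: "Person", into: context)
            person.setValue(name, forKey: "name")
            try? context.save()
        }
    }

    // Fetches on the context's queue and delivers plain values to the caller.
    func people(matching predicate: NSPredicate? = nil,
                completion: @escaping (Result<[PersonValue], Error>) -> Void) {
        context.perform {
            let request = NSFetchRequest<NSManagedObject>(entityName: "Person")
            request.predicate = predicate
            do {
                // Copy the attributes out while still on the context's queue.
                let values = try self.context.fetch(request).map {
                    PersonValue(name: $0.value(forKey: "name") as? String ?? "")
                }
                completion(.success(values))
            } catch {
                completion(.failure(error))
            }
        }
    }
}
```

The completion closure runs on the context's private queue; because it only receives value types, the client is free to hop to any other thread afterwards.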

Related

Core Data - Generic Multithreading Violation

I'm running this code on app launch and it crashes with:
#0 0x00000001e4133088 in +[NSManagedObjectContext Multithreading_Violation_AllThatIsLeftToUsIsHonor] ()
There are other SO questions alluding to this but not in the sense of what I'm working with. I have a method that returns entities generically. The generic part works fine, the threading of my setup is somehow off. Here is the bare bones of my Core Data setup:
class Manager {
    static let shared: NSPersistentContainer = {
        /// sets up and returns the container
    }()
    static func fetch<T: NSManagedObject>() throws -> [T] {
        let request = NSFetchRequest<T>(entityName: String(describing: T.self))
        do {
            return try shared.viewContext.fetch(request) as [T] // <---- Crash line
        } catch {
            throw error
        }
    }
}
// ...
func fetchFoos() {
    do {
        let foos: [Foo] = try Manager.fetch()
    } catch {
        print(error)
    }
}
The weird thing is that this only happens when I run the app on a device from Xcode. If I disconnect and just open the app without Xcode, it launched fine.
What stands out here that needs rearchitecting? I can't use iOS 15's perform or performAndWait due to supporting earlier OSs.
This code would be OK if you only ever used it on the main thread. Since you're getting this specific crash when using viewContext, you seem to be calling this code from some other thread. That's not allowed with Core Data; managed object contexts are not thread safe, nor are managed objects.
A quick-but-probably-inadequate fix would be to surround the fetch call with either perform or performAndWait (both part of NSManagedObjectContext). The closure-based versions of these methods have been available since iOS 5; only the async/await variants require iOS 15. That would make it safe to use viewContext on a different thread. But the objects you fetch would still be tied to the main thread, so you couldn't use them on the current thread without using perform or performAndWait again.
A more thorough fix would be to call newBackgroundContext() to create a new context for the current thread, and use that to fetch objects. But the fetched objects could only be used on that thread. You could also use performBackgroundTask to run a closure with a temporary context, as long as you only use the fetched objects in that closure. (These methods are part of NSPersistentContainer.) Both of these fixes will probably require other changes to your code, and you'll have to consider where and when you use the fetch results to figure out what those changes are.
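The quick fix could look roughly like the sketch below. It assumes a hypothetical helper, fetchValues, that runs the fetch inside performAndWait and converts each managed object to a plain value before leaving the context's queue, so nothing tied to the context escapes. The in-memory Person model built by makeContainer is likewise only there to keep the example self-contained.

```swift
import CoreData

// Runs the fetch on the context's own queue and maps the results to values
// there, so no managed object is used outside that queue.
func fetchValues<Object: NSManagedObject, Value>(
    _ request: NSFetchRequest<Object>,
    in context: NSManagedObjectContext,
    transform: (Object) -> Value
) throws -> [Value] {
    var values: [Value] = []
    var failure: Error?
    context.performAndWait {
        do {
            values = try context.fetch(request).map(transform)
        } catch {
            failure = error
        }
    }
    if let failure = failure { throw failure }
    return values
}

// Hypothetical one-entity model, built in code for the sake of the sketch.
func makeContainer() -> NSPersistentContainer {
    let attribute = NSAttributeDescription()
    attribute.name = "name"
    attribute.attributeType = .stringAttributeType
    let entity = NSEntityDescription()
    entity.name = "Person"
    entity.properties = [attribute]
    let model = NSManagedObjectModel()
    model.entities = [entity]
    let container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
    let store = NSPersistentStoreDescription()
    store.type = NSInMemoryStoreType
    container.persistentStoreDescriptions = [store]
    container.loadPersistentStores { _, error in precondition(error == nil) }
    return container
}
```

One caveat with this design: performAndWait on viewContext, called from a background thread, blocks until the main queue can run the block. If the main thread is itself waiting on that background thread, the two will deadlock, which is one reason a background context is usually the more thorough fix.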

Is it thread safe to get context of NSManagedObject instance?

I.e. to access the managedObjectContext property of NSManagedObject from other thread? For example:
class StoredObject: NSManagedObject {
    @NSManaged public var interestProperty: String
}
------- somewhere on background -------
let context = storedObject.managedObjectContext // is it safe?
context?.perform { [storedObject] in
    // do something with interestProperty
}
---------------------------------------
NSManagedObjectContext is not thread safe. Even if you grab the instance of such an object, using it on a different thread might lead to undefined behaviour.
This is specified in the Apple documentation (emphasis mine):
Core Data is designed to work in a multithreaded environment. However, not every object under the Core Data framework is thread safe. To use Core Data in a multithreaded environment, ensure that:
Managed object contexts are bound to the thread (queue) that they are associated with upon initialization.
Managed objects retrieved from a context are bound to the same queue that the context is bound to.
So while reading the managedObjectContext property might be thread safe, as that property is readonly, you will not be able to use it without risking race conditions. And you also need to take into consideration the lifetime of the managed object, as unless properly retained, you might end up asking a deallocated managed object for its context.
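One way to sidestep the question, sketched here under the assumption that you can capture the context and the object's NSManagedObjectID while still on the context's queue: object IDs can be passed between threads, and perform can be called from any thread. The helper readName and the in-memory Person model are hypothetical and exist only for illustration.

```swift
import CoreData

// Looks the object up again by ID on the context's own queue, then reads
// its "name" attribute there. Neither the object nor its properties are
// touched from the calling thread.
func readName(of objectID: NSManagedObjectID,
              in context: NSManagedObjectContext,
              completion: @escaping (String?) -> Void) {
    context.perform {
        let object = try? context.existingObject(with: objectID)
        completion(object?.value(forKey: "name") as? String)
    }
}

// Hypothetical one-entity model, built in code for the sake of the sketch.
func makeContainer() -> NSPersistentContainer {
    let attribute = NSAttributeDescription()
    attribute.name = "name"
    attribute.attributeType = .stringAttributeType
    let entity = NSEntityDescription()
    entity.name = "Person"
    entity.properties = [attribute]
    let model = NSManagedObjectModel()
    model.entities = [entity]
    let container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
    let store = NSPersistentStoreDescription()
    store.type = NSInMemoryStoreType
    container.persistentStoreDescriptions = [store]
    container.loadPersistentStores { _, error in precondition(error == nil) }
    return container
}
```

Because the caller holds a strong reference to the context and an ID rather than the managed object itself, the lifetime concern about asking a deallocated object for its context does not arise in this variant.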

Do you need to use, and can you use, perform() and performAndWait() inside performBackgroundTask()?

If I am performing CoreData operations (delete local persistent data, fetch new data from online, save to persistent store) inside a storeContainer.performBackgroundTask() { context in ... } block,
1) Do I NEED to use context.perform() { } inside this to ensure it is thread safe?
2) CAN I use context.performAndWait() { } for part or all of the function inside the curly brackets if I wish to ensure, for example, deletion occurs before downloading and re-saving?
I'm having user crashes associated with CoreData saving which don't appear on testing. I suspect I am failing to understand something about CoreData. I haven't managed to find the answer to this question elsewhere in tutorials or StackOverflow despite searching for ages!
The main job of performBackgroundTask is to create an appropriate background context and run your block on that context's private queue. You don't need to call perform again to switch to the private queue.
performAndWait is useful whenever you are on the main queue but the context is private, and you need the database update to finish before moving on (and in similar cases). You don't need to call performAndWait inside perform, because code inside perform already executes serially. There is no harm in using it, though.
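A minimal sketch of the delete-then-save sequence from the question, assuming a hypothetical Person entity and a hypothetical replaceAll helper (the download step is replaced by a plain array of names). The statements inside the block run in order on the context's queue, so the deletion is finished before the inserts start, without any nested perform or performAndWait.

```swift
import CoreData

// Deletes every Person, inserts the new ones, then saves, all inside one
// performBackgroundTask block and therefore strictly in this order.
func replaceAll(with names: [String],
                in container: NSPersistentContainer,
                completion: @escaping (Result<Int, Error>) -> Void) {
    container.performBackgroundTask { context in
        // Already on the context's private queue here: no extra perform needed.
        do {
            let request = NSFetchRequest<NSManagedObject>(entityName: "Person")
            for object in try context.fetch(request) {
                context.delete(object)
            }
            for name in names {
                let person = NSEntityDescription.insertNewObject(forEntityName: "Person", into: context)
                person.setValue(name, forKey: "name")
            }
            try context.save()
            completion(.success(names.count))
        } catch {
            completion(.failure(error))
        }
    }
}

// Hypothetical one-entity model, built in code for the sake of the sketch.
func makeContainer() -> NSPersistentContainer {
    let attribute = NSAttributeDescription()
    attribute.name = "name"
    attribute.attributeType = .stringAttributeType
    let entity = NSEntityDescription()
    entity.name = "Person"
    entity.properties = [attribute]
    let model = NSManagedObjectModel()
    model.entities = [entity]
    let container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
    let store = NSPersistentStoreDescription()
    store.type = NSInMemoryStoreType
    container.persistentStoreDescriptions = [store]
    container.loadPersistentStores { _, error in precondition(error == nil) }
    return container
}
```

If the real download is asynchronous, the context should not be used from the download's completion handler directly; one option is to finish the download first and only then start the background task with the received data.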

Safely locking variable in Swift 3 using GCD

How can I lock a variable and prevent different threads from changing it at the same time, which leads to errors?
I tried using
func lock(obj: AnyObject, blk:() -> ()) {
    objc_sync_enter(obj)
    blk()
    objc_sync_exit(obj)
}
but I still have multithreading issues.
Shared Value
If you have a shared value that you want to access in a thread safe way like this
var list:[Int] = []
DispatchQueue
You can create your own serial DispatchQueue.
let serialQueue = DispatchQueue(label: "SerialQueue")
Dispatch Sync
Now different threads can safely access list; you just need to put the code in a closure dispatched to your serial queue.
serialQueue.sync {
    // update list <---
}
// This line will always run AFTER the closure on the previous line 👆👆👆
Since the serial queue executes closures one at a time, access to list will be safe.
Please note that the previous code will block the current thread until the closure has been executed.
Dispatch Async
If you don't want to block the current thread until the closure is processed by the serial queue, you can dispatch the closure asynchronously
serialQueue.async {
    // update list <---
}
// This line can run BEFORE the closure on the previous line 👆👆👆
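Putting the pieces above together, a small wrapper might look like the following sketch. SynchronizedList is a hypothetical name; the point is that the array is private and every read and write goes through the same serial queue.

```swift
import Dispatch

// Hypothetical wrapper: the array is only ever touched on `queue`.
final class SynchronizedList {
    private var storage: [Int] = []
    private let queue = DispatchQueue(label: "SynchronizedList.queue")

    // Blocks the caller until the value has been appended.
    func append(_ value: Int) {
        queue.sync { storage.append(value) }
    }

    // Returns a copy of the current contents.
    var snapshot: [Int] {
        queue.sync { storage }
    }
}

// Usage: many threads appending at once.
let list = SynchronizedList()
DispatchQueue.concurrentPerform(iterations: 100) { index in
    list.append(index)
}
print(list.snapshot.count)
```

Never call append or snapshot from a closure that is already running on the same serial queue: a nested sync on the queue you are on deadlocks.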
Swift's concurrency support isn't there yet. It sounds like it might be developed a bit in Swift 5. An excellent article is Matt Gallagher's Mutexes and Closure Capture in Swift, which looks at various solutions but recommends pthread_mutex_t. The choice of approach depends on other aspects of what you're writing - there's much to consider with threading.
Could you provide a specific simple example that's failing you?
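For comparison, a pthread_mutex_t-based lock could be sketched as below. This is not the code from the article mentioned above, only a minimal illustration; Mutex and Counter are hypothetical names. The mutex is heap-allocated so that it has a stable address, and defer guarantees the unlock even if the closure throws.

```swift
import Foundation

// Hypothetical wrapper around a heap-allocated pthread mutex.
final class Mutex {
    private let mutex: UnsafeMutablePointer<pthread_mutex_t>

    init() {
        mutex = .allocate(capacity: 1)
        mutex.initialize(to: pthread_mutex_t())
        pthread_mutex_init(mutex, nil)
    }

    deinit {
        pthread_mutex_destroy(mutex)
        mutex.deinitialize(count: 1)
        mutex.deallocate()
    }

    // Runs `body` while holding the lock; unlocks on every exit path.
    func withLock<T>(_ body: () throws -> T) rethrows -> T {
        pthread_mutex_lock(mutex)
        defer { pthread_mutex_unlock(mutex) }
        return try body()
    }
}

// Hypothetical shared state protected by the mutex.
final class Counter {
    private let mutex = Mutex()
    private var value = 0

    func increment() {
        mutex.withLock { value += 1 }
    }

    var current: Int {
        mutex.withLock { value }
    }
}
```

Whichever primitive is used, the lock only helps if every access to the shared variable goes through it, which is why the sketch keeps value private.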

Is there any point in querying realm on a background thread and resolving a ThreadSafeReference on the UI thread?

It appears that ThreadSafeReference was added recently to help move objects across thread boundaries. Previously, according to the sources I read (which were probably not exhaustive), the recommendation was to just query Realm on the thread you intend to use the results on, which effectively means querying it on the UI thread.
Is there a benefit to querying Realm on a background thread or does resolving the ThreadSafeReference basically run the query again?
Using RxSwift here's an example of this:
import RxSwift
import RealmSwift
public static func getAllMyModels() -> Observable<Results<MyModel>> {
    return Observable<ThreadSafeReference<Results<MyModel>>>.create { observer in
        // using this queue in this example only
        DispatchQueue.global(qos: .default).async {
            let realm = try! Realm()
            let models = realm.objects(MyModel.self)
            let safe = ThreadSafeReference(to: models)
            observer.onNext(safe)
            observer.onCompleted()
        }
        return Disposables.create()
    }
    .observeOn(MainScheduler.instance) // push us back to the UI thread to resolve the reference
    .map { safeValue in
        let realm = try! Realm()
        let value = realm.resolve(safeValue)!
        return value
    }
    .shareReplayLatestWhileConnected()
}
Did I gain anything by querying on some background thread and resolving on the UI thread?
Seems unnecessary. According to the docs, queries are already being done on a background thread, as long as you have attached a notification block:
Once the query has been executed, or a notification block has been added, the Results is kept up to date with changes made in the Realm, with the query execution performed on a background thread when possible.
- https://realm.io/docs/swift/latest/#queries
ast's guidance is correct, but I dug a little more and wanted to post some extra to confirm his answer further.
kishikawa-katsumi, currently a software engineer at Realm, provided this response to the question in Realm's public slack (https://realm-public.slack.com/archives/general/p1488960777001796):
For querying, it is fast enough on the UI thread in most cases. If you're facing a few slow, complex queries, you can use a background query.
To execute queries in the background, use addNotificationBlock().
notificationToken = realm
    .objects(...)
    .filter(...)
    .addNotificationBlock { (changes) in
        // The query is executed in background.
        // When the query is completed, then call this block
        ...
    }
Using addNotificationBlock(), the query is executed in the background, and when the query is completed, the callback closure is called.
So ThreadSafeReference is rarely used in queries. ThreadSafeReference is used when you want to pass an object to another thread (for example, to specify it as a condition of a query or to use it as a parameter of an API request).
Additional information about subscribing to this block from a GCD thread (background thread) can be found here, as it requires a runloop.
https://stackoverflow.com/a/41841847/1060314