Core Data - Generic Multithreading Violation - swift

I'm running this code on app launch and it crashes with:
#0 0x00000001e4133088 in +[NSManagedObjectContext Multithreading_Violation_AllThatIsLeftToUsIsHonor] ()
There are other SO questions alluding to this but not in the sense of what I'm working with. I have a method that returns entities generically. The generic part works fine, the threading of my setup is somehow off. Here is the bare bones of my Core Data setup:
class Manager {
static let shared: NSPersistentContainer = {
/// sets up and returns the container
}()
static func fetch<T: NSManagedObject>() throws -> [T] {
let request = NSFetchRequest<T>(entityName: String(describing: T.self))
do {
return try shared.viewContext.fetch(request) as [T] // <---- Crash line
} catch {
throw error
}
}
}
// ...
func fetchFoos() {
do {
let foos: [Foo] = try Manager.fetch()
// use foos
} catch {
print("Fetch failed: \(error)")
}
}
The weird thing is that this only happens when I run the app on a device from Xcode. If I disconnect and just open the app without Xcode, it launches fine.
What stands out here that needs rearchitecting? I can't use iOS 15's perform or performAndWait due to supporting earlier OSs.

This code would be OK if you only ever use it on the main thread. Since you're getting this specific crash when using viewContext, you seem to be calling this code from some other thread. That's not allowed with Core Data; managed object contexts are not thread safe, nor are managed objects.
A quick-but-probably-inadequate fix would be to surround the fetch call with either perform or performAndWait (both part of NSManagedObjectContext). That would make it safe to use viewContext on a different thread. But the objects you fetch would still be tied to the main thread, so you couldn't use them on the current thread without using perform or performAndWait again.
A more thorough fix would be to call newBackgroundContext() to create a new context for the current thread, and use that to fetch objects. But the fetched objects could only be used on that thread. You could also use performBackgroundTask to run a closure with a temporary context, as long as you only use the fetched objects in that closure. (These methods are part of NSPersistentContainer.) Both of these fixes will probably require other changes to your code, and you'll have to consider where and when you use the fetch results to figure out what those changes are.
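As a rough sketch of the perform-based fix, the helper below wraps the fetch in performAndWait and converts each managed object to a plain value while still inside the block, so nothing tied to the context leaks out. The name fetchValues and its transform parameter are hypothetical, not part of the question's code. The closure-based perform and performAndWait have been available since iOS 5; only the async/await variants require iOS 15.

```swift
import CoreData

/// Hypothetical helper: fetches on `context`'s own queue and maps every
/// managed object to a plain value before leaving the block.
func fetchValues<T: NSManagedObject, Value>(
    _ type: T.Type,
    in context: NSManagedObjectContext,
    transform: (T) -> Value
) throws -> [Value] {
    // Same entity-name convention as the question: class name == entity name.
    let request = NSFetchRequest<T>(entityName: String(describing: type))
    var outcome: Result<[Value], Error> = .success([])

    // performAndWait blocks the caller until the context's queue has run the block.
    context.performAndWait {
        outcome = Result { try context.fetch(request).map(transform) }
    }
    return try outcome.get()
}
```

Called as, say, `try fetchValues(Foo.self, in: Manager.shared.viewContext) { $0.objectID }`, this can be used from any thread, at the cost of blocking the caller while the context's queue does the work.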

Related

The callback inside Task is automatically called on the main thread

Once upon a time, before async/await came along, we used to make a simple request to the server with a URLSession dataTask. The callback was not automatically called on the main thread, so we had to dispatch manually to the main thread in order to perform UI work. Example:
DispatchQueue.main.async {
// UI work
}
Omitting this can crash the app, since we would be updating the UI on a queue other than the main one.
Now with Async/Await things got easier. We still have to dispatch to the main queue using MainActor.
await MainActor.run {
// UI work
}
The weird thing is that even when I don't use the MainActor the code inside my Task seems to run on the main thread and updating the UI seems to be safe.
Task {
let api = API(apiConfig: apiConfig)
do {
let posts = try await api.getPosts() // Checked this and the code of getPosts is running on another thread.
self.posts = posts
self.tableView.reloadData()
print(Thread.current.description)
} catch {
// Handle error
}
}
I was expecting my code to crash, since I am updating the table view from what is theoretically not the main thread, but the log says I am on the main thread. The print logs the following:
<_NSMainThread: 0x600003bb02c0>{number = 1, name = main}
Does this mean there is no need to check which queue we are in before performing UI stuff?
Regarding Task {…}, that will “create an unstructured task that runs on the current actor” (see Swift Concurrency: Unstructured Concurrency). That is a great way to launch an asynchronous task from a synchronous context. And, if called from the main actor, this Task will also be on the main actor.
In your case, I would move the model update and UI refresh to a function that is marked as running on the main actor:
@MainActor
func update(with posts: [Post]) async {
self.posts = posts
tableView.reloadData()
}
Then you can do:
Task {
let api = API(apiConfig: apiConfig)
do {
let posts = try await api.getPosts() // Checked this and the code of getPosts is running on another thread.
await self.update(with: posts)
} catch {
// Handle error
}
}
And the beauty of it is that if you’re not already on the main actor, the compiler will tell you that you have to await the update method. The compiler will tell you whether you need to await or not.
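A minimal sketch of that compiler behavior, using hypothetical names (PostsModel, loadTitles, refresh) that are not from the question: the same main-actor method needs `await` when called from code outside the main actor, and none when the caller is already isolated to it.

```swift
// Hypothetical model type, isolated to the main actor like a view controller.
@MainActor
final class PostsModel {
    private(set) var titles: [String] = []

    func update(with newTitles: [String]) {
        titles = newTitles
    }
}

// Hypothetical loader that is not isolated to any actor.
func loadTitles() async -> [String] {
    ["first", "second"]
}

// Not on the main actor: omitting `await` on update(with:) is a compile error.
func refresh(_ model: PostsModel) async {
    let titles = await loadTitles()
    await model.update(with: titles)
}

extension PostsModel {
    // Already on the main actor: the same call compiles without `await`.
    func refreshFromMainActor() async {
        let titles = await loadTitles()
        update(with: titles)
    }
}
```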
If you haven’t seen it, I might suggest watching WWDC 2021 video Swift concurrency: Update a sample app. It offers lots of practical tips about converting code to Swift concurrency, but specifically at 24:16 they walk through the evolution from DispatchQueue.main.async {…} to Swift concurrency (e.g., initially suggesting the intuitive MainActor.run {…} step, but over the next few minutes, show why even that is unnecessary, but also discuss the rare scenario where you might want to use this function).
As an aside, in Swift concurrency, looking at Thread.current is not reliable. Because of this, this practice is likely going to be prohibited in a future compiler release.
If you watch WWDC 2021 Swift concurrency: Behind the scenes, you will get a glimpse of the sorts of mechanisms underpinning Swift concurrency and you will better understand why looking at Thread.current might lead to all sorts of incorrect conclusions.

In Swift, if Thread.current.isMainThread == false, then is it safe to DispatchQueue.main.sync recursively once?

The reason I ask is that, in my company's app, we had a crash that turned out to be due to some UI method being called from off the main thread, like:
public extension UIViewController {
func presentModally(_ viewControllerToPresent: UIViewController, animated flag: Bool, completion: (() -> Void)? = nil) {
// some code that sets presentation style then:
present(viewControllerToPresent, animated: flag, completion: completion)
}
}
Since this was getting called from many places, some of which would sometimes call it from a background thread, we were getting crashes here and there.
Fixing all the call sites was not feasible due to the app being over a million lines of code, so my solution to this was simply to check if we're on the main thread, and if not, then redirect the call to the main thread, like so:
public extension UIViewController {
func presentModally(_ viewControllerToPresent: UIViewController, animated flag: Bool, completion: (() -> Void)? = nil) {
guard Thread.current.isMainThread else {
DispatchQueue.main.sync {
presentModally(viewControllerToPresent, animated: flag, completion: completion)
}
return
}
// some code that sets presentation style then:
present(viewControllerToPresent, animated: flag, completion: completion)
}
}
The benefits of this approach seem to be:
Preservation of execution order. If the caller is off the main thread, we'll redirect onto the main thread, then execute the same function before we return -- thus preserving the normal execution order that would have happened had the original function been called from the main thread, since functions called on the main thread (or any other thread) execute synchronously by default.
Ability to implicitly reference self without compiler warnings. In Xcode 11.4, performing this call synchronously also satisfies the compiler that it's OK to implicitly retain self, since the dispatch context will be entered then exited before the original function call returns -- so we don't get any new compiler warnings from this approach. That's nice and clean.
More focused diffs via less indentation. It avoids wrapping the entire function body in a closure (like you'd normally see done if DispatchQueue.main.async { ... } was used, where the whole body must now be indented a level deeper, incurring whitespace diffs in your PR that can lead to annoying merge conflicts and make it harder for reviewers to distinguish the salient elements in GitHub's PR diff views).
Meanwhile the alternative, DispatchQueue.main.async, would seem to have the following drawbacks:
Potentially changes expected execution order. The function would return before executing the dispatched closure, which in turn means that self could have deallocated before it runs. That means we'd have to explicitly retain self (or weakify it) to avoid a compiler warning. It also means that, in this example, present(...) would not get called before the function would return to the caller. This could cause the modal to pop-up after some other code subsequent to the call site, leading to unintended behavior.
Requirement of either weakifying or explicitly retaining self. This is not really a drawback but it's not as clean, stylistically, as being able to implicitly retain self.
So the question is: are these assumptions all correct, or am I missing something here?
My colleagues who reviewed the PR seemed to feel that using "DispatchQueue.main.sync" is somehow inherently bad and risky, and could lead to a deadlock. While I realize that using this from the main thread would indeed deadlock, here we explicitly avoid that by using a guard statement to make sure we're NOT on the main thread first.
Despite being presented with all the above rationale, and despite being unable to explain to me how a deadlock could actually happen given that the dispatch only happens if the function gets called off the main thread to begin with, my colleagues still have deep reservations about this pattern, feeling that it could lead to a deadlock or block the UI in unexpected ways.
Are those fears founded? Or is this pattern perfectly safe?
This pattern is definitely not “perfectly” safe. One can easily contrive a deadlock:
let group = DispatchGroup()
DispatchQueue.global().async(group: group) {
self.presentModally(controller, animated: true)
}
group.wait()
Checking that isMainThread is false is insufficient, strictly speaking, to know whether it’s safe to dispatch synchronously to the main thread.
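The same shape of deadlock can be observed safely in a script. In this sketch a private serial queue stands in for the main queue (an assumption made so the example can run outside an app), and a timeout replaces the unbounded group.wait() so the stall can be detected instead of hanging forever.

```swift
import Dispatch

// Hypothetical stand-in for the main queue.
let serial = DispatchQueue(label: "example.serial")
let group = DispatchGroup()

let waitResult: DispatchTimeoutResult = serial.sync {
    // This block occupies the serial queue, like code running on the main thread.
    DispatchQueue.global().async(group: group) {
        // We are off the serial queue here, so an isMainThread-style check would pass,
        // yet this sync call cannot start until the outer block returns.
        serial.sync { }
    }
    // The outer block waits for the background work, which is waiting for the queue.
    // Without the timeout, neither side could ever make progress.
    return group.wait(timeout: .now() + 0.5)
}

// Once the outer block has returned, the queue is free and the inner sync completes.
group.wait()
print(waitResult == .timedOut)
```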
But that’s not the real issue. You obviously have some routine somewhere that thinks it’s running on the main thread, when it’s not. Personally, I’d be worried about what else that code did while operating under this misconception (e.g. unsynchronized model updates, etc.).
Your workaround, rather than fixing the root cause of the problem, is just hiding it. As a general rule, I would not suggest coding around bugs introduced elsewhere in the codebase. You really should just figure out where you’re calling this routine from a background thread and resolve that.
In terms of how to find the problem, hopefully the stack trace associated with the crash will tell you. I’d also suggest adding a breakpoint for the main thread checker by clicking on that little arrow next to it in the scheme settings.
Then exercise the app and if it encounters this issue, it will pause execution at the offending line, which can be very useful in tracking down these issues. That often is much easier than reverse-engineering from the stack trace.
I agree with the comments that you have some structural difficulties with your code.
But there are still times in which I need code to run on the main thread and I don't know if I'm already on the main thread or not. This has occurred often enough that I wrote an ExecuteOnMain() function just for this:
dispatch_queue_t MainSequentialQueue( )
{
static dispatch_queue_t mainQueue;
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
#if HAS_MAIN_RUNLOOP
// If this process has a main thread run loop, queue sequential tasks to run on the main thread
mainQueue = dispatch_get_main_queue();
#else
// If the process doesn't execute in a run loop, create a sequential utility thread to perform these tasks
mainQueue = dispatch_queue_create("main-sequential",DISPATCH_QUEUE_SERIAL);
#endif
});
return mainQueue;
}
BOOL IsMainQueue( )
{
#if HAS_MAIN_RUNLOOP
// Return YES if this code is already executing on the main thread
return [NSThread isMainThread];
#else
// Return YES if this code is already executing on the sequential queue, NO otherwise
return ( MainSequentialQueue() == dispatch_get_current_queue() );
#endif
}
void DispatchOnMain( dispatch_block_t block )
{
// Shorthand for asynchronously dispatching a block to execute on the main thread
dispatch_async(MainSequentialQueue(),block);
}
void ExecuteOnMain( dispatch_block_t block )
{
// Shorthand for synchronously executing a block on the main thread before returning.
// Unlike dispatch_sync(), this won't deadlock if executed on the main thread.
if (IsMainQueue())
// If this is the main thread, execute the block immediately
block();
else
// If this is not the main thread, queue the block to execute on the main queue and wait for it to finish
dispatch_sync(MainSequentialQueue(),block);
}
A bit late, but I had a need for this type of solution too. I had some common code that could be invoked from both the main thread and background threads, and updated the UI. My solution to the generic use case was:
public extension UIViewController {
func runOnUiThread(closure: @escaping () -> ()) {
if Thread.isMainThread {
closure()
} else {
DispatchQueue.main.sync(execute: closure)
}
}
}
Then to call it from a UIViewController:
runOnUiThread {
// code here
}
As others have pointed out, this is not completely safe. You might have some code on background thread that is invoked from the main thread, synchronously. If that background code then calls the code above, it will attempt to run on the main thread and will create a deadlock. The main thread is waiting for the background code to execute, and the background code will wait for the main thread to be free.

Do you need to use, and can you use, perform() and performAndWait() inside performBackgroundTask()?

If I am performing CoreData operations (delete local persistent data, fetch new data from online, save to persistent store) inside a storeContainer.performBackgroundTask() { context in ... } block,
1) Do I NEED to use context.perform() { } inside this to ensure it is thread safe?
2) CAN I use context.performAndWait() { } for part or all of the function inside the curly brackets if I wish to ensure, for example, deletion occurs before downloading and re-saving?
I'm having user crashes associated with CoreData saving which don't appear on testing. I suspect I am failing to understand something about CoreData. I haven't managed to find the answer to this question elsewhere in tutorials or StackOverflow despite searching for ages!
The main job of performBackgroundTask is to create a suitable background context and run your block on that context's private queue. You don't need to call perform again inside the block to switch to the private queue.
performAndWait is useful whenever you are on the main queue but the context is private and you need the database update to finish before moving forward (and in similar cases). You don't need to call performAndWait inside perform, because the code inside perform already executes serially. There is no harm in using it, though.
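To illustrate, here is a sketch of a delete-then-insert refresh inside performBackgroundTask with no nested perform. The function name replaceItems, the entity name "Item" and the attribute "title" are assumptions made for the example. It also assumes the download has already finished and its results are passed in, since a network call started inside the block would return before its response arrives.

```swift
import CoreData

/// Hypothetical refresh routine: replaces all "Item" objects with new ones.
func replaceItems(in container: NSPersistentContainer,
                  with titles: [String],
                  completion: @escaping (Error?) -> Void) {
    container.performBackgroundTask { context in
        // The block already runs on the context's private queue,
        // so the context is used directly here.
        do {
            // 1. Delete the existing objects.
            let request = NSFetchRequest<NSManagedObject>(entityName: "Item")
            for object in try context.fetch(request) {
                context.delete(object)
            }
            // 2. Insert the replacements. Statements run in order,
            //    so the deletion above has already happened.
            for title in titles {
                let item = NSEntityDescription.insertNewObject(forEntityName: "Item", into: context)
                item.setValue(title, forKey: "title")
            }
            // 3. Save once, after both steps.
            try context.save()
            completion(nil)
        } catch {
            completion(error)
        }
    }
}
```

Note that completion is called on the context's queue, so a caller that touches UI would still need to hop to the main queue.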

CoreData - Fetch NSManagedObject with perform and background thread

I'm developing an SDK that uses only 1 NSManagedObjectContext with type of privateQueueConcurrencyType.
In order to fetch objects, I'm using perform() and then I pass the results to a closure.
I'm calling this method from a background thread and also use the result on a background thread (which might be different than the one that called it).
I know that passing objects between threads is a no-go, but I'm not satisfied with the way I handle it today.
The way I handle it is that every NSManagedObject is mapped to a "normal" Swift object, and then I use the Swift object.
For example:
For each NSManagedObject in the results, I create a new object (which is not an NSManagedObject) and then I use these objects.
I would like to use the NSManagedObjects instead of creating new ones that holds similar data.
What's the best approach to do it?
Can I still use the NSManagedObject?
func getRecordsByPredicate<T: NSManagedObject>(type: T.Type, predicate: NSPredicate, success: @escaping (_ record: [T]?) -> Void, failure: @escaping () -> Void) {
self.context.perform {
let fetchRequest = NSFetchRequest<NSFetchRequestResult>(entityName: String(describing: type.self))
fetchRequest.includesPropertyValues = false
fetchRequest.predicate = predicate
do {
let results = try self.context.fetch(fetchRequest)
success(results as? [T])
} catch {
print(error.localizedDescription)
failure()
}
}
}
Providing an API with CoreData involved is difficult, at best.
You can not expose only the managed object in the API, since these are tied to a specific thread or dispatch queue which is private to your library. You would require the client to pass a Managed Object Context as well which defines the execution context where the client will use the managed object.
If your internal MOC and the client's MOC is not the same, the API inevitably becomes asynchronous - or it will block a thread.
You may require that this API be used on the main thread only, with your library taking care to use the same MOC as well. This of course has a couple of drawbacks; possibly making the API asynchronous would be only one of them.
Since you also cannot force the developer to read your documentation, the first developer using your API will likely not call it from the main thread. ;)
Another alternative would be to let the client pass a closure to the API instead, which is then called from your library on the right execution context. This also makes the API asynchronous, and it also requires the developer to have a deep understanding of CoreData, since she gets CoreData managed objects.
Your first approach using "Swift" values is probably the best approach to handle this. Make CoreData an "implementation detail" of your library and save the developers the hassles involved when using CoreData.
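A sketch of what that could look like, assuming a hypothetical entity "Record" with a string attribute "name". RecordSnapshot and RecordStore are illustrative names, not part of the question's SDK. The managed objects never leave the perform block; only value-type snapshots cross the API boundary, so the client may use them on any thread.

```swift
import CoreData

// Hypothetical value type exposed by the SDK instead of NSManagedObject.
struct RecordSnapshot: Equatable {
    let name: String
}

final class RecordStore {
    // Private-queue context owned by the SDK; never handed to clients.
    private let context: NSManagedObjectContext

    init(context: NSManagedObjectContext) {
        self.context = context
    }

    /// Fetches on the context's queue and reports plain values to the caller.
    func snapshots(matching predicate: NSPredicate?,
                   completion: @escaping (Result<[RecordSnapshot], Error>) -> Void) {
        context.perform { [context] in
            let request = NSFetchRequest<NSManagedObject>(entityName: "Record")
            request.predicate = predicate
            // Map to values while still on the context's queue.
            let result: Result<[RecordSnapshot], Error> = Result {
                try context.fetch(request).map { object in
                    RecordSnapshot(name: object.value(forKey: "name") as? String ?? "")
                }
            }
            completion(result)
        }
    }
}
```

Because the snapshot reads the attribute values, a fetch request used this way should not set includesPropertyValues to false; otherwise every access would fault the object in again.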

Is there any point in querying realm on a background thread and resolving a ThreadSafeReference on the UI thread?

It appears that ThreadSafeReference was added recently to help move objects across thread boundaries. Prior to that, according to the sources I read (which were probably not exhaustive), the recommendation was to just query Realm on the thread you intend to use the results on; effectively, query it on the UI thread.
Is there a benefit to querying Realm on a background thread or does resolving the ThreadSafeReference basically run the query again?
Using RxSwift here's an example of this:
import RxSwift
import RealmSwift
public static func getAllMyModels() -> Observable<Results<MyModel>>{
return Observable<ThreadSafeReference<Results<MyModel>>>.create{
observer in
// using this queue in this example only
DispatchQueue.global(qos: .default).async {
let realm = try! Realm()
let models = realm.objects(MyModel.self)
let safe = ThreadSafeReference(to: models)
observer.onNext(safe)
observer.onCompleted()
}
return Disposables.create()
}
.observeOn(MainScheduler.instance) // push us back to the UI thread to resolve the reference
.map{
safeValue in
let realm = try! Realm()
let value = realm.resolve(safeValue)!
return value
}
.shareReplayLatestWhileConnected()
}
Did I gain anything by querying on some background thread and resolving on the UI thread?
Seems unnecessary. According to the docs, queries are already being done on a background thread, as long as you have attached a notification block:
Once the query has been executed, or a notification block has been added, the Results is kept up to date with changes made in the Realm, with the query execution performed on a background thread when possible.
- https://realm.io/docs/swift/latest/#queries
ast's guidance is correct, but I dug a little more and wanted to post some extra to confirm his answer further.
kishikawa-katsumi, currently a software engineer at Realm, provided this response to the question in Realm's public slack (https://realm-public.slack.com/archives/general/p1488960777001796):
For querying, it is fast enough on the UI thread in most cases. If you're facing a few slow, complex queries, you can use a background query.
To execute queries in the background, use addNotificationBlock().
notificationToken = realm
.objects(...)
.filter(...)
.addNotificationBlock { (changes) in
// The query is executed in background.
// When the query is completed, then call this block
...
}
Using addNotificationBlock(), the query is executed in the background, and when the query completes, the callback closure is called.
So ThreadSafeReference is rarely used in queries. ThreadSafeReference is used when you want to pass an object to another thread (for example, to specify it as a condition of a query or to use it as a parameter of an API request).
Additional information about subscribing to this block from a GCD thread (background thread) can be found here, as it requires a runloop.
https://stackoverflow.com/a/41841847/1060314