Core Data performance and context saving - iPhone

I finished converting my app to use Core Data for a small data warehouse I want to maintain. I have some concerns about performance and how best to use it. In particular:
I have many runs where I read attributes from files on disk: each attribute should generate a new object, unless an object of that type and value already exists. So, for each file I read, I execute a fetch to check whether that managed object already exists; if it does, I'm done; otherwise I create the object, assign the value, and save the context.
Currently I save the context once each time I create a new object, so this happens roughly ten times (for the ten attributes) per file read (and there can be hundreds of files). Would it be better to reduce the number of save points, perhaps saving once per file instead of once per attribute? I don't know the overhead of this operation, so I don't know whether it's OK to do it this often, or how to measure the time spent on it (maybe with Instruments? I don't really know how).

There isn't any need to save after setting each attribute.
Normally, you only save a managed object when the code is done with it, since saving resets the undo stack. In the setup you describe, you could safely generate hundreds of managed objects before saving them to the persistent store. You can keep a large number (thousands) of lightweight objects (text attributes) in memory without putting any strain on the iPhone.
The only problem on the iPhone is that you never know when the app will be suspended or shut down. This makes saves more common than on other platforms, but not to the extent you use now.
The Core Data Performance section of the programming guide might help you plan. Instruments lets you see the details of Core Data performance.
However, I wouldn't do anything until you've tested the app with a great deal of data and found it slow. Premature optimization is the root of all evil. Don't waste time trying to prevent a problem you may not have.
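As a sketch of the "save once per file" approach, something like the following would work (the entity name "FileAttribute" and its `value` attribute are placeholders, not from the question):

```objc
// Hypothetical per-file import: fetch-or-insert each attribute,
// then save the context once at the end instead of once per attribute.
- (void)importAttributes:(NSArray *)attributeValues
               inContext:(NSManagedObjectContext *)context
{
    for (NSString *value in attributeValues) {
        // Check for an existing object with this value before inserting.
        NSFetchRequest *request = [[NSFetchRequest alloc] init];
        [request setEntity:[NSEntityDescription entityForName:@"FileAttribute"
                                       inManagedObjectContext:context]];
        [request setPredicate:[NSPredicate predicateWithFormat:@"value == %@", value]];
        NSError *error = nil;
        NSUInteger count = [context countForFetchRequest:request error:&error];
        [request release];
        if (count == 0) {
            NSManagedObject *object =
                [NSEntityDescription insertNewObjectForEntityForName:@"FileAttribute"
                                              inManagedObjectContext:context];
            [object setValue:value forKey:@"value"];
        }
    }
    // One save per file instead of one per attribute.
    NSError *saveError = nil;
    if (![context save:&saveError]) {
        NSLog(@"Save failed: %@, %@", saveError, [saveError userInfo]);
    }
}
```

This keeps the duplicate check per attribute but cuts the number of (comparatively expensive) store writes by roughly a factor of ten.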

To prevent a "sudden application stop" problem you can implement something like that method:
- (void)saveContext {
    NSError *error = nil;
    NSManagedObjectContext *managedObjectContext = self.managedObjectContext;
    if (managedObjectContext != nil) {
        if ([managedObjectContext hasChanges] && ![managedObjectContext save:&error]) {
            /*
             Replace this implementation with code to handle the error appropriately.
             abort() causes the application to generate a crash log and terminate. You should not use this function in a shipping application, although it may be useful during development. If it is not possible to recover from the error, display an alert panel that instructs the user to quit the application by pressing the Home button.
             */
            NSLog(@"Unresolved error %@, %@", error, [error userInfo]);
            // abort();
        }
    }
}
and use it inside two methods of your app delegate:
- (void)applicationWillTerminate:(UIApplication *)application;
and
- (void)applicationDidEnterBackground:(UIApplication *)application;
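The wiring in the app delegate is then just a call to that method from each lifecycle callback:

```objc
- (void)applicationDidEnterBackground:(UIApplication *)application
{
    [self saveContext];
}

- (void)applicationWillTerminate:(UIApplication *)application
{
    [self saveContext];
}
```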
Though it may not be a 100% solution, in most cases it will do the job...

Related

Core Data & iCloud (iOS5)

After adding a new NSManagedObject to my Core Data store I tried calling:
if ([managedObjectContext hasChanges] && ![managedObjectContext save:&error]) {
and got the following exception (weirdly I had no error and the result was also positive!)
2013-03-15 18:32:09.753 Nick copy[28782:3407] CoreData: Ubiquity: An exception occured during a log file export: NSInternalInconsistencyException save notification contents: NSConcreteNotification 0x3891b0 {name = _NSSQLCoreTransactionStateChangeNotification; object = (URL: file://localhost/var/mobile/Applications/FCAF7FC6-7DC8-4E0B-A114-38778255CA90/Documents/MyApp.sqlite); userInfo = {
"_NSSQLCoreActiveSaveRequest" = "";
"_NSSQLCoreTransactionType" = 2;
"_NSSQLCoreTransientSequenceNumber" = 1;
}}
I can catch all exceptions from the "save" method and the app runs fine. I'm just wondering if this is really safe to do, because it feels totally unsafe.
EDIT: Another exception when trying to delete an Object:
Catched Exception: Failed to process pending changes before save. The context is still dirty after 100 attempts. Typically this recursive dirtying is caused by a bad validation method, -willSave, or notification handler.
Is it safe? Probably not. The error shows that the underlying ubiquity system failed to create a SQL log file. That probably means that it failed to create the transaction log that iCloud would use to sync changes. Catching it and continuing means that your data probably saved locally, depending on the details of the framework code. But it almost certainly means that the changes will not be synced by iCloud. Worse, you could well be in a situation where future saves will also fail for the same reason.
I'm not completely sure about the second exception but it's very likely to be a side-effect of the first one.
If you're attempting to use Core Data's built-in iCloud support on iOS5, this is just the beginning of the weird, inexplicable errors. I've done a fair amount of iCloud/Core Data work and I really can't recommend using it with iOS 5. Even iOS 6 is dicey at best, but problems are less likely than on iOS 5.
Unfortunately I can't find the thread anymore, but it said to make sure to always use NSManagedObject instances only on the thread/dispatch_queue on which they were created.
The problem is that if you do access one from a different queue, it might work, or it might crash after a random interval.
I made sure I access my NSManagedObject instances from a dedicated dispatch_queue only, and I haven't logged any weird exceptions since.
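A minimal sketch of that kind of queue confinement, assuming the context is created on and only ever touched from one serial queue (the entity "Item", its `name` key, and `context` are placeholders):

```objc
// One serial queue owns all Core Data work; nothing else touches
// the context or its managed objects.
dispatch_queue_t coreDataQueue = dispatch_queue_create("com.example.coredata", NULL);

dispatch_async(coreDataQueue, ^{
    // Create, modify and save managed objects only on this queue.
    NSManagedObject *item =
        [NSEntityDescription insertNewObjectForEntityForName:@"Item"
                                      inManagedObjectContext:context];
    [item setValue:@"example" forKey:@"name"];
    NSError *error = nil;
    if (![context save:&error]) {
        NSLog(@"Save failed: %@", error);
    }
});
```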

Restkit and deadlock

Currently I'm using RestKit to control all my (Core) data in my app. I use it to keep in sync with the server via RKManagedObjectMapping, and I use [myNSManagedObject createEntity] together with [[RKObjectManager sharedManager].objectStore save] to manually edit items within Grand Central Dispatch.
Is there any recommendation to do this in this or an other way? Because sometimes the app freezes in a deadlock executing this code of Restkit
+ (NSArray*)objectsWithFetchRequest:(NSFetchRequest*)fetchRequest {
    NSError* error = nil;
    NSArray* objects = [[self managedObjectContext] executeFetchRequest:fetchRequest error:&error];
    if (objects == nil) {
        RKLogError(@"Error: %@", [error localizedDescription]);
    }
    return objects;
}
with that
- (NSError*)save {
    NSManagedObjectContext* moc = [self managedObjectContext];
    NSError *error;
    @try {
        if (![moc save:&error]) {
            if (self.delegate != nil && [self.delegate respondsToSelector:@selector(managedObjectStore:didFailToSaveContext:error:exception:)]) {
…
in parallel. Before I switched to RestKit I wrapped each piece of entity-editing code in a "context performBlockAndWait:" and was on the safe side with no deadlocks. I haven't created any NSManagedObjectContext or anything similar myself; everything comes from RestKit.
In my case, the problem was that I was passing NSManagedObjects across thread boundaries, and using them on threads other than the ones on which they were fetched from their respective NSManagedObjectContext. It was really subtle in my case, as I knew that I wasn't supposed to do this, but did it accidentally anyways. Instead of passing the managed objects, I started passing the NSManagedObjectIDs (and then fetching from the local thread's MOC), and haven't encountered a deadlock since. I'd recommend you do a deep scan of your code to make sure you are only using managed objects in the threads that spawned them.
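A sketch of the NSManagedObjectID handoff described above (the `backgroundQueue` and `backgroundContext` names are assumptions; the point is that only the ID crosses the thread boundary):

```objc
// Never pass the NSManagedObject itself across threads; pass its ID
// and re-materialize it in the receiving thread's own context.
NSManagedObjectID *objectID = [managedObject objectID];

dispatch_async(backgroundQueue, ^{
    NSError *error = nil;
    NSManagedObject *localCopy =
        [backgroundContext existingObjectWithID:objectID error:&error];
    if (localCopy != nil) {
        // Safe to read and mutate here: this copy belongs to
        // backgroundContext, which is confined to this queue.
    } else {
        NSLog(@"Could not fetch object: %@", error);
    }
});
```

Note that `objectID` should come from an object that has already been saved; freshly inserted objects have temporary IDs until the first save.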
We've encountered this exact problem in our app. Basically, Core Data nested contexts are very buggy in iOS 5 and don't work as advertised. This has many manifestations, but one of them is the problem described above: deadlocking a fetch request against a background operation. This is well documented; an instructive quote:
NSFetchedResultsController deadlocks
You never want your application to deadlock. With NSFetchedResultsController and nested contexts, it’s pretty easy to do. Using the same UIManagedDocument setup described above, executing fetch requests in the private queue context while using NSFetchedResultsController with the main queue context will likely deadlock. If you start both at about the same time it happens with almost 100% consistency. NSFetchedResultsController is probably acquiring a lock that it shouldn’t be. This has been reported as fixed for an upcoming release of iOS.
This SO answer has a possible fix that keeps nested contexts. I've seen others like it, basically liberally applying -performBlockAndWait: calls. We haven't tried that yet (iOS 5 has single-digit percentages in our user base). Otherwise, the only other "fix" we know of right now is abandoning nested contexts on iOS 5 (cf. this SO answer).
That CoreData continues to be fundamentally broken for multithreading (in iOS 5) is inexcusable on Apple's part. You can make a good case now that trusting Apple on CoreData (multithreading, iCloud) seems to be opening a bag of pain.

Reasons for NSManagedObjectMergeError error on [NSManagedObjectContext save:]

I have an application that combines threading and Core Data.
I am using one global NSPersistentStoreCoordinator and a main NSManagedObjectContext.
I have a process that has to download 9 files simultaneously, so I created an object to handle each download (each individual download has its own object) and save its result through the persistentStoreCoordinator.
In the [NSURLConnection connectionDidFinishLoading:] delegate method, I create a new NSManagedObject and attempt to save the data (which will also merge it with the main managedObjectContext).
I think it is failing because multiple processes try to save to the persistentStoreCoordinator at the same time, since the downloads finish around the same time.
What is the easiest way to eliminate this error and still download the files independently?
The NSManagedObjectContext instances know how to lock the NSPersistentStoreCoordinator. Since you are already using one NSManagedObjectContext per thread that is most likely not the issue.
It would help to know what error you are getting. Unroll the NSError and look at its -userInfo: if the userInfo dictionary contains the key NSDetailedErrors, the value associated with that key is an array you can loop over to look at all the errors inside. That will help determine what is going on.
It is quite possible that the error is something as simple as a validation failure or a missing required value and has nothing to do with the actual threading.
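A sketch of unrolling the error as described, using the NSDetailedErrorsKey constant that Core Data puts in the userInfo of multi-error saves:

```objc
// Print every underlying error from a failed save, not just the top one.
NSError *error = nil;
if (![context save:&error]) {
    NSArray *detailedErrors = [[error userInfo] objectForKey:NSDetailedErrorsKey];
    if (detailedErrors != nil) {
        // Multiple validation (or other) failures rolled into one NSError.
        for (NSError *detail in detailedErrors) {
            NSLog(@"Detailed error: %@, %@", detail, [detail userInfo]);
        }
    } else {
        // A single failure; the top-level error carries the details.
        NSLog(@"Error: %@, %@", error, [error userInfo]);
    }
}
```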

Core Data Saves and UI Performance

I'm wondering if there are any best practices for improving UI responsiveness while doing Core Data saves (not fetches) with large collections of managed objects.
The app I'm working on needs to download fairly large amounts of data at set intervals from a web service until complete. On each interval, a batch of data is downloaded, turned into managed objects, and saved to Core Data. Because this process can take as long as 5 minutes to fully complete, simply showing a loading screen until everything finishes is not really an option; it takes too long. I'm also interested in doing frequent writes to Core Data, rather than one big write at the end, to keep my memory footprint low. Ideally, I'd like the user to be able to keep using the rest of the application normally while these large data sets are downloaded and written to Core Data in the background.
Unfortunately, what seems to be happening is that when I try to save my inserts that I put into the managed object context for each batch, that save operation blocks the user from interacting with the rest of the app (swiping tables, touching buttons, etc) until complete. For those short periods of time where a Core Data save is taking place, the app is very unresponsive.
Naturally, I've tried making those saves smaller by reducing the size of the individual batches downloaded per interval, but besides the inconvenience of making the whole process take longer, there are still moments when a user's swipe is not captured because a Core Data save was happening at that particular time. Reducing the batch size simply makes a missed swipe or touch less likely, but they still seem to happen often enough to be inconvenient.
For the inserts themselves, I've tried two different implementations: insertNewObjectForEntityForName:inManagedObjectContext: and setValuesForKeysWithDictionary:. Both exhibit the problem described above.
I prototyped a much simpler test to measure performance in both the simulator and on the device; the important elements are provided here. This example doesn't actually download anything from the web; it just writes a bunch of objects to Core Data at set intervals from within a TableViewController. I'd love to know if anyone has suggestions for improving responsiveness.
- (void)viewDidAppear:(BOOL)animated
{
    [super viewDidAppear:animated];
    timer = [NSTimer scheduledTimerWithTimeInterval:1
                                             target:self
                                           selector:@selector(doTimerWork:)
                                           userInfo:nil
                                            repeats:YES];
}

- (void)doTimerWork:(id)sender
{
    for (int i = 0; i < 1000; i++)
    {
        Misc *m = (Misc *)[NSEntityDescription insertNewObjectForEntityForName:@"Misc"
                                                        inManagedObjectContext:managedObjectContext];
        m.someDate = [NSDate date];
        m.someString = @"ASDASDASD";
        m.someOtherString = @"BLAH BLAH BLAH";
        m.someNumber = [NSNumber numberWithInt:5];
        m.someOtherNumber = [NSNumber numberWithInt:99];
        m.someOtherDate = [NSDate date];
    }
    NSError *error;
    if (![managedObjectContext save:&error]) {
        NSLog(@"Experienced an error while saving to CoreData");
    }
}
Typically you would download your data on a background thread and insert/update managed objects into its managed object context.
On the main thread you would register for NSManagedObjectContextDidSaveNotification and use mergeChangesFromContextDidSaveNotification: to update the main managed object context.
Is this what you are doing?
Also, read Multi Threading with Core-Data.
It sounds like you need to throw your data intensive stuff with Core Data onto a separate thread, which is fortunately pretty easy in Cocoa. You can just do:
[obj performSelectorInBackground:@selector(method:) withObject:arg];
And then design things so that once that data intensive operation is finished, call:
[otherObject performSelectorOnMainThread:@selector(dataStuffIsDone:) withObject:arg waitUntilDone:NO];
At which point you can update your UI.
The main thing to remember is to always keep your UI logic on the main thread, for both proper design, and because very odd things can happen if you do anything with UIKit from a different thread, since it isn't designed to be thread safe.

Importing large datasets on iPhone using CoreData

I'm facing a very annoying problem. My iPhone app loads its data from a network server. The data is sent as a plist and, when parsed, needs to be stored in an SQLite db using Core Data.
The issue is that in some cases these datasets are too big (5000+ records) and the import takes way too long. Worse, when the iPhone tries to suspend the screen, the watchdog kills the app because it is still processing the import and doesn't respond within 5 seconds, so the import never finishes.
I used all the recommended techniques from the article "Efficiently Importing Data" http://developer.apple.com/mac/library/DOCUMENTATION/Cocoa/Conceptual/CoreData/Articles/cdImporting.html and other docs on the subject, but it's still awfully slow.
The solution I'm looking for is to let the app suspend but keep the import running behind the scenes (the better option), or to prevent the app from attempting to suspend at all. Any better idea is welcome too.
Any tips on how to overcome these issues are highly appreciated!
Thanks
Instead of pushing plist files to the phone, you might want to send ready to use sqlite files. This has many advantages:
no need to import on the phone
more compact
If you always replace the whole content, simply overwrite the persistent store on the device. Otherwise you may want to maintain a plist containing an array of all the SQLite files you have downloaded, and then use it to add all the stores to the persistentStoreCoordinator.
Bottom line: use several precompiled sqlite files and add them to the persistentStoreCoordinator.
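A sketch of that last step, assuming `psc`, `documentsDirectoryURL`, and a `downloadedStoreNames` array (all placeholder names) already exist:

```objc
// Attach each downloaded, precompiled SQLite file to the one coordinator.
// Core Data will treat the union of all stores as a single object graph.
for (NSString *fileName in downloadedStoreNames) {
    NSURL *storeURL = [documentsDirectoryURL URLByAppendingPathComponent:fileName];
    NSError *error = nil;
    if (![psc addPersistentStoreWithType:NSSQLiteStoreType
                           configuration:nil
                                     URL:storeURL
                                 options:nil
                                   error:&error]) {
        NSLog(@"Could not add store %@: %@", fileName, error);
    }
}
```

One caveat: all the store files must share the same managed object model version as the coordinator, or the add will fail.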
You can use the iPhone Simulator to create those Core Data SQLite stores, or use a standalone Mac app; you will need to write both of those yourself.
First, if you can package the data with the app that would be ideal.
However, assuming you cannot do that, I would do the following:
Once the data is downloaded break it into multiple files before import.
Import on a background thread, one file at a time.
Once a file has been imported and saved, delete the import file.
On launch, look for those files waiting to be processed and pick up where you left off.
Ideally, sending the data with the app would be far less work, but the second solution will also work, and you can fine-tune how the data is broken up during development.
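The launch-time resume step could look something like this (the staging directory layout and the `importFileAtPath:` routine are hypothetical):

```objc
// On launch: look for chunk files still waiting to be imported and
// process them one at a time, deleting each file once it is saved.
NSString *importDir =
    [NSTemporaryDirectory() stringByAppendingPathComponent:@"pending-import"];
NSFileManager *fm = [NSFileManager defaultManager];
NSArray *pending = [fm contentsOfDirectoryAtPath:importDir error:NULL];
for (NSString *chunk in pending) {
    NSString *path = [importDir stringByAppendingPathComponent:chunk];
    [self importFileAtPath:path];            // hypothetical: parse, insert, save
    [fm removeItemAtPath:path error:NULL];   // only after a successful save
}
```

Because each chunk is deleted only after its objects are saved, a kill mid-import loses at most one chunk's worth of work.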
I solved a similar problem by putting the insert processing in a background thread. But first I created a progress alert so the user couldn't manipulate the data store while it was inserting the entries.
This is basically the view controller's viewDidLoad:
- (void)viewDidLoad
{
    [super viewDidLoad];
    NSError *error = nil;
    if (![[self fetchedResultsController] performFetch:&error]) {
        NSLog(@"Unresolved error %@, %@", error, [error userInfo]);
        abort();
    }
    // Only insert those not imported; here I know it should be 2006 entries
    if ([self tableView:nil numberOfRowsInSection:0] != 2006) {
        // Put up an alert with a progress bar, need to implement
        [self createProgressionAlertWithMessage:@"Initializing database"];
        // Spawn the insert thread, keeping the app "live" so it
        // won't be killed by the OS
        [NSThread detachNewThreadSelector:@selector(loadInitialDatabase:)
                                 toTarget:self
                               withObject:[NSNumber numberWithInt:[self tableView:nil
                                                    numberOfRowsInSection:0]]];
    }
}
The insert thread was done like this
- (void)loadInitialDatabase:(NSNumber*)number
{
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    int done = [number intValue]+1; // How many done so far
    // I load from a text file (csv) but you should be able to
    // understand the process and make it work for your data
    NSString *file = [NSString stringWithContentsOfFile:[[NSBundle mainBundle]
                                                         pathForResource:@"filename"
                                                                  ofType:@"txt"]
                                               encoding:NSUTF8StringEncoding
                                                  error:nil];
    NSArray *lines = [file componentsSeparatedByString:@"\n"];
    float num = [lines count];
    float i = 0;
    int perc = 0;
    for (NSString *line in lines) {
        i += 1.0;
        if ((int)(i/(num*0.01)) != perc) {
            // This part updates the alert with a progress bar
            // setProgressValue: needs to be implemented
            [self performSelectorOnMainThread:@selector(setProgressValue:)
                                   withObject:[NSNumber numberWithFloat:i/num]
                                waitUntilDone:YES];
            perc = (int)(i/(num*0.01));
        }
        if (done < i) // keep track of how much was done previously
            [self insertFromLine:line]; // Add to data storage...
    }
    progressView = nil;
    [progressAlert dismissWithClickedButtonIndex:0 animated:YES];
    [pool release];
}
It's a bit crude this way; it tries to resume initializing the data storage from where it left off if the user happened to stop it on a previous run...
I had a similar problem importing many objects into Core Data. Initially I was doing a save on the managed object context after every object I wished to create and insert.
What you should do is create/initialize each object you want to save in Core Data, and after you have looped through all your remote data and created the objects, do a single managed object context save.
I guess you could look at this as doing a transaction in an SQLite database: begin transaction, do lots of inserts/updates, end transaction.
If this is still too lengthy, just thread the darn task and prevent user interaction until it completes.
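The transaction analogy above, sketched in Core Data terms (the entity name "Record", the `remoteRecords` array, and `context` are placeholders, not from the question):

```objc
// "Begin transaction": insert everything into the context in memory.
for (NSDictionary *record in remoteRecords) {
    NSManagedObject *object =
        [NSEntityDescription insertNewObjectForEntityForName:@"Record"
                                      inManagedObjectContext:context];
    [object setValuesForKeysWithDictionary:record];
}
// "End transaction": one save flushes all the inserts to the store.
NSError *error = nil;
if (![context save:&error]) {
    NSLog(@"Batch save failed: %@, %@", error, [error userInfo]);
}
```

This works because unsaved inserts live only in the context; the expensive SQLite write happens once, at the save.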
Is there any way you can pack the data ahead of time - say during development? And when you push the app to the store, some of the data is already there? That'll cut down on the amount of data you have to pull, thus helping to solve this issue?
If the data is time sensitive, or not ready, or for whatever reason you can't do that, could you compress the data using zlib compression before you ship it over the network?
Or is the problem that the phone dies doing 5K+ inserts?
I imagine you aren't showing all 5K records to the client? I'd recommend doing all of the aggregation you need on the server, and then only sending the necessary data to the phone. Even if this involves generating a few different data views, it'll still be orders of magnitude faster than sending (and then processing) all those rows in the iPhone.
Are you also processing the data in a separate (non event/ui) thread?
Any chance you can setup your server side to expose a RESTful web service for processing your data? I had a similar issue and was able to expose my information through a RESTful webservice. There are some libraries on the iphone that make reading from a webservice like that very easy. I chose to request JSON from the service and used the SBJSON library on the iphone to quickly take the results I got and convert them to dictionaries for easy use. I used the ASIHTTP library for making the web requests and queueing up follow up requests and making them run in the background.
The nice thing about REST is that it has a built-in way for you to grab batches of information, so you don't need to arbitrarily figure out how to break up the files you want to input. You just set how many records you want back, and on the next request you skip that many records. I don't know if that is even an option for you, so I won't go into a lot of code examples right now, but if it is possible, it may be a smooth way to handle it.
Let's accept that RESTful (lazy loading) is not an option... I understand you want to replicate. If the load problem is of the type "fewer and fewer rows loading in more and more time", then in pseudocode...
[self sQLdropIndex(OffendingIndexName)]
[self breathInOverIP];
[self breathOutToSQLLite];
[self sQLAddIndex(OffendingIndexName)]
This should tell you lots.
I work on an app that regularly has to process 100K inserts, deletes, and updates with Core Data. If it is choking on 5K inserts, there is some optimization to be done.
Firstly, create an NSOperation subclass for processing the data. Override its -main method to do the processing. This method is not guaranteed to run on the main thread; indeed, its purpose is to avoid executing costly code on the main thread, which would affect the user experience by making it freeze up grossly. So within the -main method, you need to create another managed object context attached to the same persistent store coordinator as your main thread's managed object context.
- (void)main
{
    NSManagedObjectContext *ctx =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    [ctx setPersistentStoreCoordinator:mainManagedObjectContext.persistentStoreCoordinator];
    [ctx setUndoManager:nil];
    // A private-queue context must be used via performBlock(AndWait):
    [ctx performBlockAndWait:^{
        // Do your insertions here!
        NSError *error = nil;
        [ctx save:&error];
    }];
}
Given your circumstances, I don't believe you need an undo manager; having one incurs a performance penalty because Core Data has to track your changes.
Use THIS context to perform all of your CRUD actions in the -main method, then save it. Whatever owns your main thread's managed object context must register for the notification named NSManagedObjectContextDidSaveNotification. Register like so:
[[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(mocDidSaveNotification:) name:NSManagedObjectContextDidSaveNotification object:nil];
Then define that selector:
- (void)mocDidSaveNotification:(NSNotification *)notification
{
    NSManagedObjectContext *ctx = [notification object];
    if (ctx == mainManagedObjectContext) return;
    // Note: the notification arrives on the saving context's thread;
    // this merge must be performed on the main context's thread/queue.
    [mainManagedObjectContext mergeChangesFromContextDidSaveNotification:notification];
}
When all of this comes together, it will allow you to perform long-running operations on background threads without blocking the UI thread. There are several variations of this architecture, but the central theme is this: process on a background thread, merge on the main thread, update your UI. Some other things to keep in mind:
(1) Keep an autorelease pool around during your processing and drain it every so often to keep memory consumption down. In our case, we do it every 1000 objects. Adjust for your needs, but keep in mind that draining can be expensive depending on the amount of memory required per object, so you don't want to do it too often.
(2) Try to pare your data down to the absolute minimum you need for a functional app. By reducing the amount of data to parse, you reduce the time required to save it.
(3) With this multithreaded approach you can process your data concurrently. Create 3-4 instances of your NSOperation subclass, each processing only a portion of the data, so that they all run concurrently, resulting in less real time consumed for parsing the data set.
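A minimal sketch of the autorelease-pool draining in point (1), assuming manual reference counting as elsewhere in this thread (the entity "Item", `records`, and `context` are placeholder names):

```objc
// Drain the pool every 1000 objects so temporaries created during the
// import don't accumulate for the whole run.
NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
NSUInteger count = 0;
for (NSDictionary *record in records) {
    NSManagedObject *object =
        [NSEntityDescription insertNewObjectForEntityForName:@"Item"
                                      inManagedObjectContext:context];
    [object setValuesForKeysWithDictionary:record];
    if (++count % 1000 == 0) {
        [pool drain];                            // release accumulated temporaries
        pool = [[NSAutoreleasePool alloc] init]; // start a fresh pool
    }
}
[pool drain];
```

Saving the context at the same 1000-object boundary (before the drain) also lets Core Data turn the fully materialized objects back into lightweight faults, reducing the peak footprint further.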