I have a fetchRequest which takes up to 4-5 seconds to finish. Since it is part of a search-as-you-type solution, is there any way to abort a fetchRequest?
I use a timer that starts searching my database 600 ms after the user stops typing. So there is a possibility that a new search has to start before the old one has finished.
I haven't found any methods on NSManagedObjectContext that seem right. Is simply setting the old fetch request to nil the way to go? Or is there still something going on in the background?
Any ideas?
Thanks in advance!
PS: I'm also trying to enhance my query speed. Maybe someone has an idea for that too:
https://stackoverflow.com/questions/4695729/query-performance-with-large-database
A better way is to place a limit on the number of objects a fetch request returns:
- (void)setFetchLimit:(NSUInteger)limit
Parameters
limit
The fetch limit of the receiver. 0 specifies no fetch limit.
Discussion
Special Considerations
If you set a fetch limit, the framework makes a best effort, but does not guarantee, to improve efficiency. For every object store except the SQL store, a fetch request executed with a fetch limit in effect simply performs an unlimited fetch and throws away the unasked for rows.
Assuming you're using a UITextField for the text entry, why don't you move your fetch request logic into the textField:shouldChangeCharactersInRange:replacementString: delegate method (UITextFieldDelegate)?
This method is called every time the user enters or deletes a character in the text field, so it is the perfect place to check for a minimum number of characters before firing off a fetch request, as well as to detect changes to the text that would require you to discard the existing fetch request and start a new one.
I don't think you can, since it almost certainly ties up the thread doing the fetch. The last time I needed to do something like this, I spawned a background thread, with a basic condvar (NSCondition) to signal when a new input was available, and -performSelectorOnMainThread:... to signal when the output was ready. This means that the background thread will continue to work on out-of-date inputs for a while before picking up the new "most recent" input.
You can probably do a similar thing with NSOperation/NSOperationQueue by cancelling all operations on the queue (representing old inputs) before adding a new one (representing the latest input).
Since NSManagedObject and NSManagedObjectContext aren't thread-safe, you probably want to pass the set of (the first few) NSManagedObjectIDs instead.
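The condition-variable approach described above is language-agnostic, so here is a minimal Python sketch of the same pattern: a background worker that always picks up only the most recent input and drops results that have become stale. All names are illustrative; the original answer uses NSCondition and performSelectorOnMainThread: in Objective-C.

```python
import threading

class LatestInputWorker:
    """Background worker that only ever processes the most recent input."""

    def __init__(self, work, on_result):
        self._cond = threading.Condition()
        self._latest = None
        self._has_input = False
        self._work = work            # the slow "fetch", run off the main thread
        self._on_result = on_result  # delivery back to the "main thread"
        self._stop = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, value):
        # Called from the UI side: overwrite any pending input and wake the worker.
        with self._cond:
            self._latest = value
            self._has_input = True
            self._cond.notify()

    def _run(self):
        while True:
            with self._cond:
                while not self._has_input and not self._stop:
                    self._cond.wait()
                if self._stop:
                    return
                value = self._latest
                self._has_input = False
            result = self._work(value)   # may be stale by the time it finishes
            with self._cond:
                if not self._has_input:  # deliver only if no newer input arrived
                    self._on_result(value, result)

    def stop(self):
        with self._cond:
            self._stop = True
            self._cond.notify()
        self._thread.join()
```

As the answer notes, the worker may still finish an out-of-date computation, but its result is simply discarded if a newer input has arrived in the meantime.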
I'm hoping someone can suggest a good technique for sorting Gmail threads by date without needing to get details on potentially thousands of threads.
Right now I use threads.list to get a list of threads, and I'm using whatever order they're returned in. That's mostly correct in that it returns threads in reverse chronological order. Except that chronology is apparently determined by the first message in a thread rather than the thread's most recent message.
That's fine for getting new messages at the top of the list but it's not good if someone has a new reply to a thread that started a few days ago. I'd like to put that at the top of the list, since it's a new message. But threads.list leaves it sorted based on the first message.
I thought the answer might be to sort threads based on historyId. That does sort by the most recent message. But it's also affected by any other thread change. If the user updates the labels on a message (by starring it, for example), historyId changes. Then the message sorts to the top even though it's not new.
I could use threads.get to get details of the thread, and do more intelligent sorting based on that. But users might have thousands of threads and I don't want to have to make this call for every one of them.
Does anyone have a better approach? Something I've missed?
I'm not a developer and have never used the API before, but I just read the API documentation and it doesn't seem to have the functionality you want.
Anyway, this is what I understood in your question:
What you want is to organize threads by the latest message in each one.
I thought you could use a combination of users.messages and threads.list. In users.messages you'll have the ThreadID:
threadId string The ID of the thread the message belongs to.
The method would be to use the dates from users.messages to order the latest messages from newest to oldest, then look up their parent threads by threadId, and display the threads ordered by their latest received message.
With this method you avoid a per-thread lookup for every thread, saving resources and time.
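The ordering described above can be sketched without touching the API at all. In this Python sketch, each message is a dict with "threadId" and "internalDate" fields (names modeled on the Gmail API's message resource; the sketch itself makes no API calls):

```python
def order_threads_by_latest(messages):
    """Return thread IDs ordered by each thread's most recent message."""
    # Newest message first, as a date-ordered message list would give you.
    newest_first = sorted(messages, key=lambda m: m["internalDate"], reverse=True)
    seen = set()
    ordered = []
    for m in newest_first:
        tid = m["threadId"]
        if tid not in seen:        # first hit per thread is its newest message
            seen.add(tid)
            ordered.append(tid)
    return ordered
```

A thread that just received a reply sorts to the top even if it started days ago, which is exactly the behavior the question asks for.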
I don't know how new messages are affected by labels or starring, you'll have to find out that.
I apologize in advance if my answer is incorrect or misleading.
I have a CQRS/ES application where some of the views are populated by events from multiple aggregate roots.
I have a CashRegisterActivated event on the CashRegister aggregate root and a SaleCompleted event on the Sale aggregate root. Both events are used to populate the CashRegisterView. The CashRegisterActivated event creates the CashRegisterView or sets it active in case it already exists. The SaleCompleted event sets the last sale sequence number and updates the cash in the drawer.
When two of these events arrive within milliseconds, the first update is overwritten by the last one. So that's a lost update.
I already have a few possible solutions in mind, but they all have their drawbacks:
Marshal all event processing for a view or for one record of a view on the same thread. This works fine on a single node, but once you scale out, things start to get complex. You need to ensure all events for a view are delivered to the same node. And you need to migrate to another node when it goes down. This requires some smart load balancer which is aware of the events and the views.
Lock the record before updating to make sure no other threads or nodes modify it in the meantime. This will probably work fine, but it means giving up on a lock-free system. Threads will sit there, waiting for a lock to be freed. Locking also means increased latency when I scale out the data store (if I'm not mistaken).
For the record: I'm using Java with Apache Camel, RabbitMQ to deliver the events and MariaDB for the view data store.
I have a CQRS/ES application where some of the views in the read model are populated by events from multiple aggregate roots.
That may be a mistake.
You can drive a process off of an isolated event, but composing a view normally requires a history, rather than a single event.
A more likely implementation would be to use the arrival of the events to mark the current view stale, and to use a single writer to update the view from the history of events produced by the aggregate(s) concerned.
And that requires a smart messaging solution. I thought "Smart endpoints and dumb pipes" would be a good practice for CQRS/ES systems.
It is. The endpoints just need to be smart enough to understand when they need histories, or when events are sufficient.
A view, after all, is just a snapshot. You take inputs (X.history, Y.history), produce a snapshot, write the snapshot into your view store (possibly with meta data describing the positions in the histories that were used), and you are done.
The events are just used to indicate to the writer that a previous snapshot is stale. You don't use the event to extend the history, you use the event to tell the writer that a history has changed.
You don't lose updates with multiple events, because the event itself, with all of its state, is captured in the history. It's the history that is used to build the event-sourced view.
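A minimal Python sketch of the stale-mark/single-writer pattern described above, using the event names from the question. The point is that an arriving event only flags a view stale; the single writer then rebuilds the snapshot from the full history, so two near-simultaneous events cannot overwrite each other. (Class and field names are illustrative, not from the original application.)

```python
class CashRegisterViewWriter:
    """Events mark views stale; one writer rebuilds them from history."""

    def __init__(self, load_history):
        self._load_history = load_history  # register_id -> ordered event list
        self._stale = set()
        self._views = {}

    def on_event(self, register_id):
        # The event's payload is NOT applied here; the event is only a
        # signal that this register's history has changed.
        self._stale.add(register_id)

    def refresh(self):
        # Single writer: rebuild each stale view from its full history.
        for register_id in list(self._stale):
            view = {"active": False, "cash": 0, "last_sale_seq": None}
            for event in self._load_history(register_id):
                if event["type"] == "CashRegisterActivated":
                    view["active"] = True
                elif event["type"] == "SaleCompleted":
                    view["cash"] += event["amount"]
                    view["last_sale_seq"] = event["seq"]
            self._views[register_id] = view
            self._stale.discard(register_id)
        return self._views
```

However many events arrive within the same few milliseconds, the snapshot is always computed from the complete history, so nothing is lost.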
Konrad Garus wrote
... handling events coming from a single source is easier, but more importantly because a DB-backed event store trivially guarantees ordering and has no issues with lost or duplicate messages.
A solution could be to detect when this situation happens, and do a retry.
To do this:
Add to each table an aggregate version column which is kept up to date
On each update statement, add "aggr_version = n - 1" to the WHERE clause (where n is the version of the event being processed)
When the result of the update statement is that no records were modified, it probably means that the event was processed out of order, and a retry strategy can be performed
The problem is that this adds complexity and is hard to test. The performance bottleneck is very likely in the database, so a single process with a failover solution will probably be the easiest solution.
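The version-check update above can be sketched with a few lines of SQL. This Python/sqlite3 sketch is illustrative only (the original uses MariaDB, and the schema and names here are made up): the UPDATE only succeeds if the stored row is at version n-1, and a zero rowcount triggers a retry.

```python
import sqlite3

def apply_event(conn, view_id, event_version, new_cash, max_retries=3):
    """Apply an event optimistically; reject it if it arrives out of order."""
    for _ in range(max_retries):
        cur = conn.execute(
            "UPDATE cash_register_view SET cash = ?, aggr_version = ? "
            "WHERE id = ? AND aggr_version = ?",
            (new_cash, event_version, view_id, event_version - 1),
        )
        conn.commit()
        if cur.rowcount == 1:
            return True        # predecessor was in place: event applied in order
        # rowcount == 0: predecessor not applied yet; retry (or requeue)
    return False
```

In a real system the retry would typically go back onto a delay queue rather than spinning in a loop, which is part of the complexity the answer warns about.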
Although I see you're asking how to handle these things at scale, I've seen people recommend using a single-threaded approach until such time as it actually becomes a problem, and then addressing it.
I would have a process manager per view model, draw the events you need from the store and write them single threaded.
I combined the answers of VoiceOfUnreason and StefRave into something I think might work. Populating a view from multiple aggregate roots feels wrong indeed. We have out-of-order detection with a retry queue, so an event on an aggregate root will only be processed when the last completely processed event is version n-1.
So when I create new aggregate roots for the views that would be populated by multiple aggregate roots (say aggregate views), all updates for the view will be synchronised without row locking or thread synchronisation. We have conflict detection with a retry mechanism on the aggregate roots as well, that will take care of concurrency on the command side. So if I just construct these aggregate roots from the events I'm currently using to populate the aggregate views, I will have solved the lost update problem.
Thoughts on this solution?
At the heart of it, my app will ask the user for a bunch of numbers, store them via Core Data, and then my app is responsible for showing the user the average of all these numbers.
So what I figure I should do is that after the user inputs a new number, I could fire up a new thread, fetch all the objects with an NSFetchRequest executed against my NSManagedObjectContext, do the proper calculations, and then update the UI on the main thread.
I'm aware that the rule for concurrency in Core Data is one thread per NSManagedObjectContext instance, so what I want to know is: do you think I can do what I just described without having my app explode 5 months down the line? I just don't think it's necessary to instantiate a whole new context just to do some measly calculations...
Based on what you have described, why not just store the numbers as they are entered into a CoreData model and also into an NSMutableArray? It seems as though you are storing these for future retrieval in case someone needs to look at (and maybe modify) a previous calculation. Under that scenario, there is no need to do a fetch after a current set of numbers is entered. Just use a mutable array and populate it with all the numbers for the current calculation. As a number is entered, save it to the model AND to the array. When the user is ready to see the average, do the math on the numbers in the already populated array. If the user wants to modify a previous calculation, retrieve those numbers into an array and work from there.
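The suggestion above amounts to keeping the numbers in memory alongside the persistent store, so the average never requires a fetch. A minimal Python sketch of that idea (the persistence call is stubbed out; in the real app it would be the Core Data save):

```python
class RunningAverage:
    """Mirror entered numbers in memory so averaging never hits the store."""

    def __init__(self):
        self.numbers = []

    def add(self, value):
        self._save_to_store(value)   # stand-in for the Core Data save
        self.numbers.append(value)   # mirror in memory for fast math

    def average(self):
        return sum(self.numbers) / len(self.numbers) if self.numbers else 0.0

    def _save_to_store(self, value):
        pass  # persistence is out of scope for this sketch
```

To revisit a previous calculation, you would load that set of numbers from the store into the list once, then work from memory as before.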
Bottom line is that you shouldn't need to work with multiple threads and merging Contexts unless you are populating a model from a large data set (like initial seeding of a phonebook, etc). Modifying a Context and calling save on that context is a very fast thing for such a small change as you are describing.
I would say you may want to do some testing, especially in regard to the size of the data set. If it is pretty small, the SQLite calls are pretty fast, so you may get away with doing it on the main queue. But if it is going to take some time, then it would be wise to get it off the main thread.
Apple introduced the concept of parent and child managed object contexts in 2011 to make using MO contexts on different threads easier. You may want to check out the WWDC videos on Core Data.
You can use NSExpression with your fetch to get really high-performance functions like min, max, average, etc. Here is a good link, and there are examples on SO:
http://useyourloaf.com/blog/2012/01/19/core-data-queries-using-expressions.html
Good luck!
Like the native iPhone Messages app, I want to code AcaniChat to return the last 50 messages sorted chronologically. Let's say there are 200 messages total in Core Data.
I know I can use fetchOffset=150 & fetchLimit=50 (Actually, do I even need fetchLimit in this case since I want to fetch all the way to the end?), but can I fetch the last 50 messages without first having to fetch the messages count? For example, with Redis, I could just set fetchOffset to -50.
Reverse the sort order, and grab the first 50.
EDIT
But then, how do I display the messages in chronological order? I'm
using an NSFetchedResultsController. – MattDiPasquale
That wasn't part of your question now, was it ;-)
Anyhow, the FRC is not used directly. Your view controller is asked to provide the information, and it then asks the FRC. You can do simple math to transform section/row to get the reverse order.
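The "simple math" is just mirroring the index. A one-line Python sketch of the transform (illustrative only; in the app this would live in the table view data source, applied to the row of the index path):

```python
def display_index_to_frc_index(row, total_rows):
    """Map a chronological display row to its index in a newest-first fetch.

    With the sort reversed and the first 50 grabbed, the oldest of those
    50 messages should appear at display row 0.
    """
    return total_rows - 1 - row
```

Applying the same transform twice is the identity, so the mapping works in both directions.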
You could also use a second array internally that has a copy of the objects in the FRC, but with a different sort ordering. That's simple as well.
More complex, but more "academically interesting" is using a separate MOC with custom fetch parameters.
However, before I went too far down either path, I'd want to know what's so wrong with querying the count of objects. It's actually quite fast.
Until I had proof from Instruments that it's the bottleneck that's killing my app, I'd push for the simplest solution possible.
I'm working on an app that uses Core Data and NSFetchedResultsController. A major component of the app is the filtering down items in an indexed table view based on a set of 15 or so pre-defined switches that correspond to a property or relationship of my Managed Objects. In most of my situations, I'm searching through a set of around 300-400 objects, so caching/performance is not an issue. Everything is very snappy with no caching required.
However, there is a part of my app that basically searches through all objects in my CD database (~15,000 items). Here, I'm trying to implement caching on the NSFetchedResultsController to improve performance. The cache name for the NSFetchedResultsController is simply the predicate's string value. Whenever the user toggles a filter switch, I create a new predicate, create a new NSFetchedResultsController, and set the cache name to the new predicate's string value. The first hit to get all the items (unfiltered) takes ~7 seconds, with subsequent hits taking less than one.
What's strange, though - and here's my problem - is that once I proceed to the 'next step' of the table view (I push a new view controller onto the nav controller, passing it a reference to the NSFetchedResultsController's fetchedObjects), performance drops considerably. This next view is essentially a different representation (a horizontally paging scroll view) of the previous view's table list, with one item on the screen at a time. When I page from one item to the next, accessing the previous or next object in the fetchedObjects array locks up the phone for about 5 seconds. The lock-up duration increases the further you go into the fetchedObjects array. If 'i == 0', there is no perceivable lag. If 'i == 10,000', it takes about 15 seconds to access the next object. Nuts! If I disable caching (or it's a query that wasn't cached, so it needed to pull fresh results), everything except for the initial filter query is fast and snappy with zero lag.
Does enabling caching ONLY cache indexing info for a table view and not the fetched objects themselves?
I'm not sure what the deal is here. I hope I explained this well enough - let me know if you want to see some code or need additional info.
Thanks!
Billy
Alright, I've found out what my problem was...
Basically, asking my NSFetchedResultsController for a managed object via objectAtIndexPath: is IMMENSELY faster than going directly to the fetchedObjects array and asking for objectAtIndex: (which, of course, is what I was doing), especially as your index gets into the thousands. I'm not 100% sure why that is, though. I'm guessing NSFetchedResultsController does some fancy stuff to efficiently pull out single objects rather than going straight to the raw data. So, I don't think the caching had anything to do with my performance issue.
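One plausible reason indexed access stays fast is batching: rows are pulled from the store a small page at a time instead of materializing the whole result set. This is a toy Python model of that idea, not Core Data's actual implementation (all names here are made up):

```python
class BatchedResults:
    """Toy model of batched, lazily-fetched results."""

    def __init__(self, fetch_batch, batch_size=20):
        self._fetch_batch = fetch_batch   # (offset, limit) -> list of rows
        self._batch_size = batch_size
        self._cache = {}                  # batch index -> list of rows

    def __getitem__(self, index):
        batch = index // self._batch_size
        if batch not in self._cache:
            self._cache[batch] = self._fetch_batch(
                batch * self._batch_size, self._batch_size)
        return self._cache[batch][index % self._batch_size]
```

Paging from item 10,000 to 10,001 then touches at most one small batch, instead of whatever work is needed to realize the full 15,000-element array.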
Thanks to those who checked out my question. I hope this helps anyone else having similar issues.