Core Data get sum of values. Fetched properties vs. propagation - iphone

I'm relatively new to Core Data (coming from an SQLite background). Just finished reading the 'Core Data for iOS' book but I'm left with a number of baffling questions when I started modelling an app which has the following model:
'Accounts' entity which has a to-many 'transactions' relationship and a 'startingBalance' property
'Transaction' entity which has a to-many 'payments' relationship (and an inverse to Accounts)
'Payment' entity which contains details of the actual 'amount' paid
For performance reasons I wanted to de-normalize the model and add a 'TotalAmountSpent' property in the 'Accounts' entity (as suggested by the book) so that I could simply keep updating that when something changed.
In practice this seems difficult to achieve with Core Data. I can't figure out how to do this properly (and don't know what the right way is). So my questions are:
a) Should I change the 'TotalAmountSpent' to a Fetched Property instead? Are there performance implications (I know it's loaded lazily but I will almost certainly be fetching that property for every account). If I do, I need to be able to get the total amount spent against the 'startingBalance' for a given period of time (such as the last three days). This seems easy in SQL but how do I do this in Core Data? I read I can use a @sum aggregate function but how do I filter on 'date' using @sum? I also read any change in the data will require refreshing the fetched property. How do I 'listen' for a change? Do I do it in 'Payment' entity's 'willSave' method?
b) Should I use propagation and manually update 'TotalAmountSpent' each time a new payment gets added to a transaction? What would be the best place to do this? Should I do it in an overridden NSManagedObject's 'willSave' method? I'm then afraid it'll be a nightmare to update all corresponding transactions/payments if the 'startingBalance' field was updated on the account. I would then have to load each payment and calculate the total amount spent and the final balance on the account. That is scary if there are thousands of payments.
Any guidance on the matter would be much appreciated. Thanks!

If you use a fetched property you cannot then query on that property easily without loading the data into memory first. Therefore I recommend you keep the actual de-normalized data in the entity instead.
There are actually a few ways to easily keep this up to date.
In your -awakeFromFetch/-awakeFromInsert, set up an observer of the relationship that will impact the value. Then, when the KVO (Key-Value Observing) notification fires, you can do the calculation and update the field. Learning KVC and KVO is a valuable skill.
You can override -willSave in the NSManagedObject subclass and do the calculation on the save. While this is easier, I do not recommend it since it only fires on a save and there is no guarantee that your account object will be saved.
In either case you can do the calculation very quickly using the KVC Collection Operators. With the collection operators you can do the sum via a call to:
NSNumber *sum = [self valueForKeyPath:@"transactions.@sum.startingBalance"];
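In the model from the question, 'amount' lives on Payment, two levels below the account, and the question also asks about restricting the sum to a date range. A minimal sketch of how the collection operators could cover both, using plain dictionaries as stand-ins for managed objects. The function name and the 'date' key are assumptions for illustration, not part of the original answer:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: sums `amount` over every payment of every transaction,
// optionally keeping only payments dated on or after `since`.
// `transactions` is a set whose elements expose a `payments` key holding an
// NSSet (managed objects in a real app, plain dictionaries in this sketch).
static NSNumber *TotalAmountSpent(NSSet *transactions, NSDate *since) {
    // Flatten the nested to-many relationship into one set of payments.
    // Managed objects compare by identity, so no payment is merged away;
    // plain dictionaries with identical contents would be.
    NSSet *payments = [transactions valueForKeyPath:@"@distinctUnionOfSets.payments"];
    if (since != nil) {
        // Assumes each payment has a `date` attribute.
        NSPredicate *recent = [NSPredicate predicateWithFormat:@"date >= %@", since];
        payments = [payments filteredSetUsingPredicate:recent];
    }
    // @sum returns 0 for an empty set.
    return [payments valueForKeyPath:@"@sum.amount"];
}
```

Inside an account subclass this could be called with the account's 'transactions' set and the result stored in 'TotalAmountSpent'. Note that it walks the payments in memory, which is the cost the de-normalized attribute is meant to avoid on every read.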

Related

Is it possible to fetch the data of only one attribute of an Entity in Core Data?

I have an entity in Core Data with multiple attributes. In order to increase the performance of the app, I would like to fetch only one attribute of that entity. Is that possible to do and if so, then how? Or should I just use predicates to fetch the entities that I need and from them access the values of their attributes? Thanks.
It depends on a few things: how many entities are you fetching, do you ever want anything else, and what is your real performance problem?
First of all, use Instruments to make sure that your problem is actually where you think it is. Core Data uses faulting and batching to make it very memory and performance efficient. An entity's attribute data is not brought into memory until it is accessed.
If you really want to fetch only a single attribute from your entities, you can make a fetch request with the propertiesToFetch value set to the attributes you care about. If you do this with a managed object resultType then, as far as I know, this will use more memory, as it will make all the result objects partial faults (with those properties populated) rather than full faults.
If you use the dictionary resultType, then you'll get back no managed objects at all, just an array of dictionaries with the relevant attribute populated.
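A minimal sketch of the dictionary-result request described above. The helper name is hypothetical, and the entity and attribute names are supplied by the caller rather than taken from any real model:

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: builds a request that returns one NSDictionary per
// row, containing only the requested attribute, instead of managed objects.
static NSFetchRequest *SingleAttributeFetchRequest(NSString *entityName,
                                                   NSString *attributeName) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.resultType = NSDictionaryResultType;
    request.propertiesToFetch = @[attributeName];
    return request;
}

// Executing it, assuming `context` is an existing NSManagedObjectContext:
//   NSError *error = nil;
//   NSArray *rows = [context executeFetchRequest:request error:&error];
//   NSArray *values = [rows valueForKey:attributeName];
```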
You can fetch a single property; Apple's documentation covers how to do this.

Core Data Fetch

I have an entity, and I want to fetch a certain attribute.
For example,
Let's say I have an entity called Food, with multiple attributes. I want to select all categories, which is an attribute on each food item. What's the best way to accomplish this in Core Data?
Just run your fetch request and then use valueForKey: to extract all of the attribute values. If your model contains lots of objects, you can set the fetch limit and offset (and sort descriptor) to page through the items. When doing this you should also set the fetch request to not return objects as faults.
Just remembered there is an alternative. You can set the properties to fetch and then set the result type to NSDictionaryResultType. You still need to do the iteration but this will return the minimum data possible.
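As an illustration of the valueForKey: step, here is a small sketch that pulls the 'category' values out of an array of results and removes duplicates with a KVC collection operator. Dictionaries stand in for fetched Food objects, and the function name is made up for this example:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns each distinct `category` once, sorted.
// `foods` can hold managed objects or the dictionaries returned by a
// request that uses NSDictionaryResultType.
static NSArray *DistinctCategories(NSArray *foods) {
    // [foods valueForKey:@"category"] would return every value, duplicates
    // included; the operator below collapses them. Its result order is not
    // defined, so sort before returning.
    NSArray *unique = [foods valueForKeyPath:@"@distinctUnionOfObjects.category"];
    return [unique sortedArrayUsingSelector:@selector(compare:)];
}
```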
EDIT: I think I misunderstood your question. It seems that you only want to fetch a property of an object, not the object itself (e.g. the attribute but not the entity)? I don't believe Core Data is going to work that way...it's an object graph rather than a database, as the person above mentioned. Research how Core Data "faults", automatically retrieving dependent objects as they are needed. I left the advice below in case it still applies, though I'm not sure it will.
You can add a predicate to your search in order to fetch only objects which meet certain criteria; it works like an "if" statement for fetching. Here's Apple's documentation:
https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSPredicate_Class/Reference/NSPredicate.html
And a tutorial:
http://www.peterfriese.de/using-nspredicate-to-filter-data/
All that said, the necessity really depends on how many objects you're fetching. If it's not causing any performance hit, there's not necessarily anything wrong with fetching a few un-needed objects. Solve problems, in other words--don't "optimize" things that were working fine. If your model contains a ton of objects, though, it could be expensive to fetch them all and you would want to use a predicate.
You can only fetch whole objects, but you can restrict the fetch to objects that have a particular attribute value by using an NSPredicate in your fetch request. See the code snippet library in Xcode: search for "fetch request" in the find field and drag out the snippet with the NSPredicate code. You can then set the predicate so that only objects satisfying it are returned. Hope this helps!
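A short sketch of the predicate approach, evaluated here against an in-memory array so it stands alone. The function name and the 'category' key are assumptions carried over from the Food example in the question:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps only the items whose `category` equals the
// given value. Works on managed objects or plain dictionaries.
static NSArray *FoodsInCategory(NSArray *foods, NSString *category) {
    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"category == %@", category];
    return [foods filteredArrayUsingPredicate:predicate];
}
```

The same NSPredicate can be assigned to a fetch request's predicate property, in which case the filtering happens in the store and the non-matching objects are never loaded.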

Using NSFetchedResultsController Without UITableView

Is it wrong to use an NSFetchedResultsController purely for data management, i.e., without using it to feed a UITableView?
I have a to-many relationship in a Core Data iPhone app. Whenever data in that relationship changes, I need to perform a calculation which requires that data to be sorted. In Apple's standard Department/Employees example, this would be like determining the median salary in a given Department. Whenever an Employee is added to or removed from that Department, or an Employee's salary changes, the median calculation would need to be performed again.
Keeping data sorted and current and getting notifications when it changes sounds like a great job for NSFetchedResultsController. The only "problem" is that I'm not using a UITableView. In other words, I'm not displaying sorted Employees in a UITableView. I just want an up-to-date sorted array of Employees so I can analyze them behind the scenes. (And, of course, I don't want to write a bunch of code that duplicates much of NSFetchedResultsController.)
Is it a bad idea to use an NSFetchedResultsController purely for data management, i.e., without using it to feed a UITableView? I haven't seen this done anywhere, and thought I might be missing something.
I would not call it bad but definitely "heavy".
It would use less memory and CPU to watch for saves via NSManagedObjectContextDidSaveNotification and do the calculation there. The notification's userInfo carries three NSSet instances (the inserted, updated, and deleted objects), and you can run a simple NSPredicate against those sets to see whether any employee that you care about has changed, then respond.
This is part of what the NSFetchedResultsController does under the covers. However you would be avoiding the other portions of the NSFetchedResultsController that you don't care about or need.
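A minimal sketch of that check, written as a free function so it can be exercised without a Core Data stack. The function name is hypothetical; the three userInfo keys are the ones Core Data defines for this notification:

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: returns YES when any inserted, updated, or deleted
// object reported by a did-save notification matches `predicate`.
static BOOL SaveTouchedMatchingObject(NSNotification *note, NSPredicate *predicate) {
    NSArray *keys = @[NSInsertedObjectsKey, NSUpdatedObjectsKey, NSDeletedObjectsKey];
    for (NSString *key in keys) {
        // Each key maps to an NSSet; a missing key yields nil and a count of 0.
        NSSet *changed = note.userInfo[key];
        if ([changed filteredSetUsingPredicate:predicate].count > 0) {
            return YES;
        }
    }
    return NO;
}

// Registration, assuming `context` and `department` already exist:
//   [[NSNotificationCenter defaultCenter]
//       addObserverForName:NSManagedObjectContextDidSaveNotification
//                   object:context queue:nil
//               usingBlock:^(NSNotification *note) {
//       NSPredicate *p = [NSPredicate predicateWithFormat:@"department == %@", department];
//       if (SaveTouchedMatchingObject(note, p)) { /* recalculate the median */ }
//   }];
```

One caveat to verify in your own app: deleted objects may no longer have their relationships set by the time the notification arrives, so a relationship-based predicate could miss them.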
Heavy
NSFetchedResultsController does more processing than just watch for saved objects. It handles deltas, makes calls to its delegates, etc. I am not saying it is bad in any way shape or form. What I am saying is that if you only care about when objects have changed in your relationship, you can do it pretty easily by just watching for the notifications.
Memory
In addition, there is no reason to retain anything, since you are already holding onto the "Department" entity and can therefore access its relationships. Holding onto the child objects "just in case" is a waste of memory. Let Core Data manage the memory; that is part of the reason for using it.
There's nothing wrong with using NSFetchedResultsController without a view. Your use case sounds like a good reason to not re-invent the wheel.
To me, this sounds like an appropriate use of NSFetchedResultsController. It might be a bit overkill, as its primary use is to help populate table views and keep them up to date, but if you are willing to put up with the added complexity, there is no reason not to use it this way. Correct use of notifications would be the other method, and I would estimate it is just as complex.

iphone SDK: Arbitrary tableview row reordering with core data

What is the best way to implement arbitrary row reordering in a tableview that uses core data? The approach that seems obvious to me is to add a rowOrder attribute of type Int16 to the entity that is being reordered in the tableview and manually iterate through the entity updating the rowOrder attributes of all the rows whenever the user finishes reordering.
That is an incredibly inelegant solution though. I'm hoping there is a better approach that doesn't require possibly hundreds of updates whenever the user reorders things.
If the ordering is something that the data model should model and store, then the ordering should be part of the entity graph anyway.
A good, lightweight solution is to create an Order entity that has a one-to-one relationship to the actual entity being ordered. To make updating easy, create a linked-list like structure of the objects. Something like this:
Order{
order:int;
orderedObject<--(required,nullify)-->OrderObject.order
previous<--(optional,nullify)-->Order.next;
next<--(optional,nullify)-->Order.previous;
}
If you create a custom subclass, you can provide an insert method that inserts a new object in the chain, then sends a message down the next relationships telling each object to increment its order by one and pass the message on to its own next. A delete method does the opposite. That makes the ordering integral to the model and nicely encapsulated. It's easy to make a base class for this so you can reuse it as needed.
The big advantage is that it only requires the small Order objects to be alive in memory.
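A rough sketch of the insert method on such a chain, using a plain NSObject subclass in place of a managed object so it stands alone. The class and method names are invented for this example, and the shift is written as a loop rather than the recursive message passing described above:

```objc
#import <Foundation/Foundation.h>

// Plain-object stand-in for the Order entity (assumes ARC).
@interface OrderNode : NSObject
@property (nonatomic) NSInteger order;
@property (nonatomic, strong) id orderedObject;
@property (nonatomic, weak) OrderNode *previous;
@property (nonatomic, strong) OrderNode *next;
- (void)shiftOrderBy:(NSInteger)delta;
- (void)insertAfter:(OrderNode *)node;
@end

@implementation OrderNode
// Adds `delta` to this node's order and to every node after it.
- (void)shiftOrderBy:(NSInteger)delta {
    for (OrderNode *n = self; n != nil; n = n.next) {
        n.order += delta;
    }
}

// Splices the receiver into the chain directly after `node` and renumbers
// the nodes that now follow it.
- (void)insertAfter:(OrderNode *)node {
    OrderNode *following = node.next;
    self.previous = node;
    self.next = following;
    node.next = self;
    following.previous = self;
    self.order = node.order + 1;
    [following shiftOrderBy:1];
}
@end
```

An insert still touches every node after the insertion point, but only the small order objects, not the entities they point to.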
Edit:
Of course, you can extend this with another linked object to provide section information. Just relate that entity to the Order entity then provide the order number as the one in the section.
There is no better way and that is the accepted solution. Core Data does not have row ordering internally so you need to do it yourself. However it is really not a lot of code.
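For the plain rowOrder approach, a sketch of the renumbering step. Mutable dictionaries stand in for the managed objects, and the function name is hypothetical. It only rewrites the rows between the source and destination positions, which are the only ones whose order changes:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: moves the row at index `from` so that it ends up at
// index `to`, then rewrites `rowOrder` for the affected range.
// Both indexes must be valid for `rows`.
static void MoveRow(NSMutableArray<NSMutableDictionary *> *rows,
                    NSUInteger from, NSUInteger to) {
    NSMutableDictionary *moved = rows[from];
    [rows removeObjectAtIndex:from];
    [rows insertObject:moved atIndex:to];

    // Rows outside [first, last] keep their previous rowOrder.
    NSUInteger first = MIN(from, to);
    NSUInteger last = MAX(from, to);
    for (NSUInteger i = first; i <= last; i++) {
        rows[i][@"rowOrder"] = @(i);
    }
}
```

With managed objects the assignment would be a setValue:forKey: call (or a property setter) followed by one save of the context.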

DDD: Persisting aggregates

Let's consider the typical Order and OrderItem example. Assuming that OrderItem is part of the Order Aggregate, it can only be added via Order. So, to add a new OrderItem to an Order, we have to load the entire Aggregate via the Repository, add a new item to the Order object and persist the entire Aggregate again.
This seems to have a lot of overhead. What if our Order has 10 OrderItems? This way, just to add a new OrderItem, not only do we have to read 10 OrderItems, but we also have to re-insert all 10 of them. (This is the approach that Jimmy Nilsson has taken in his DDD book. Every time he wants to persist an Aggregate, he clears all the child objects and then re-inserts them. This can cause other issues, as the IDs of the children change every time because of the IDENTITY column in the database.)
I know some people may suggest to apply Unit of Work pattern at the Aggregate Root so it keeps track of what has been changed and only commit those changes. But this violates Persistence Ignorance (PI) principle because persistence logic is leaking into the Domain Model.
Has anyone thought about this before?
Mosh
This doesn't have to be a problem; some ORMs support lazy lists.
e.g.
You could load the order entity and add items to the Details collection w/o actually materializing all of the other entities in that list.
I think N/Hibernate supports this.
If you are writing your own entity persistence code w/o any ORM, then you are pretty much out of luck, you would have to re-implement the same dirty tracking machinery as ORMappers give you for free.
The entire aggregate must be loaded from the database because DDD assumes that aggregate roots ensure consistency within the boundaries of their aggregates. For these rules to be checked, all necessary data must be loaded. If there is a requirement that an order can be worth no more than $100000 for a particular customer, the aggregate root (Order) must check this rule before persisting changes. This does not imply that all the existing items must be loaded and their values summed up. Order can maintain a pre-calculated sum of existing items which is updated when new ones are added. This way, checking the business rule requires only the Order data to be loaded when adding new items.
I'm not 100% sure about this approach, but I think applying the Unit of Work pattern could be the answer. Keeping in mind that any transaction should be done in application or domain services, you could populate the unit of work object with the objects from the aggregate that you have changed. After that, let the UoW object do the magic (of course, building a proper UoW might be hard in some cases).
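A sketch of an aggregate root that keeps a pre-calculated total, so the limit can be checked without loading existing items. The class name, the initializer, and the 'pendingItemPrices' property are assumptions made for this example:

```objc
#import <Foundation/Foundation.h>

// Hypothetical aggregate root: loaded with only its stored total, never
// with its existing items (assumes ARC).
@interface PurchaseOrder : NSObject
@property (nonatomic, readonly) NSDecimalNumber *total;
@property (nonatomic, readonly) NSArray<NSDecimalNumber *> *pendingItemPrices;
- (instancetype)initWithTotal:(NSDecimalNumber *)total
                        limit:(NSDecimalNumber *)limit;
- (BOOL)addItemWithPrice:(NSDecimalNumber *)price;
@end

@implementation PurchaseOrder {
    NSDecimalNumber *_limit;
    NSMutableArray<NSDecimalNumber *> *_pending;
}

- (instancetype)initWithTotal:(NSDecimalNumber *)total
                        limit:(NSDecimalNumber *)limit {
    if ((self = [super init])) {
        _total = total;
        _limit = limit;
        _pending = [NSMutableArray array];
    }
    return self;
}

- (NSArray<NSDecimalNumber *> *)pendingItemPrices {
    return [_pending copy];
}

// Enforces the limit using the running total alone. Returns NO and leaves
// the order unchanged when the new item would exceed the limit.
- (BOOL)addItemWithPrice:(NSDecimalNumber *)price {
    NSDecimalNumber *candidate = [_total decimalNumberByAdding:price];
    if ([candidate compare:_limit] == NSOrderedDescending) {
        return NO;
    }
    _total = candidate;
    [_pending addObject:price]; // only new items need to be inserted on save
    return YES;
}
@end
```

On save, a repository would then write the updated total and insert only the pending items.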
Here is a description of the Unit of Work pattern:
A Unit of Work keeps track of everything you do during a business transaction that can affect the database. When you're done, it figures out everything that needs to be done to alter the database as a result of your work.
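A very small sketch of what such a tracker could look like when it lives outside the domain objects and is populated by an application service. All names here are hypothetical, and the persistence calls are passed in as blocks so nothing is assumed about the database layer:

```objc
#import <Foundation/Foundation.h>

typedef void (^PersistBlock)(id object);

// Hypothetical tracker; objects are compared by identity (assumes ARC).
@interface UnitOfWork : NSObject
- (void)registerNew:(id)object;
- (void)registerDirty:(id)object;
- (void)registerRemoved:(id)object;
- (void)commitInserting:(PersistBlock)insert
               updating:(PersistBlock)update
               deleting:(PersistBlock)remove;
@end

@implementation UnitOfWork {
    NSMutableArray *_added, *_dirty, *_removed;
}

- (instancetype)init {
    if ((self = [super init])) {
        _added = [NSMutableArray array];
        _dirty = [NSMutableArray array];
        _removed = [NSMutableArray array];
    }
    return self;
}

static BOOL Holds(NSArray *list, id object) {
    return [list indexOfObjectIdenticalTo:object] != NSNotFound;
}

- (void)registerNew:(id)object {
    if (!Holds(_added, object)) [_added addObject:object];
}

- (void)registerDirty:(id)object {
    // A new object is written in full by its insert, so it needs no update.
    if (!Holds(_added, object) && !Holds(_dirty, object)) [_dirty addObject:object];
}

- (void)registerRemoved:(id)object {
    [_dirty removeObjectIdenticalTo:object];
    if (Holds(_added, object)) {
        [_added removeObjectIdenticalTo:object]; // never persisted, nothing to delete
    } else if (!Holds(_removed, object)) {
        [_removed addObject:object];
    }
}

// Applies only the recorded changes, then forgets them.
- (void)commitInserting:(PersistBlock)insert
               updating:(PersistBlock)update
               deleting:(PersistBlock)remove {
    for (id object in _added) insert(object);
    for (id object in _dirty) update(object);
    for (id object in _removed) remove(object);
    [_added removeAllObjects];
    [_dirty removeAllObjects];
    [_removed removeAllObjects];
}
@end
```

Because the service registers the changes, the Order and OrderItem classes themselves stay unaware of persistence.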