CoreData: efficient way to fetch and relate entities

I have a CoreData database full of objects that relate to each other.
EntityFoo - attribute: uniqueId, relationship: one-to-many to other EntityFoo objects, relationship: one-to-many to other EntityUnresolvedRelationship objects
EntityUnresolvedRelationship - attribute: uniqueId, inverseRelationship: back to EntityFoo object
When my iPhone application starts, I get the information for these objects from a web service. I parse the web service response and create the Foo entities. While downloading and parsing the data and putting the Foo objects into CoreData, I don't want to take the time to find the other EntityFoo objects that need to be related to each object. It is also likely that the Foo object to relate to does not exist yet, because it has not been downloaded and parsed. So I quickly create an UnresolvedRelationship object for the relationship and store the uniqueId in it, so that I can resolve the relationship later.
Now I am trying to figure out the most efficient way to walk through all of the Foo or UnresolvedRelationship objects and create the proper CoreData relationships between all of the Foo objects. In other words, in the end I want to have no UnresolvedRelationship objects...this was just temporary...the Foo objects will only have relationships to each other.
There could be 15,000 plus Foo objects.
Is there a good way to fetch all of the Foo objects and all of the UnresolvedRelationship objects so that I can walk one of the arrays and quickly find the matching entity in the other array by its uniqueId, and then set up the CoreData relationship?
Anyway, would love any pointers.

You should be doing this as part of parsing the data coming from the web service. That is going to be the most efficient way to do it. As it stands, you are guessing that the relationship lookup is going to be slow, and guessing is a bad approach to optimization.
Fetching the objects and linking them is not going to be slow unless you cause them to be realized (faulted) into memory. If you are faulting 15K objects, it is going to be slow anyway.
Set up your relationships while parsing, use autorelease pools effectively during parsing, and reset your context at consistent intervals to keep memory down. That is the best option.
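As a rough illustration of linking at parse time, the sketch below fetches the existing Foo objects once, indexes them by uniqueId in a dictionary, and links a newly parsed object to whatever targets are already known. The entity name "EntityFoo" and the attribute "uniqueId" come from the question; the relationship name "relatedFoos" and both function names are made up for this example, and ARC is assumed.

```objc
#import <CoreData/CoreData.h>

// Builds a uniqueId -> managed object lookup table with a single fetch.
static NSDictionary *FooLookupByUniqueId(NSManagedObjectContext *context, NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"EntityFoo"];
    NSArray *foos = [context executeFetchRequest:request error:error];
    if (foos == nil) {
        return nil; // fetch failed; *error describes why
    }
    NSMutableDictionary *lookup = [NSMutableDictionary dictionaryWithCapacity:foos.count];
    for (NSManagedObject *foo in foos) {
        NSString *uniqueId = [foo valueForKey:@"uniqueId"];
        if (uniqueId != nil) {
            lookup[uniqueId] = foo;
        }
    }
    return lookup;
}

// Links `foo` to every already-known object whose uniqueId is in `relatedIds`.
// Returns the ids that could not be resolved yet (their objects are not parsed).
static NSArray *LinkFoo(NSManagedObject *foo, NSArray *relatedIds, NSDictionary *lookup) {
    // "relatedFoos" is a hypothetical name for the Foo-to-Foo to-many relationship.
    NSMutableSet *related = [foo mutableSetValueForKey:@"relatedFoos"];
    NSMutableArray *unresolved = [NSMutableArray array];
    for (NSString *relatedId in relatedIds) {
        NSManagedObject *target = lookup[relatedId];
        if (target != nil) {
            [related addObject:target];
        } else {
            [unresolved addObject:relatedId];
        }
    }
    return unresolved;
}
```

Each lookup is a dictionary hit rather than a fetch, so the cost of linking stays roughly proportional to the number of relationships. Note that the dictionary holds managed objects from one context, so it would have to be rebuilt after resetting that context.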

Related

Perform join in core data

In my app I have two entities, User and Meetings. What I want is a list of Users who have meetings today.
Also, I haven't added a relationship between the two entities. Is there any way I can query both entities in a single fetch request, or is there some other way?
Please help me to solve this in best possible way
Thanks in advance

Core Data tries to map objects from the OOP world into tables and rows from the RDBMS world and back. This is called an object-relational mapper (ORM). Even though this looks very easy because the concepts seem similar, it is a difficult task. It has been called the "Vietnam of information technology".
However, at some point the two worlds do not fit together. This is called the object-relational impedance mismatch (ORIM). At that point one has to decide whether to go the OOP way or the RDBMS way. Resolving relationships is one of these points.
Core Data decided to do this the OOP way: relationships are treated as relationships between "usual" objects. This has two consequences:
1. You do not join anything. In OOP, objects are not joined, so in Core Data objects are not joined. (There are some features in a fetch request with dictionary results, but this is not the usual way to access data in Core Data.)
2. To do the job, Core Data needs to know the relationships between objects. You have to set the relationships.
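Once such a relationship exists, the "users with meetings today" list can be expressed as a single fetch on User whose predicate looks through the relationship. This is only a sketch: the to-many relationship name "meetings", the date attribute name "date", and the function name are assumptions, since the question does not describe the model in detail.

```objc
#import <CoreData/CoreData.h>

// Returns a fetch request for User objects that have at least one meeting
// on the calendar day containing `day`.
static NSFetchRequest *UsersWithMeetingsOnDay(NSDate *day) {
    NSCalendar *calendar = [NSCalendar currentCalendar];
    NSDate *start = [calendar startOfDayForDate:day];
    NSDate *end = [calendar dateByAddingUnit:NSCalendarUnitDay value:1 toDate:start options:0];

    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"User"];
    // SUBQUERY keeps both date bounds applied to the same meeting.
    // "meetings" and "date" are hypothetical property names.
    request.predicate = [NSPredicate predicateWithFormat:
        @"SUBQUERY(meetings, $m, $m.date >= %@ AND $m.date < %@).@count > 0", start, end];
    return request;
}
```

Executing the request with executeFetchRequest:error: on a managed object context would then return the matching users directly, without a second fetch on the meeting entity.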

How to deal with Core Data retain cycles

The core data guidelines recommend that you model your relationships with an inverse. No problems there.
Interestingly, though, if you load an object A that has a to-many relationship to B and walk the object graph, you end up with a retain cycle and the memory is never freed.
For a simple object graph you can just call refreshObject:mergeChanges: on A to re-fault the object so that relationships are no longer strong references.
If you have a complicated object graph, though, this is a pain because you need to call it on every object you have touched. It seems like a pretty important consideration when using Core Data, yet there is only one paragraph on this topic in Apple's documentation.
I am just wondering how other people handle this? A long running app would slowly just consume more and more memory without some sort of manual process to force objects to revert to faults.
Are there any known patterns for dealing with this? I'd imagine so, since lots of people use Core Data, but I just can't find any recommendations.

You are ignoring several aspects of Core Data when making your assertions. If you fetch an object A that has a one-to-many relationship to B, you also get all the B objects related to A. A one-to-many relationship creates the list of objects related to A and holds them in an NSSet property of your NSManagedObject subclass. Note that these objects are in a faulted state, and their memory footprint is insignificant. If you manipulate the objects in the relationship, Core Data will unfault them when necessary. You do not have to do anything to get this behavior. If you want to turn the objects back into faults yourself, you can use refreshObject:mergeChanges:. If you do not, the faulting behavior will be triggered again eventually.
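For the case where re-faulting by hand is wanted, one possible sketch is to walk the objects the context currently has registered and refresh each one, rather than tracking every object that was touched. The function name is made up for this example; skipping objects with unsaved changes is a design choice here, because refreshing with mergeChanges:NO discards them.

```objc
#import <CoreData/CoreData.h>

// Turns every unchanged object registered with `context` back into a fault,
// which also drops the strong references held through its relationships.
static void RefaultRegisteredObjects(NSManagedObjectContext *context) {
    // Iterate over a snapshot array rather than the context's live set.
    for (NSManagedObject *object in [context.registeredObjects allObjects]) {
        if (!object.hasChanges) { // keep unsaved inserts, edits and deletes intact
            [context refreshObject:object mergeChanges:NO];
        }
    }
}
```

A call such as RefaultRegisteredObjects(context) could be made after a save or in response to a memory warning. Whether it is needed at all is worth measuring first, as the answer above suggests.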

What is the most efficient way to remove all instances in an entity in Core Data?

I found from this post I can remove all instances of an entity by fetching them all and deleting them all.
Isn't there a more efficient way to do the removal? My concern is that I will have thousands of records in that entity.

There's no more efficient way, because CoreData is an ORM layer, not a database. Therefore you deal with objects and if you want them gone, you have to delete them.
A trick you may want to investigate is creating a parent object that would have a one-to-many relationship with the objects to delete. You could basically have only one of those that points to every entry in your big table. Set the cascade delete option on the relationship in your model. Then, when it comes time to purge, you just delete the parent object. Because of lazy loading, it won't try to load your other objects.
This being said, I haven't tried it myself, but it seems like a viable option.
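For reference, the plain fetch-and-delete approach from the question can be kept fairly cheap by asking the fetch not to load attribute values. A minimal sketch, assuming ARC; the function name is made up for this example.

```objc
#import <CoreData/CoreData.h>

// Deletes every object of the named entity and saves the context.
// Returns NO and fills *error if the fetch or the save fails.
static BOOL DeleteAllObjects(NSString *entityName, NSManagedObjectContext *context, NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    // Deletion only needs object identity, so skip loading property values.
    request.includesPropertyValues = NO;
    NSArray *objects = [context executeFetchRequest:request error:error];
    if (objects == nil) {
        return NO;
    }
    for (NSManagedObject *object in objects) {
        [context deleteObject:object];
    }
    return [context save:error];
}
```

On iOS 9 and later with an SQLite-backed store, NSBatchDeleteRequest is another option that removes rows without creating managed objects; it is not covered by this sketch.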

In a special case where all instances of this entity are self-contained, it would be quicker to delete the backing file and re-initialize the management objects. This only works if your data can be arranged so that the temporary stuff is within its own store.
Otherwise, you'd probably get better results by using direct database access instead of core data.

iPhone: How to manage Core Data relationships by foreign keys

I have an app working with databases on both the server side and the iOS client side, and I use HTTP services to sync between SQL Server on the server side and Core Data on the iPhone.
I have some Core Data objects like this:
ProductGroup
    Attributes:
        id
    Relationships:
        products
Product
    Attributes:
        id
        productGroupId
    Relationships:
        productGroup
Due to limitations of the server, I can't use incremental sync. When I sync my data, I have to (for example) delete all ProductGroup objects, get the response from the server, and then create new ones (and some old ones again).
The problem is, if I have a productA that belongs to productGroupB, I can usually do productA.productGroup, but after I delete productGroupB and create another one with the same content, the relationship is lost.
So I am wondering whether there is any way to manage relationships by FKs, like the Entity Framework in .NET, so that I can still find the object on the other end of the relationship after re-creating it.

You lose the relationship when you delete the ProductGroup objects because Core Data isn't SQL. In the case of relationships, Core Data cares nothing about the attributes of the object on the other side of the relationship, it just targets a specific object. You can have an arbitrary number of objects with the exact same attributes but different relationships and the objects will be completely distinct. A Core Data relationship is not an SQL join or key but a persisted pointer-like reference to a specific managed object. Delete the object and the pointer has to go as well.
To accomplish what you want, you could use a fetched property which would fetch on the Product.id attribute dynamically. However, that is a fairly clumsy way of doing things. You shouldn't have to resort to a fetched property in this instance.
I think you need to rethink your design. I have never seen a case where you had to delete every instance of an entity/class just to add or remove objects. As a practical matter, you can't actually do that in one go. Instead you have to fetch the objects and then delete them one by one. You might as well check whether each object needs to be deleted or updated while you are at it.
It sounds like you receive a great glob of SQL format data from the server and you think you have to build the object graph from scratch. You really shouldn't have to. You have to parse the data to create new ProductGroup objects anyway, so you should use the results of that parsing to alter the existing ProductGroup objects.
In pseudo-code it would look like:
Add a "synced" flag to ProductGroup entity in the data model
Set "synced" of every ProductGroup object to "false"
Extract data for a ProductGroup from server glob
Using data, fetch for an existing ProductGroup object
If extracted data matches an existing ProductGroup object
    update existing ProductGroup object
    set synced of ProductGroup object to true
else
    create new ProductGroup object with data
    set synced of new ProductGroup object to true
Delete all ProductGroup objects where synced == false
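The steps above could be sketched in Objective-C roughly as follows. Several things here are assumptions rather than part of the original answer: the server data is taken to be already parsed into an array of dictionaries whose keys match the ProductGroup attribute names (including "id"), existing objects are looked up through an in-memory dictionary instead of one fetch per record, and the function name is made up. ARC is assumed.

```objc
#import <CoreData/CoreData.h>

// Updates, creates and deletes ProductGroup objects so that the store matches
// `serverRecords`, reusing existing objects so their relationships survive.
static BOOL SyncProductGroups(NSArray *serverRecords, NSManagedObjectContext *context, NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"ProductGroup"];
    NSArray *existing = [context executeFetchRequest:request error:error];
    if (existing == nil) {
        return NO;
    }
    // Mark every existing object as not synced and index it by id.
    NSMutableDictionary *groupsById = [NSMutableDictionary dictionary];
    for (NSManagedObject *group in existing) {
        [group setValue:@NO forKey:@"synced"];
        id key = [group valueForKey:@"id"];
        if (key != nil) {
            groupsById[key] = group;
        }
    }
    // Update matching objects in place, create the ones that are missing.
    for (NSDictionary *record in serverRecords) {
        id key = record[@"id"];
        if (key == nil) {
            continue; // skip records without an id
        }
        NSManagedObject *group = groupsById[key];
        if (group == nil) {
            group = [NSEntityDescription insertNewObjectForEntityForName:@"ProductGroup"
                                                  inManagedObjectContext:context];
            groupsById[key] = group;
        }
        [group setValuesForKeysWithDictionary:record]; // assumes keys are attribute names
        [group setValue:@YES forKey:@"synced"];
    }
    // Anything still unsynced no longer exists on the server.
    for (NSManagedObject *group in existing) {
        if (![[group valueForKey:@"synced"] boolValue]) {
            [context deleteObject:group];
        }
    }
    return [context save:error];
}
```

Because a ProductGroup that still exists on the server is updated rather than recreated, a Product pointing at it keeps its productGroup relationship across the sync.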
The important thing to remember here is that you are dealing with objects and not tables, columns, rows or joins. People skilled in SQL often assume that Core Data is just an object wrapper around SQL. It is not. It is an object graph manager that may or may not use SQL far behind the scenes to persist (freeze dry) the object graph to disk.
You have to think in objects always. The intuitions you've developed for working with SQL are more likely to lead you astray than help you with Core Data.

Faulting a CoreData relationship when fetching the main entity

I have an entity with a number of to-many relationships. I present certain properties of the entity in a tableview, using a NSFetchedResultsController. Of all the relationships the entity has, the values of only 1 of the relationships are displayed (they are currently faulted in the cellforrowat... method). It seems to me that this could have a performance impact. Is it possible to fault a specific relationship at the time of creating the Fetch request, so that CoreData does not have to fetch the values when the table is being scrolled?

I'm not sure that I understand the data model you're describing. If you are only displaying members of one of your entity's to-many relationships as the content for the table's rows, then you can fetch only the properties on display in each of the visible rows using -setPropertiesToFetch: on your fetch request, like in the following example:
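// Fetch only the attributes that are shown in each row.
NSArray *propertiesToFetch = [[NSArray alloc] initWithObjects:@"title", @"thumbnailImage", nil];
[fetchRequest setPropertiesToFetch:propertiesToFetch];
[propertiesToFetch release]; // manual reference counting only; omit under ARC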
However, if what you're describing is a list of entities, with one of the displayed elements in the table row being from a to-one relationship, you can use -setRelationshipKeyPathsForPrefetching: like Barry suggests. However, in that case I'd suggest denormalizing your data model and moving that property from being within a relationship to being directly within the original entity. Traversing relationships is much more expensive than accessing properties.

First, I would not assume that the default Core Data behavior is less performant than your proposed approach: without data to back up your efforts, optimization is almost certainly going to go awry.
That said, I believe -[NSFetchRequest setRelationshipKeyPathsForPrefetching:] will accomplish what you want.
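A minimal sketch of a fetch request set up with relationship prefetching, for use with an NSFetchedResultsController or a plain fetch. The function name and the batch size of 20 are arbitrary choices for this example; the entity name and key paths are supplied by the caller, since the question does not name them. A sort descriptor would still need to be added before handing the request to an NSFetchedResultsController.

```objc
#import <CoreData/CoreData.h>

// Builds a request that loads the given relationships together with the
// fetched objects instead of faulting them in row by row.
static NSFetchRequest *PrefetchingRequest(NSString *entityName, NSArray *relationshipKeyPaths) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.relationshipKeyPathsForPrefetching = relationshipKeyPaths;
    request.fetchBatchSize = 20; // arbitrary; roughly a couple of screens of rows
    return request;
}
```

For a hypothetical "Article" entity with an "author" relationship, the call would be PrefetchingRequest(@"Article", @[@"author"]). As both answers point out, it is worth profiling before and after to confirm that the prefetch actually helps.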

You could manually fault in objects, but I don't think you'll gain anything. Whether you fault in all the objects at once, or you fault them in one at a time as needed, each object is still going to be faulted in individually.
I have written apps that do exactly what you describe, fault in a large amount of data to display in a table view, and have never noticed a performance penalty. Remember, only the objects that correspond to table view cells that will be displayed will be faulted in.
In general, I'd say don't try to outsmart Core Data. It's got years of performance optimizations in it at this point. While, intuitively, it may seem like faulting in 100 objects would require 100 database queries, this is not necessarily the case.
