Can Core Data be used for objects with variable schemas?

I'm implementing a new iPhone app and am relatively new to Cocoa development overall. I am at the stage of choosing how the persistence layer of this app will work, and it looks like I'm basically choosing between Core Data and sqlite3.
The persisted models in this app are intended to have a schema that is loaded at runtime (from some kind of definition file, probably XML). By which I mean, this app is intended to have objects that are user-definable to some extent: e.g. the Customer type (which has certain built-in fields like "name" and "email") can be modified to have extra fields based on the user's specific needs (e.g. a user might want to add a "favourite fruit" field to their Customer type).
Having said that, will Core Data work for an app with a non-baked-in data model like this? I've just started playing around with the Core Data model designer thing in Xcode, and it seems like it wants to work with objects that have fixed fields that are compiled in.
I'm definitely trying to take the path of least resistance here, and I can see the benefits of using an Apple-supplied data framework, but don't want to start down that path if it's going to lock me into a data model that's defined at compile time.

The Core Data data model needs to be defined at compile time, but that does not mean you can't allow for custom fields to be added and used by end users.
It just means that you would define an entity for custom fields and create the fields as objects.
It is best to design a data model that meets your needs rather than think of how you would solve the problem in SQL.
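To make that concrete, here is a minimal Objective-C sketch of the idea, assuming a hypothetical model with a Customer entity related to a CustomField entity (with "name" and "value" string attributes, a to-one "customer" relationship, and an inverse to-many "customFields" relationship); all of those names are illustrative, not part of Core Data:

    #import <CoreData/CoreData.h>

    // Attach a user-defined field (e.g. "favourite fruit") to a customer.
    // The compiled-in model stays fixed; only the data varies.
    static void AddCustomField(NSManagedObjectContext *context,
                               NSManagedObject *customer,
                               NSString *name, NSString *value)
    {
        NSManagedObject *field =
            [NSEntityDescription insertNewObjectForEntityForName:@"CustomField"
                                          inManagedObjectContext:context];
        [field setValue:name     forKey:@"name"];
        [field setValue:value    forKey:@"value"];
        [field setValue:customer forKey:@"customer"]; // inverse of "customFields"

        NSError *error = nil;
        if (![context save:&error]) {
            NSLog(@"Save failed: %@", error);
        }
    }

    // Read a user-defined field back by name.
    static NSString *CustomFieldValue(NSManagedObject *customer, NSString *name)
    {
        for (NSManagedObject *field in [customer valueForKey:@"customFields"]) {
            if ([[field valueForKey:@"name"] isEqualToString:name]) {
                return [field valueForKey:@"value"];
            }
        }
        return nil;
    }

With this shape, the schema loaded from your XML definition file only dictates which CustomField rows exist for each type; the Core Data model itself never changes at runtime.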

Related

Entity Framework 5 with an existing DB: use generated POCOs? Move POCOs into their own project?

I have a project with an existing database which was initially created for a legacy application. It works fine, but over time quite a few of the tables/fields have fallen out of use or become under-utilized; the historical data MAY be useful someday, though, so they're not going anywhere.
Enter 2012 ('13) and Entity Framework 5, an ORM with built-in POCO generation (nice add!). So bang.. get a connection to the Oracle database, gen up a context and some POCOs.. suh-weet!! But wait.. my POCOs aren't really the POCOs I would like to deal with... There's a bunch of fields which I don't need anymore (not to say I'll NEVER need them, but I can't know for sure), so now I've got these POCOs which are basically bloated table mappers... So what should I do?
I see a few solutions here..
1) I could use them as-is and only touch the fields that I need.
2) I could get into the model surface and start axing the unused fields.
3) Take the "code-first" approach and tie the objects into the existing DB; it's a large DB though (I'm pretty sure this is possible, right?).
4) Create my own POCOs/DTOs in their own model project, and these would essentially become my "domain model", but the mapping back into the context could be painful..
Lastly, do these POCOs/DTOs need to be in their own project? What is there REALLY to gain? Seeing things like "YAGNI", I feel like it can sit right under the .edmx and never bother anyone..
On a side note, I will be needing a few of these via JSON too, so serializability needs to be considered.
Can I just partial-class the generated POCOs and only attribute the properties I'll be needing?
Anyhow, it'd be great to hear from past experience, or thoughts on the matter..
I could see this being on Programmers, but I figured I'd start it here.
We have a very similar situation: a large legacy DB2 database, of which we need small portions of specific tables for our applications.
To do this we used Entity Framework code-first models for the relevant subsections of data we were interested in. This meant we could do a few important things:
remove irrelevant data from the model to make code more discoverable
rename fields inside our model and map them to names that make sense in the app rather than existing column names
reduce the volume of data pulled back by queries (i.e. our selects don't grab all the extra bits)
where two formats of data exist, use the modern standard rather than the historical format
This works out really well for us, however a couple of things to note:
if you are writing, make sure you include all required fields in the model
you can generate your code-first classes, but you will have to trim them a bit
generating from non-MSSQL databases can sometimes be trickier
In terms of JSON serialisation, we do this too; however, we use a different model for it and use AutoMapper to translate. You should in most cases be able to serialise without needing to add extra attributes, but if they are required you can just add them to your POCOs alongside any EF attributes.

Need some advice concerning MVVM + Lightweight objects + EF

We are developing a back-office application with a quite large DB.
It's not reasonable to load everything from the DB into memory, so when a model's properties are requested we read them from the DB (via EF).
But many of our UIs are just simple lists of entities with some (!) properties presented to the user.
For example, we just want to show Id, Title and Name.
And later, when the user selects an item and wants to perform some action, the whole object is needed. Right now we have the list of items stored in memory.
Some properties contain large texts, images or other data.
EF works with entities and reading a bunch of large objects degrades performance notably.
As far as I understand, the problem can be solved by creating lightweight entities and using them in appropriate context.
First, I'm afraid that each view will force us to create a new lightweight entity, and we will eventually end up with a bloated object context.
Second, as the Model wraps EF, we need to provide methods for the various entities.
Third, ViewModels communicate and pass entities to each other.
So I'm stuck with all these considerations and need good architectural design advice.
Any ideas?
For images and large texts you may consider table splitting, which is commonly used to split a table into a lightweight entity and a "heavy" entity.
But I think what you call lightweight "entities" are data transfer objects (DTOs). These are not supplied by the context (so it won't get bloated) but by projection from entities, which is done in a repository or service.
For projection you can use AutoMapper, especially its newer projection feature that I describe here. This allows you to reduce the number of methods you need to provide "for various entities" (DTOs), because the type to project to can be given in a generic type parameter.

Is core data implementing data mapper pattern?

I know that Core Data should not be considered an ORM, but it still offers functionality similar to an ORM's. Just curious: is it implementing the Data Mapper pattern? I know "The Data Mapper is a layer of software that separates the in-memory objects from the database. Its responsibility is to transfer data between the two and also to isolate them from each other." (Martin Fowler). IMHO the managed object context handles all the SQL stuff in one transaction, so it's a very performance-wise design, and IMHO Core Data might be considered to implement the Data Mapper pattern.
One year later, I will contribute my two cents.
I am not an ORM expert and only recently started something using a Data Mapper, but as a long-time Core Data user I can say that it is not. The main objective of this pattern is a clean separation of the domain objects from all database-related operations.
Once I started writing unit tests, the first thing I noticed is that I must load a database, even if it is just some in-memory store; I do have to load one. Also, there are no per-class mappers, so I have no control over how each relation is stored.
Core Data loads lots of meta-information about your object graph and forces some structure on it. Although you can swap the persistent store and bake something of your own, you will have lots of restrictions on how to do it, with a clear "relational" feeling to it.
The idea is good; we might say it is some variation of it. Something that I do love is that the save operation is done by the context, not by the object itself. So there is some type of separation.
However, look at functions like awakeFromFetch or didSave: both operations are tied to the data store, not to a plain domain object. A proper Data Mapper pattern would allow you to define those operations for each persistent store, not unify them in a single object.
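To see the coupling concretely, here is a minimal, hypothetical NSManagedObject subclass (the Animal class and name attribute are made up for illustration; the overrides are real Core Data hooks):

    #import <CoreData/CoreData.h>

    @interface Animal : NSManagedObject
    @property (nonatomic, strong) NSString *name;
    @end

    @implementation Animal
    @dynamic name;

    // Invoked by Core Data when the object is fetched from the store:
    // persistence lifecycle concerns live on the "domain" class itself.
    - (void)awakeFromFetch
    {
        [super awakeFromFetch];
        NSLog(@"%@ was just fetched from the persistent store", self.name);
    }

    // Invoked by Core Data after a save; again, store-related behaviour
    // is defined on the object rather than on a separate mapper layer.
    - (void)didSave
    {
        [super didSave];
        NSLog(@"%@ was just saved", self.name);
    }
    @end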
UPDATE:
Funnily enough, one day after my answer I had to deal with an old Core Data-based project and had to come back to improve this answer. To make things clear: I do consider that "seems like a pattern" is not enough. For example, the implementations of the facade and adapter patterns are quite similar, but you name them differently depending on how you use them.
Is Core Data implementing data mapper?
I must say that my "not quite" should have been "definitely not!"
I have just been very angry because I needed to rename some fields and later add new ones. Although I know quite well how auto-migrations work with Core Data, I had forgotten how annoying they are.
How many times do you need some new field, rename something, or experiment until you get it right... and every single tiny change requires a full-blown database migration? With Data Mappers this never happens, because domain objects are perfectly decoupled: you only touch the database to catch up with the domain objects after you finish some new feature. Core Data forces you to bind every single detail of your domain objects at every single moment.
Boy, how sweet life was while I had forgotten that "tiny" annoyance of Core Data being the exact opposite of what you can achieve with data mappers.

Out-Of-Memory while doing Core Data migration

I'm migrating a Core Data model between two versions of an application. I was storing binary data as blobs in the previous version, and I want to take them out of the blobs for performance. My issue is that during the migration Core Data seems to load everything into memory, which leads to Low Memory Warnings and then to my app being killed.
Apple's documentation suggests the following:
http://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/CoreDataVersioning/Articles/vmCustomizingTheProcess.html#//apple_ref/doc/uid/TP40005510-SW9
However, it seems to rely on the large objects having a different mapping applied to them.
In my case, all the objects are basically the same and the same mapping has to be applied to each of them. I don't see how I could apply their technique in this case.
How should I handle a migration with very large objects?
I'm guessing that you have a bunch of changes you want to make in addition to pulling the data out of blobs. My suggestion is to do the migration in a few stages. I'm kind of thinking out loud here, so it might be possible to improve on this. This requires you to be using SQLite.
To make this work, you're going to have three versions of your model:
The original model
The model with the attribute removed (and possibly with a special unique ID added--see below)
The model with all of the changes you've made, including the addition of the new entity and relationships replacing the attribute
The reason to do this is that the transition from version 1 to 2 should be doable with an automatic lightweight migration. In that case Core Data doesn't need to load anything into memory--it just issues SQL statements to make the changes directly on the database.
So, you start by setting up your persistent store coordinator using the old model version. Once you've loaded the data, go through all of the objects you're migrating, extract the binary attribute, and write it to disk somehow. You can use a fetch request with batching and regular autorelease pool draining to make sure you don't use up too much memory for temporary objects. Store the data into the directory you get with NSCachesDirectory. You'll obviously want to store the data in a way that lets you relate it back to the object's managedObjectID.
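A sketch of that extraction pass, assuming a hypothetical entity named "Item" with a binary "payload" attribute (both names are made up):

    // Walk the objects in batches, writing each blob to Caches under a
    // file name derived from the managedObjectID, then re-fault the
    // object so the blob's memory is released.
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"Item"
                                   inManagedObjectContext:context]];
    [request setFetchBatchSize:50]; // only a window of rows is realized at once

    NSString *cachesDir = [NSSearchPathForDirectoriesInDomains(
        NSCachesDirectory, NSUserDomainMask, YES) objectAtIndex:0];

    NSError *error = nil;
    for (NSManagedObject *item in [context executeFetchRequest:request
                                                         error:&error]) {
        @autoreleasepool {
            NSData *blob = [item valueForKey:@"payload"];
            // The objectID's URI is stable for saved objects, so it can
            // relate the file back to its object after the migration.
            NSString *name =
                [[[item objectID] URIRepresentation] lastPathComponent];
            [blob writeToFile:[cachesDir stringByAppendingPathComponent:name]
                   atomically:YES];

            // Turn the object back into a fault to free the blob.
            [context refreshObject:item mergeChanges:NO];
        }
    }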
Then, you shut everything down and ask Core Data to migrate the store from version 1 to version 2. See this link for details. Open up the store with version 2.
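The version 1 to version 2 step then only needs the standard automatic options; a minimal sketch (modelV2 and storeURL are assumed to be set up elsewhere):

    // Ask Core Data to infer the v1 -> v2 mapping and migrate in place;
    // for a pure attribute removal this avoids loading rows into memory.
    NSDictionary *options = [NSDictionary dictionaryWithObjectsAndKeys:
        [NSNumber numberWithBool:YES], NSMigratePersistentStoresAutomaticallyOption,
        [NSNumber numberWithBool:YES], NSInferMappingModelAutomaticallyOption,
        nil];

    NSPersistentStoreCoordinator *psc = [[NSPersistentStoreCoordinator alloc]
        initWithManagedObjectModel:modelV2];

    NSError *error = nil;
    if (![psc addPersistentStoreWithType:NSSQLiteStoreType
                           configuration:nil
                                     URL:storeURL
                                 options:options
                                   error:&error]) {
        NSLog(@"Lightweight migration failed: %@", error);
    }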
You might have to add a step where you assign some sort of unique ID to each object, because I'm not sure whether Core Data maintains object IDs when it does a non-lightweight migration. If you need to do this, your version 2 model would add a new attribute (either optional or with a default value set) to the object you're taking the binary data out of. Since lightweight migration shouldn't change the managedObjectIDs, you could save the mapping from your new unique ID to the managedObjectIDs you stored along with the binary data two paragraphs ago.
Save the data and close the store.
Open the store and do a migration from version 2 to version 3, which should basically be the code you already had written before you posted the question. Once the store is open, add all of the objects you saved from the version 1 store and set up the relationships using the data you saved along the way.
Simple, right?

iPhone and Core Data: how to retain user-entered data between updates?

Consider an iPhone application that is a catalogue of animals. The application should allow the user to add custom information for each animal -- let's say a rating (on a scale of 1 to 5), as well as some notes they can enter in about the animal. However, the user won't be able to modify the animal data itself. Assume that when the application gets updated, it should be easy for the (static) catalogue part to change, but we'd like the (dynamic) custom user information part to be retained between updates, so the user doesn't lose any of their custom information.
We'd probably want to use Core Data to build this app. Let's also say that we have a previous process already in place to read in animal data to pre-populate the backing (SQLite) store that Core Data uses. We can embed this database file into the application bundle itself, since it doesn't get modified. When a user downloads an update to the application, the new version will include the latest (static) animal catalogue database, so we don't ever have to worry about it being out of date.
But, now the tricky part: how do we store the (dynamic) user custom data in a sound manner?
My first thought is that the (dynamic) database should be stored in the Documents directory for the app, so application updates don't clobber the existing data. Am I correct?
My second thought is that since the (dynamic) user custom data database is not in the same store as the (static) animal catalogue, we can't naively make a relationship between the Rating and the Notes entities (in one database) and the Animal entity (in the other database). In this case, I would imagine one solution would be to have an "animalName" string property in the Rating/Notes entity, and match it up at runtime. Is this the best way to do it, or is there a way to "sync" two different databases in Core Data?
Here's basically how I ended up solving this.
While Amorya's and MHarrison's answers were valid, they both made one assumption: that once created, not only the tables but each row in each table would always stay the same.
The problem is that my process for pre-populating the "Animals" database, using existing data (that is updated periodically), creates a new database file each time. In other words, I can't rely on creating a relationship between the (static) Animal entity and a (dynamic) Rating entity in Core Data, since that Animal entity may not exist the next time I regenerate the database. Why not? Because I have no control over how Core Data stores that relationship behind the scenes. Since it's an SQLite backing store, it's likely using a table with foreign key relations. But when you regenerate the database, you can't assume anything about what value each row gets for a key: the primary key for Lion may be different the second time around if I've added a Lemur to the list.
The only way to avoid this problem would be to pre-populate the database only once and then manually update rows each time there's an update. However, that kind of process isn't really possible in my case.
So, what's the solution? Well, since I can't rely on the foreign key relations that Core Data makes, I have to make up my own. What I do is introduce an intermediate step in my database generation process: instead of taking my raw data (which happens to be UTF-8 text exported from MS Word files) and creating the SQLite database with Core Data directly, I first convert the .txt to .xml. Why XML? Not because it's a silver bullet, but simply because it's a data format I can parse very easily. So what does this XML file contain that's different? A hash value that I generate for each Animal, using MD5, which I'll assume is unique. What is the hash value for? Well, now I can create two databases: one for the "static" Animal data (for which I have a process already), and one for the "dynamic" Ratings, which the iPhone app creates and which lives in the application's Documents directory. For each Rating, I create a pseudo-relationship with the Animal by saving the Animal entity's hash value. So every time the user brings up an Animal detail view on the iPhone, I query the "dynamic" database to find whether a Rating entity exists whose stored hash matches the Animal.md5Hash value.
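The runtime lookup is then just a predicate match on the stored hash; a sketch, with hypothetical names ("Rating" entity, "animalHash" attribute, an Animal class exposing md5Hash, and ratingsContext opened on the "dynamic" store):

    // Find the user's Rating for an animal in the "dynamic" store by
    // matching the hash saved as a pseudo-foreign-key.
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"Rating"
                                   inManagedObjectContext:ratingsContext]];
    [request setPredicate:[NSPredicate predicateWithFormat:@"animalHash == %@",
                              animal.md5Hash]];
    [request setFetchLimit:1];

    NSError *error = nil;
    NSArray *matches = [ratingsContext executeFetchRequest:request error:&error];
    NSManagedObject *rating =
        ([matches count] > 0) ? [matches objectAtIndex:0] : nil;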
Since I'm saving this intermediate XML data file, the next time there's an update I can diff it against the last XML file I used to see what's changed. Now, if the name of an animal was changed (let's say a typo was corrected), I keep the original hash value for that Animal in situ. This means that even if an Animal's name changes, I'll still be able to find a matching Rating, if one exists, in the "dynamic" database.
This solution has another nice side effect: I don't need to handle any migration issues. The "static" Animal database that ships with the app can stay embedded as an app resource. It can change all it wants. The "dynamic" Ratings database may need migration at some point, if I modify its data model to add more entities, but in effect the two data models stay totally independent.
The way I'm doing this is: ship a database of the static stuff as part of your app bundle. On app launch, check if there is a database file in Documents. If not, copy the one from the app bundle to Documents. Then open the database from Documents: this is the only one you read from and edit.
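In Objective-C, that first-launch seeding might look like this (the "Catalogue.sqlite" file name is a placeholder):

    // Seed the editable store from the bundled copy on first launch,
    // then always open the Documents copy.
    NSFileManager *fm = [NSFileManager defaultManager];
    NSString *docsDir = [NSSearchPathForDirectoriesInDomains(
        NSDocumentDirectory, NSUserDomainMask, YES) objectAtIndex:0];
    NSString *workingPath =
        [docsDir stringByAppendingPathComponent:@"Catalogue.sqlite"];

    if (![fm fileExistsAtPath:workingPath]) {
        NSString *bundledPath =
            [[NSBundle mainBundle] pathForResource:@"Catalogue"
                                            ofType:@"sqlite"];
        NSError *error = nil;
        if (![fm copyItemAtPath:bundledPath toPath:workingPath error:&error]) {
            NSLog(@"Could not seed the database: %@", error);
        }
    }
    NSURL *storeURL = [NSURL fileURLWithPath:workingPath]; // open this store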
When an upgrade has happened, the new static content will need to be merged with the user's editable database. Each static item (Animal, in your case) has a field called factoryID, which is a unique identifier. On the first launch after an update, load the database from the app bundle, and iterate through each Animal. For each one, find the appropriate record in the working database, and update any fields as necessary.
There may be a quicker solution, but since the upgrade process doesn't happen too often, the time taken shouldn't be too problematic.
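A sketch of that merge pass, assuming two contexts (bundledContext over the read-only bundled store, workingContext over the Documents store) and illustrative entity/attribute names:

    // First launch after an update: push catalogue changes into the
    // user's working store, matching records on factoryID.
    NSFetchRequest *allAnimals = [[NSFetchRequest alloc] init];
    [allAnimals setEntity:[NSEntityDescription entityForName:@"Animal"
                                      inManagedObjectContext:bundledContext]];

    NSError *error = nil;
    for (NSManagedObject *fresh in
             [bundledContext executeFetchRequest:allAnimals error:&error]) {
        NSFetchRequest *match = [[NSFetchRequest alloc] init];
        [match setEntity:[NSEntityDescription entityForName:@"Animal"
                                     inManagedObjectContext:workingContext]];
        [match setPredicate:[NSPredicate predicateWithFormat:@"factoryID == %@",
                                [fresh valueForKey:@"factoryID"]]];
        [match setFetchLimit:1];

        NSArray *found = [workingContext executeFetchRequest:match error:&error];
        if ([found count] > 0) {
            // Update only the static, catalogue-owned fields.
            [[found objectAtIndex:0] setValue:[fresh valueForKey:@"name"]
                                       forKey:@"name"];
        } else {
            // A brand new catalogue entry: insert it into the working store.
            NSManagedObject *added =
                [NSEntityDescription insertNewObjectForEntityForName:@"Animal"
                                              inManagedObjectContext:workingContext];
            [added setValue:[fresh valueForKey:@"factoryID"] forKey:@"factoryID"];
            [added setValue:[fresh valueForKey:@"name"] forKey:@"name"];
        }
    }
    [workingContext save:&error];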
Storing your SQLite database in the Documents directory (NSDocumentDirectory) is certainly the way to go.
In general, you should avoid application changes that modify or delete SQL tables as much as possible (adding is OK). However, when you absolutely have to make a change in an update, something like what Amorya said would work: open up the old DB, import whatever you need into the new DB, and delete the old one.
Since it sounds like you want a static database with an "Animal" table that can't be modified, simply replacing this table on upgrade shouldn't be an issue, as long as the IDs of the entries don't change. The way you should store user data about animals is to create a relation with a foreign key to an animal ID for each entry the user creates. This is what you would need to migrate when an upgrade changes it.