I have to decide about an important item and I need your help.
I'm facing a huge existing database with a lot of default values on nullable columns.
The team has to build a new MVC4 application on top of it (in fact it is a rewrite of an old VB6 application).
I (as a consultant) have 'forced' the use of EF5 to get rid of all stored procedures and migrate to a more modern technology.
Now, after my research, it is clear to me that EF5 doesn't support database default values by default. This is why my inserted records are corrupt (the inserts succeed because the columns are nullable, but the columns end up NULL instead of the default value).
Some options came up like using the constructor technique, setting the default values in design on the edmx, or playing around with the xml of the edmx.
However, these methods are not useful for us. While the constructor technique looks OK to me, it is not feasible to do that for all tables in the DB. I also got a 'njet' from the technical person because he wants to maintain these values in one place. Same story for setting the default values in the designer. The database is also not in our scope (read: as few changes as possible, to keep existing applications running).
At this point, I'm not sure if EF is the correct choice for our project.
Is somebody aware of (third-party) tools that can fill in the database default values automatically in the generated XML of the edmx file?
Is there some more info about how this XML is built and whether it is possible to interfere in the process?
Is there a good reason why these default values are not picked up? Is this going to change in a later release?
Are there other good practices that can be applied to this problem without having all values duplicated or a massive workload?
Can I arrange something with my poco generator?
I realize there are already a lot of posts on this topic. Too bad, none of them offers a suitable solution for me, since we already have an existing database and (with all respect) an old VB6 team that I have to convince.
Thanks for your feedback!
Related
Is there a way to combine code-first and database-first in the same context? We are running into massive development-time performance problems when editing the EDMX file (it takes 1.5 minutes to save). I've moved our non-insert/update/delete UDFs/stored procs to some custom T4 templates that automatically generate model-first code, but I can't seem to get OnModelCreating to be called when EDMX is involved.
Other things we've considered, but won't work for one reason or another:
We can't (reasonably) separate our code into multiple contexts as there is a lot of overlap in our entity relationships. It also seems like quite a few people who have gone this route regret it.
We tried having 2 different contexts, but there are quite a few joins between Entities & UDFs. This may be our last hope, but I'd REALLY like to avoid it.
We can't switch to Dapper since we have unfortunately made heavy use of IQueryable.
We tried to go completely to Code-First, but there are features that we are using in EDMX that aren't supported (mostly related to insert/update/delete stored procedure mapping).
Take a look at the following link. I answered another question in a similar fashion:
How to use Repository pattern using Database first approach in entity framework
As I mentioned in that post, I would personally try to switch to a Code First approach and get rid of the EDMX files as it is already deprecated and most importantly, the maintenance effort is considerable and much more complex compared with the Code First approach.
It is not that hard switching to Code First from a Model First approach. Some steps and images down below:
Display all files at the project level and expand the EDMX file. You will notice that the EDMX file has a .TT file with several files nested under it, the Model Context and POCO classes among them, as .cs or .vb files (depending on the language you are using). See image down below:
Unload the project, right click and then edit.
See the image below, notice the dependencies between the context and the TT file
Remove the dependencies, the xml element should look like the image below:
Repeat the procedure for the Model classes (The ones with the model definition)
Reload your project, remove the EDMX file(s)
You will probably need to do some tweaks and update names/references.
I did this a few times in the past and it worked flawlessly on production. You can also look for tools that do this conversion for you.
This might be a good opportunity for you to rethink the architecture as well.
BTW: Bullet point 4 shouldn't be a show stopper for you. You can map/use Stored Procedures via EF. Look at the following link:
How to call Stored Procedure in Entity Framework 6 (Code-First)?
It also seems like quite a few people who have gone this route [multiple contexts] regret it.
I'm not one of them.
Your core problem is a context that gets too large. So break it up. I know that inevitably there will be entities that should be shared among several contexts, which may give rise to duplicate class names. An easy way to solve this is to rename the classes into their context-specific names.
For example, I have an ApplicationUser table (who hasn't) that maps to a class with the same name in the main context, but to a class AuthorizationUser in my AuthorizationContext, or ReportingUser in a ReportingContext. This isn't a problem at all. Most use cases revolve around one context type anyway, so it's impossible to get confused.
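To make the renaming idea concrete, here is a minimal sketch assuming code-first mappings (the answer does not say how its contexts are mapped). The property names and the two context classes are hypothetical; ToTable is the standard EF fluent call for pointing a class at a table. Initializer and connection setup are left out.

```csharp
using System.Data.Entity;

// Hypothetical classes: both map to the same ApplicationUser table,
// each exposing only the columns its own context cares about.
public class ApplicationUser
{
    public int Id { get; set; }
    public string UserName { get; set; }
    public string Email { get; set; }
}

public class AuthorizationUser
{
    public int Id { get; set; }
    public string UserName { get; set; }
}

public class MainContext : DbContext
{
    public DbSet<ApplicationUser> Users { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<ApplicationUser>().ToTable("ApplicationUser");
    }
}

public class AuthorizationContext : DbContext
{
    public DbSet<AuthorizationUser> Users { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Same table, context-specific class name.
        modelBuilder.Entity<AuthorizationUser>().ToTable("ApplicationUser");
    }
}
```

Because each class lives in exactly one context, there is never a question about which mapping applies.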
I even have specialized contexts that work on the same data as other contexts, but in a more economical way. For example, a context that doesn't map to calculated columns in the database, so there are no reads after inserts and updates (apart from identity values).
So I'd recommend to go for it, because ...
Is there a way to combine code-first and database-first in the same context?
No, there isn't. Both approaches have different ways of building the DbModel (containing the store model, the class model, and the mappings between both). In a generated DbContext you even see that an UnintentionalCodeFirstException is thrown, to drive home that you're not supposed to use that method.
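For reference, this is roughly what the database-first template emits for the context; a sketch from memory, with the hypothetical names LegacyEntities and Order. OnModelCreating is only invoked when EF builds the model from code, so it never fires while the connection string points at EDMX metadata, and if it does fire the exception tells you the context was started in code-first mode by accident.

```csharp
using System.Data.Entity;
using System.Data.Entity.Infrastructure;

// Hypothetical entity standing in for a generated POCO.
public class Order
{
    public int Id { get; set; }
}

public partial class LegacyEntities : DbContext
{
    // "name=..." makes EF look up the connection string in the config file;
    // for database-first that entry carries the EDMX metadata paths.
    public LegacyEntities() : base("name=LegacyEntities")
    {
    }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Reached only if the context runs in code-first mode.
        throw new UnintentionalCodeFirstException();
    }

    public DbSet<Order> Orders { get; set; }
}
```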
mostly related to insert/update/delete stored procedure mapping
As said in another answer, mapping CUD actions to stored procedures is supported in EF6 code-first.
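A minimal sketch of what that mapping looks like with the EF6 fluent API. The Invoice entity, the BillingContext class and the procedure names are made up for illustration; MapToStoredProcedures and HasName are the EF6 calls.

```csharp
using System.Data.Entity;

// Hypothetical entity.
public class Invoice
{
    public int Id { get; set; }
    public string Number { get; set; }
}

public class BillingContext : DbContext
{
    public DbSet<Invoice> Invoices { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Route inserts, updates and deletes for Invoice through
        // stored procedures instead of generated SQL.
        modelBuilder.Entity<Invoice>().MapToStoredProcedures(s => s
            .Insert(i => i.HasName("Invoice_Insert"))
            .Update(u => u.HasName("Invoice_Update"))
            .Delete(d => d.HasName("Invoice_Delete")));
    }
}
```

Calling MapToStoredProcedures() with no arguments uses conventional procedure and parameter names instead.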
I got here from a link in your comment on a different question, where you asked:
you mentioned that code-first & database-first is "technically possible" could you explain how to accomplish that?
First, the context of the other question was completely different. The OP there was asking if it was possible to use both database-first and code-first methodologies in the same project, but importantly, not necessarily the same context. My saying that it was "technically possible" applies to the former, not the latter. There is absolutely no way to utilize both code-first and database-first in the same context. Actually, to be a bit more specific, let's say there's no way to utilize an existing database and also migrate that same database with new entities.
The terminology gets a bit confused here due to some unfortunate naming by Microsoft when EF was being developed. Originally, you had just Model-first and Database-first. Both utilized EDMX. The only difference was that Model-first would let you design your entities and create a database from that, while Database-first took an existing database and created entities from that.
With EF 4.1, Code-first was introduced, which discarded EDMX entirely and let you work with POCOs (plain old CLR objects). However, despite the name, Code-first has always been able to work with an existing database or create a new one. Code-first, then, is really Model-first and Database-first combined, minus the horrid EDMX. Recently, the EF team has finally taken it a step further and deprecated EDMX entirely, including both the Model-first and Database-first methodologies. It is not recommended to continue to use either one at this point, and you can expect EDMX support to be dropped entirely in future versions of Visual Studio.
With all that said, let's go with the facts. You cannot have both an existing database and an EF-managed database in a single context. You would at least need two: one for your existing tables and one for those managed by EF. More to the point, these two contexts must reference different databases. If there are any existing tables in an EF-managed database, EF will attempt to remove them. Long and short, you have to segregate your EF-managed stuff from your externally managed stuff, which means you can't create foreign keys between entities in one context and another.
Your only real option here is to just do everything "database-first". In other words, you'll have to just treat your database as existing and manually create new tables, alter columns, etc. without relying on EF migrations at all. In this regard, you should also go ahead and dump the EDMX. Generate all your entities as POCOs and simply disable the database initializer in your context. In other words, Code-first with an existing database. I have additional information, if you need it.
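In case it helps, here is a minimal sketch of "Code-first with an existing database". The Customer class, the table name and the LegacyDb connection string name are assumptions for illustration; passing null to Database.SetInitializer is the standard way to switch the initializer off.

```csharp
using System.ComponentModel.DataAnnotations.Schema;
using System.Data.Entity;

// Hypothetical POCO for a table that already exists in the database.
[Table("Customers")]
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class LegacyContext : DbContext
{
    static LegacyContext()
    {
        // No initializer: EF will never create, drop or migrate the schema.
        Database.SetInitializer<LegacyContext>(null);
    }

    // "LegacyDb" is an assumed connection string name from the config file.
    public LegacyContext() : base("name=LegacyDb")
    {
    }

    public DbSet<Customer> Customers { get; set; }
}
```

Schema changes are then made by hand (or by script) in the database, and the POCOs are updated to match.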
Thank you to everyone for the well thought out and thorough answers.
Many of these other answers assume that the stored procedure mappings in EF Code-First work the same, but they do not. I'm a bit fuzzy on this as it's been about 6 months since I looked at it, but I believe as of EF 6.3 code first stored procedures require that you pass every column from your entity to your insert/update stored procedure and that you only pass the key column(s) to your delete procedure. There isn't an option to pick and choose which columns you can pass. We have a requirement to maintain who deleted a record so we have to pass some additional information besides just a simple key.
That being said, what I ended up doing was using a T4 template to automatically generate my EDMX/Context/Model files from the database (with some additional meta-data). This took our development-time experience down from 1.5 minutes to about 5 seconds.
My hope is EF stored procedure mappings will be improved to achieve parity with EDMX and I can then just code-generate the Code-First mappings and remove the EDMX generation completely.
Working on a brand new project from the ground up. That means the data model is in a constant flux, doubly so because things are, inevitably, not as well planned as they should be. Model classes are being created and changed fairly regularly.
The plan was to use the latest version of EF with all the neat code-first stuff in it. But we're constantly tripping over the limitations the framework has in terms of adding or updating tables. The initialization options seem to allow only the complete deletion and re-creation of the database, which isn't really ideal.
I've had a look at the migrations. But this seems a sledgehammer to crack a nut: we don't need to detail every single small change and update with a new migration scaffold.
Are there some better strategies to deal with this? For instance, I started writing some unit tests to pre-populate one of the contexts with some test data, but because this causes the whole Db to drop and re-create, it causes problems with all the other contexts. Or perhaps making use of a custom initialiser to seed the data for us? How can we easily exclude these in production code?
We're also wondering about perhaps abandoning code-first and going back to EDMX diagrams. At least that way changes result in updated SQL commands which can be run directly against the database.
Any suggestions gratefully received.
I think, imho, that:
as the database schema must at least match your model, you should (in fact must) record every single change; code-first migrations let you do that and track the changes over time
code-first migrations can also migrate the database schema for you
code-first migrations can also produce the SQL that migrates the schema
For these reasons code first is as good as (if not better than) the EDMX approach.
Please take a few minutes to work through http://msdn.microsoft.com/en-us/data/jj591621.aspx
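To give an idea of how small a single migration is, here is a sketch of the kind of class that Add-Migration scaffolds. The dbo.Blogs table and Rating column are illustrative names only; AddColumn and DropColumn are members of DbMigration.

```csharp
using System.Data.Entity.Migrations;

// One migration = one schema change, with a way forward and a way back.
public partial class AddBlogRating : DbMigration
{
    public override void Up()
    {
        // New non-nullable column; existing rows receive the default value.
        AddColumn("dbo.Blogs", "Rating", c => c.Int(nullable: false, defaultValue: 3));
    }

    public override void Down()
    {
        DropColumn("dbo.Blogs", "Rating");
    }
}
```

Running Update-Database applies pending migrations, and Update-Database -Script writes the equivalent SQL to a file instead, which covers the "updated SQL commands" wish from the question.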
One other point, again imho and in a perfect world: if you unit test the business logic of your model you should not need the DAL; use generic collections instead. Be aware of the different behaviour of LINQ to Objects vs LINQ to Entities, for example concerning case sensitivity.
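A small illustration of the case-sensitivity point, using only in-memory collections (the class and method names are made up). In LINQ to Objects, == on strings is an ordinal, case-sensitive comparison. The same predicate sent through LINQ to Entities is evaluated by the database under its collation, which is often case-insensitive, so a test against a List<T> can pass while the real query returns more rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaseSensitivityDemo
{
    // What a unit test against an in-memory collection sees:
    // only exact, case-sensitive matches.
    public static int CountExact(IEnumerable<string> names, string wanted)
    {
        return names.Count(n => n == wanted);
    }

    // What a database with a case-insensitive collation would effectively do.
    public static int CountIgnoringCase(IEnumerable<string> names, string wanted)
    {
        return names.Count(n => string.Equals(n, wanted, StringComparison.OrdinalIgnoreCase));
    }
}
```

If the tests must mirror the database, make the comparison explicit in the query rather than relying on either default.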
I have a project with an existing database which was initially created for a legacy application. It works fine, but over time quite a few of the tables / fields have been lost or under-utilized, but the historical data MAY be useful someday so they're not going anywhere.
Enter 2012 ('13) and Entity Framework 5, an ORM with built in POCO generation (Nice Add!). So bang.. Get a connection to the Oracle Database, gen. up a context and some POCO's.. suh-weet!! But wait.. my POCO's aren't really the POCO's I would like to deal with... There's a bunch of fields which I don't need anymore (not to say I'll NEVER need them, but I can't know for sure), so now I've got these POCO's which are basically bloated table mappers... So what should I do?
I see a few solutions here..
1). I could throw them around and only use the fields that I need.
2). I could get into the Model Surface and start axing the unused fields.
3). "Code-First" approach and tie the objects into the existing DB, it's a large DB though (I'm pretty sure this is possible, right?)
4). Create my own POCO / DTO's in its own model project and these will essentially become my "domain model", but the mapping back into the context could be painful..
Lastly, do these POCO's / DTO's need to be in their own project?? What is there REALLY to gain.. seeing things like "YAGNI", I feel like it can sit right under the .edmx and never bother anyone..
On a side note, I will be needing a few of these via JSON too, so the whole serializable ability needs to be considered..
Can I just partial class the generated POCO's and only "Attribute" the properties I'll be needing?
Anyhow, it'd be great to hear from past experience, or thoughts on the matter..
I could see this being in Programmers, but I figured I'd start it here.
We have a very similar situation, a large legacy DB2 database of which we need small portions of specific tables for our applications.
To do this we used entity framework code first models for the relevant subsections of data we were interested in. This meant we could do a few important things:
remove irrelevant data from the model to make code more discoverable
rename fields inside our model and map them to names that make sense in the app rather than existing column names
reduce the volume of data pulled back by queries (i.e. our selects don't grab all the extra bits)
where 2 formats of data exist use the modern standard rather than historical format
This works out really well for us, however a couple of things to note:
if you are writing make sure you include all required fields in the model
you can generate your CF classes but you will have to trim them a bit
generating from a non-MSSQL database can sometimes be more tricky
In terms of JSON serialisation we do this too, however we use a different model for this and use AutoMapper to translate. You should in most cases be able to serialise without needing to add extra attributes, but if they are required you can just add them to your POCOs alongside any EF attributes.
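As a rough sketch of what such a trimmed model looks like, with made-up legacy table and column names (CUSTMAST, CUST_NO, CUST_NM): the class maps only the columns the application needs and gives them readable names through the standard data annotation attributes.

```csharp
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;

// Hypothetical legacy table with dozens of columns; only two are mapped.
[Table("CUSTMAST")]
public class LegacyCustomer
{
    [Key]
    [Column("CUST_NO")]
    public int Id { get; set; }

    // Required column: must stay in the model if the app ever inserts rows.
    [Required]
    [Column("CUST_NM")]
    public string Name { get; set; }

    // All other legacy columns are simply left out of the class,
    // so they are neither selected nor written by EF.
}
```

A code-first context would then expose this class through a DbSet<LegacyCustomer> in the usual way.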
What is the best practice for upgrading the database using ORM (DevExpress XPO, NHibernate or MS Entity Framework)?
I'm starting a new project and have to pick an ORM. The development process requires releasing intermediate test builds quite often, and it is likely that each build will have changes in the database structure. Each new version has to upgrade the DB gently to keep current data.
For old solutions I would provide a set of SQL scripts for upgrading the database from v1 to v2, from v2 to v3, etc. and execute them sequentially.
But how is it going to work for ORM? Should I still write SQL scripts to upgrade the DB?
I understand that simply adding new fields wouldn't cause a problem (e.g. see UpdateSchema() method for XPO), but what if I have to split a table and reallocate current records into 2 new tables?
I can't comment on the other ORMs, but I have used DevExpress XPO for a corporate treasury application since 2007. The schema changes a little with every release but there have also been some big schema changes over the years as well. A somewhat extended version of the default XPO upgrade mechanism has comfortably catered for all the changes.
There is good basic information here about upgrading XPO applications.
DevExpress provide a DBUpdater tool to assist you with the task of upgrading production environments. You can extend this tool to cater for additional requirements. In my application, we have added some options for logging, preview with rollback, etc.
Each module has virtual UpdateDatabaseBeforeUpdateSchema() and UpdateDatabaseAfterUpdateSchema() methods. You can significantly control the upgrade process within these.
As you mention, some of the upgrade will be handled automatically by XPO (e.g., adding a new column), but some things need additional control such as initialising the new column with a default value for existing records.
For instance, let's say MyNewField has been added to the MyEntity XPO class in version 2.0 of your application. Let's say it should default to a value of 3 for existing records. XPO will handle the creation of the new column but existing records will be NULL. (If you specify a default value in the XPO class it would only pertain to new records.) In order to correct the value for existing records you would add something like the following to your module's overridden UpdateDatabaseAfterUpdateSchema():
public override void UpdateDatabaseAfterUpdateSchema()
{
    base.UpdateDatabaseAfterUpdateSchema();

    // Back-fill the new column only when upgrading from a version older than 2.0.
    if (CurrentDBVersion < new Version(2, 0, 0, 0))
    {
        ObjectSpace.GetSession().ExecuteNonQuery(
            "UPDATE [MyEntity] SET [MyNewField] = 3 WHERE [MyNewField] IS NULL");
    }
}
(You could also use ObjectSpace.GetObjects<MyEntity>() and a foreach if you prefer to avoid the direct SQL.)
In your more extreme example of splitting a table in two, you can use the same method, but you would override UpdateDatabaseBeforeUpdateSchema() instead, run the SQL to split the table, let XPO perform any other schema updates and, if necessary, populate any default values in the UpdateDatabaseAfterUpdateSchema().
You will find that you bump into constraint problems (e.g., foreign key violations), so you might need to write some general routines such as DropAllForeignKeyConstraints() as part of the UpdateDatabaseBeforeUpdateSchema(). Sometimes you find that XPO already provides something, sometimes not. Missing constraints and indexes will get regenerated in the schema update. (In my experience switching a master data table's primary key turned out to be the hardest update routine to get right.)
By default the calls all happen in an SQL transaction so if anything fails it should all roll back.
The developers need to be aware of when a change to the domain model is likely to cause a problem with the underlying schema.
For testing, we keep a few old customer databases and run a bunch of before-and-after tests as part of the build process to make sure that existing customers are able to upgrade properly whatever version they are upgrading from. In production whenever we run into a problem upgrading, the problem data is added into this test library to prevent similar problems in the future.
We are dealing with major international companies and banks. The customers are quite happy with the result. In situations where a corporate's DBA needs to sign off on the changes, they don't seem to mind having a command line tool to do the upgrade rather than a script.
Most migration solutions can handle easy tasks, like adding new column, relationship or removing one, but fail to work when you rename a column (is that an add? or a remove following an add which equals a rename? What should you do with the data in that case?)
All three solutions have basic migrations support; XPO even lets you run your own scripts as a part of the process (to insert static/test/constant data, etc.)
There's also the MigratorDotNet project, which you can use so that you do not rely on any ORM-specific feature for migrations.
Personally, I would use auto migration only in dev/test environments and would have a full set of upgrade scripts when running against a client-specific database, say to upgrade from v1 to v2.
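The "full set of upgrade scripts" approach boils down to picking every script newer than the database's current version and running them in order. A minimal sketch of that selection step, with hypothetical UpgradeScript and UpgradePlanner types (actually executing the SQL is left out):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical descriptor: the version a script upgrades the database TO.
public sealed class UpgradeScript
{
    public UpgradeScript(Version targetVersion, string sql)
    {
        TargetVersion = targetVersion;
        Sql = sql;
    }

    public Version TargetVersion { get; private set; }
    public string Sql { get; private set; }
}

public static class UpgradePlanner
{
    // Returns the scripts needed to bring a database at currentVersion
    // up to the latest version, in the order they must be executed.
    public static IList<UpgradeScript> Plan(IEnumerable<UpgradeScript> allScripts, Version currentVersion)
    {
        return allScripts
            .Where(s => s.TargetVersion > currentVersion)
            .OrderBy(s => s.TargetVersion)
            .ToList();
    }
}
```

A client on v1 then gets the v2 and v3 scripts in that order, while a client already on v3 gets nothing.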
How is it going to work for ORM? Should I still write SQL scripts to upgrade the DB?
A clear answer to this question can be found in the Programmers Stack Exchange thread "What are the criteria for evaluating an ORM for .NET?". The simple answer I found there matches my own experience with ORMs while developing projects with Entity Framework and CodeSmith ORM templates.
How does the ORM manage changes in the data model? What if I have to split a table and reallocate current records into 2 new tables?
Some can update the DB automatically within a certain measure, others don't do anything and you'll have to do the dirty work yourself; others provide a framework for handling change that lets you control database updates. That means every couple of days someone needs to spend an hour updating the model to add a table or change data types.
Ref:
https://softwareengineering.stackexchange.com/questions/6543/what-are-the-benefits-of-using-database-abstraction-by-orm
https://softwareengineering.stackexchange.com/questions/41739/best-arguments-for-against-introducing-orm-technology-into-a-companies-dev-proce/41833#41833
If you ask - what is the best practice for upgrading the db using ORM - my answer is: Don't use it if your application is more than a hobbyist app.
There are a lot of scenarios where many ORMs are unable to support your specific database needs, e.g. creating stored procedures, indices and views, or even indexed views/materialized tables, without writing SQL scripts. Problems like adding a new non-nullable column to an existing table are much harder to solve in ORM migration code than by writing SQL scripts.
Current tools like Visual Studio Data Tools handle these kinds of problems way better.
I have a database that I wish to build an EF model from, however I do not want to include certain columns from the database as the columns concerned are maintained exclusively on the server and should not be manipulated by any application.
Both of the columns are DateTime (if this makes any difference), one of the columns is nullable and is maintained by a trigger on updates and the other is not nullable and set using a default value in the table definition.
I guess I am looking for something like the "Server Generated" option in Linq2Sql; but I cannot find such an option.
Can anybody tell me how to work around this?
Caveat:
I have been trying to introduce business object modelling at my place of work for some years and it has always been rejected because of the amount of additional code that has to be hand-cranked. EF is currently being seen as a viable solution because of the designer and code generation therefore any option that involves hand-cranking the XML will only turn the rest of my colleagues away from EF. I am therefore looking for something that can be done either using the designer or using code.
EDIT:
I guess that what I am looking for here is either...
(a) a way to create the model without EF referencing the columns in the store (ssdl) and therefore not looking to manipulate it in any way
(b) a way to programmatically set the "StoreGeneratedPattern" attribute against the property when I create the ObjectContext (the easy answer is to manually manipulate this in the .ssdl, but this would then be overwritten if I refreshed the model from the database and I cannot go down the route where the .csdl, .msl & .ssdl are hand-cranked).
Can you do this with the Entity Framework? Yes; it's easy. Can you do this with the Entity Framework designer? Unfortunately, that is much harder.
The problem you're having is that the column exists in the storage schema (SSDL) in your EDMX. Removing the column with the GUI designer simply removes it from the client schema, not the mapping or the storage schema. However, it's simple enough to go into the EDMX and remove it. Having done that, you can also remove it from the mapping in the client schema portions of the EDMX, and the Entity Framework will no longer complain that it is unmapped.
Problem solved, right?
Well, no. When you use the GUI designer to update the EDMX from the database, the storage schema is thrown away and re-generated. So your column will come back. As far as I know, there is no way to tell the GUI designer to never map a particular column. So you will have to re-do this every time you update with the GUI designer. Fortunately, the EDMX is XML, so you can do this with an XML transform, LINQ, or the XML tool of your choice.
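As a sketch of the LINQ option, the helper below strips one column from the storage model and from the mapping of an EDMX loaded as an XDocument. The class and method names are made up. It assumes the property has already been deleted from the conceptual model in the designer, and that the store entity type and its entity set share the same name (which is what the wizard normally generates). Elements are matched by local name so the sketch does not depend on the EDMX version's XML namespaces.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class EdmxColumnStripper
{
    // Removes columnName from the SSDL entity type and from the matching
    // MSL mapping fragment. Returns the number of elements removed.
    public static int RemoveStoreColumn(XDocument edmx, string storeEntityName, string columnName)
    {
        // SSDL: <StorageModels> ... <EntityType Name="..."><Property Name="..." />
        var storeProperties = edmx.Descendants()
            .Where(e => e.Name.LocalName == "StorageModels")
            .Descendants()
            .Where(e => e.Name.LocalName == "EntityType"
                     && (string)e.Attribute("Name") == storeEntityName)
            .Elements()
            .Where(e => e.Name.LocalName == "Property"
                     && (string)e.Attribute("Name") == columnName);

        // MSL: <MappingFragment StoreEntitySet="..."><ScalarProperty ColumnName="..." />
        var mappedColumns = edmx.Descendants()
            .Where(e => e.Name.LocalName == "MappingFragment"
                     && (string)e.Attribute("StoreEntitySet") == storeEntityName)
            .Elements()
            .Where(e => e.Name.LocalName == "ScalarProperty"
                     && (string)e.Attribute("ColumnName") == columnName);

        // Materialise first so the tree is not modified while it is being queried.
        var toRemove = storeProperties.Concat(mappedColumns).ToList();
        foreach (var element in toRemove)
        {
            element.Remove();
        }
        return toRemove.Count;
    }
}
```

Typical use would be XDocument.Load on the .edmx file, one call per server-maintained column, then Save, run after every "Update Model from Database".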
Can you not create a view with the columns you need and import it through entity function wizard and map it to your entities?
You could modify the text template to ignore these columns when generating your entity classes. For example if you added "IGNORE" to the documentation summary, you could modify the template to ignore them by replacing;
Dim simpleProperties as IEnumerable(Of EdmProperty) = typeMapper.GetSimpleProperties(entity)
with;
Dim simpleProperties as IEnumerable(Of EdmProperty) = typeMapper.GetSimpleProperties(entity).Where(Function(p) p.Documentation is nothing orelse p.Documentation.Summary.IndexOf("IGNORE")<0)
Right click on the field in the graphical representation and choose delete. I've found that sometimes you will get errors when you make a lot of changes to the model at once and start to lose track of your changes. Your best bet might be to rebuild the EF generated model.
Keep in mind that when you "update from the database", that old fields on the generated models will not be removed, you will have to remove them manually. For example if you renamed DateField1 to DateField2 in your database, and then you "Update Model from Database", you will now see both DateField1 and DateField2 on the resultant model. This can be a cause of errors.
Do you not want the column to appear in the model at all?
Try selecting the column in the Designer view and hitting the delete key.
Edit
You could make the setter for the property private. Then your app won't be able to modify the value.
Timestamp is a different data type than DateTime. Timestamp seems to be recognized as an attribute the engine manages, much like an identity attribute. You can't "update" a timestamp attribute. Hence, the EDM can manage it correctly (just as it does an identity).
In EDMX designer, select the property and set StoreGeneratedPattern to Computed.
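For anyone who ends up mapping in code rather than in the designer, the code-first counterpart of that setting is the DatabaseGenerated attribute. A minimal sketch for the two columns described in the question; the AuditedOrder class and its property names are hypothetical:

```csharp
using System;
using System.ComponentModel.DataAnnotations.Schema;

public class AuditedOrder
{
    public int Id { get; set; }

    // Not nullable, filled by a column default in the table definition.
    // Computed tells EF to leave the column out of INSERT and UPDATE
    // statements and to read the value back afterwards.
    [DatabaseGenerated(DatabaseGeneratedOption.Computed)]
    public DateTime CreatedOn { get; set; }

    // Nullable, maintained by an update trigger on the server.
    [DatabaseGenerated(DatabaseGeneratedOption.Computed)]
    public DateTime? ModifiedOn { get; set; }
}
```

The application can still read both values, but anything it assigns to them is never sent to the database.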