I have written an application using EF 6.0 in combination with a SQL Server Compact 4.0 database. When a customer uses this application for the first time, the application should create a database file in a given path with some initial values. Migrations should also be supported, because the object model may well change in future versions of the app.
Now I'm wondering what the best way would be to deploy the DB on the user's production system. I can think of three ways:
1. I could create a DB file with initial values, just copy it to the right place during installation, and use the MigrateDatabaseToLatestVersion initializer in the app.
2. In the DbContext constructors (I have two contexts) I could check for an existing DB file and use different database initializers accordingly: a CreateDatabaseIfNotExists initializer with a Seed method that creates the initial data if no file is found, and a MigrateDatabaseToLatestVersion initializer if the DB file exists.
3. I could always use the MigrateDatabaseToLatestVersion initializer and, in its "Seed" method, check for existing table entries and create them if they are not present.
Which of these ways is to be preferred, or is there a better way I didn't think of?
It sounds like this is a desktop application so you might want to catch permissions errors about creating the database file at installation time (i.e. option 1) rather than run time, especially as in option 2 the database initialization is not an imperative command you're giving that you can put a try...catch around.
I don't think option 3 would work as the Seed method gets run after all the migrations, so surely the migrations will either have successfully run, in which case the tables don't need creating, or they will have failed as the DB doesn't exist and therefore your Seed method won't get run.
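To make the point about option 1 concrete, here is a minimal sketch of an install-time copy step that surfaces permission problems as an ordinary, catchable error. The class name DatabaseDeployer, the method TryDeploy and the file paths are hypothetical; the sketch assumes the installer can run custom code and that a template database file ships with the application.

```csharp
using System;
using System.IO;

public static class DatabaseDeployer
{
    // Copies the template database to targetPath unless a database already exists there.
    // Returns null on success, or a message describing why the file could not be created.
    public static string TryDeploy(string templatePath, string targetPath)
    {
        try
        {
            if (File.Exists(targetPath))
                return null; // keep existing customer data; migrations can upgrade it later

            Directory.CreateDirectory(Path.GetDirectoryName(Path.GetFullPath(targetPath)));
            File.Copy(templatePath, targetPath, overwrite: false);
            return null;
        }
        catch (UnauthorizedAccessException ex)
        {
            return "No permission to create the database file: " + ex.Message;
        }
        catch (IOException ex)
        {
            return "Could not create the database file: " + ex.Message;
        }
    }
}
```

The installer can show the returned message and let the user pick another folder, which is harder to do once the failure happens inside a database initializer at run time.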
Related
Inside of OnModelCreating, I want to be able to ignore a column if the database is on an older migration. EF Core 5 throws an exception if I attempt to read from the database directly, or indirectly by querying the applied migrations. I'm not certain that it's even a good idea, since OnModelCreating is used during the migration 😩, but I'll burn that bridge when I cross it.
There are some examples on how one would do this with EF6, but they don't seem to apply anymore with EF Core.
While Ivan Stoev is right that, generally, you should model the target database without outside input, the real world isn't always that clear-cut. In my particular case, there are multiple service instances (Azure Functions) that need to read and write to a single database. In order to maintain zero downtime, those Functions must not read or write columns that don't yet exist.
I solved the problem the way Serge suggested. The database has a known version, stored as seed data and incremented with every migration. On startup, the service reads that version with a regular old Microsoft.Data.SqlClient.SqlConnection. This version is then added to the IServiceCollection as a singleton to be used by the DbContext constructor.
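A minimal sketch of that startup step might look like the following. The table name SchemaVersion, its Version column, the DatabaseVersion record and the AddDatabaseVersion extension method are all assumptions made for illustration; only SqlConnection and IServiceCollection come from the description above.

```csharp
using Microsoft.Data.SqlClient;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical wrapper type so the version can be injected into the DbContext constructor.
public sealed record DatabaseVersion(int Value);

public static class DatabaseVersionStartup
{
    public static IServiceCollection AddDatabaseVersion(
        this IServiceCollection services, string connectionString)
    {
        // Plain ADO.NET on purpose: no EF model exists yet at this point.
        using var connection = new SqlConnection(connectionString);
        connection.Open();

        using var command = connection.CreateCommand();
        // Assumed schema: a single-column table holding the current version number.
        command.CommandText = "SELECT MAX([Version]) FROM [SchemaVersion]";

        // ExecuteScalar returns DBNull when the table is empty; treat that as version 0.
        var result = command.ExecuteScalar();
        var version = result is int value ? value : 0;

        return services.AddSingleton(new DatabaseVersion(version));
    }
}
```

The DbContext can then take a DatabaseVersion constructor parameter and keep it in a field that OnModelCreating reads.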
When talking to an older database version, OnModelCreating does things like this:
builder.Entity<Widget>(w =>
{
    // another option would be to use the migrations table instead of an integer
    if (DatabaseVersion < ContextVersions.WidgetNewPropertyAddedVersion)
    {
        // the column does not exist in this database version, so keep it out of the model
        w.Ignore(x => x.NewProperty);
    }
    else
    {
        w.Property(x => x.NewProperty)
            .HasDefaultValue(0);
    }
});
The startup code also detects if it's been started by the Entity Framework tools and does not read the database version, instead assuming "latest". This way, we do not ignore new properties when building the migration.
Figuring out how to let the service instances know that the database has been upgraded and they should restart to get the new database model is an exercise left up to the reader. :)
I'm getting a PK violation exception when using an EF Core 2.1 DbContext in an Azure QueueTrigger function. My guess is that this is due to DbContext not being thread-safe and the Azure Function running different instances in parallel. I have read quite a few posts, but I can't find a good approach to solve this.
Here is my scenario (producer-consumer pattern):
I have a scheduled Azure Function that calls an API to get Projects from different external systems. To get all the required info for a Project, I need to run different queries against other external services, so I'm decoupling this into another Azure Function: the scheduled function just queues one message per Project, such as “Sync Project ID 101”.
Another QueueTrigger function fires every time a message is queued, which means different instances run in parallel. This function must gather all the data of a specific Project, and that means more calls to other external services / APIs to aggregate all the info about a Project. IMHO it's good to do it that way, as I can process multiple Projects in parallel and scale the function if I need to.
Once I have all this Project info, I want to persist it in a SQL DB using EF Core (and here comes the issue).
Project data includes the Users in the Project, and each User has a specific GUID as PK (coming from the external system). That means the same User ID can show up in different function instances, and here is the problem: when I try to persist User info in a SQL table, I can get a PK duplication exception, because multiple function instances can try to insert the same User at the same time. When instance A checks whether the User exists, it gets false, but another instance B is in the middle of adding this User, so when instance A tries the insert, it fails.
I guess I could lock the DbContext somehow, but I'm not sure that is a good idea, as I also have a website querying the SQL DB (read-only queries for now, but there could be updates in the future too).
Another idea could be to send the entire Project info to another queue / blob file, and have another function in singleton mode that inserts the data into SQL.
I've created this project, which simplifies my scenario but is enough to reproduce the issue and understand the problem.
https://github.com/luismanez/queuetrigger-efcore-multithreading
Any other ideas or recommended approaches? (I'm open to changing the architecture if I find something better.)
Many thanks!
An easier way could be to do some kind of upsert in the database. There is a sample of how to do that with EF Core: https://www.flexlabs.org/2018/02/adding-upsert-support-for-entity-framework-core
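If pulling in an upsert library is not an option, a simpler variation is to attempt the insert and treat a duplicate-key error as "another instance got there first". This is not the approach from the linked post, just a sketch under a few assumptions: the SQL Server provider for EF Core 2.1 (which surfaces System.Data.SqlClient.SqlException), a hypothetical extension method TryInsertAsync, and a context that has no other pending changes when it is called. SQL Server error numbers 2627 and 2601 are the ones raised for primary key / unique constraint and unique index violations.

```csharp
using System.Data.SqlClient;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public static class InsertIfMissingExtensions
{
    // Returns true if this call inserted the row, false if it already existed.
    public static async Task<bool> TryInsertAsync<TEntity>(this DbContext db, TEntity entity)
        where TEntity : class
    {
        db.Set<TEntity>().Add(entity);
        try
        {
            await db.SaveChangesAsync();
            return true;
        }
        catch (DbUpdateException ex) when (IsDuplicateKey(ex))
        {
            // Another function instance inserted the same key between our check and our insert.
            // Detach the entity so later SaveChanges calls do not retry the failed insert.
            db.Entry(entity).State = EntityState.Detached;
            return false;
        }
    }

    private static bool IsDuplicateKey(DbUpdateException ex) =>
        ex.InnerException is SqlException sql && (sql.Number == 2627 || sql.Number == 2601);
}
```

Saving each User this way before saving the rest of the Project keeps one duplicate from rolling back unrelated changes in the same SaveChanges call.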
When an application is live, an iterative approach to database changes is obviously required. In the DB-first world I would change the object (e.g. add a column to a table) in the database project, deploy (recreate) it to my local instance, then replace the old table with the new one in my edmx. When it was time to go live, a delta script was generated by comparing the database project to a copy of the live database schema. That sounds long-winded, but at the end of the day I only made the change once (the object in the DB project); everything else was generated.
Flip over to code first (EF6) and I'm expecting a similar one-change experience, i.e. I add the property to the class. However, do I additionally need to add a migration script?
I've been reading, and it seems many advise disabling migrations to have more control. I'm confused: I had visions of simply deploying the app and having the changes automatically reflected in the target database the next time the app runs. One thing is for sure: I don't want to manually write separate deployment scripts (or migration code). As mentioned, I'm confused about this final part. Can anyone clarify and point out the options?
Many thanks
Short answer is there's no need to create a 'migration script', you're correct as EF will handle it for you if you want. I think when you read about disabling migrations, you were probably actually reading 'disable automatic migrations'; EF will still generate migrations regardless.
As you pointed out, it IS a two-change process when developing: first you change your class, then you open up the Package Manager Console and call Add-Migration. Usually, that's all you have to do, and EF will generate the change code for you. Then you call Update-Database and it does its work. When you go to deploy, you connect to your target database and call Update-Database once, and it applies all pending migrations.
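For a sense of what there is to review, this is roughly the shape of the class that Add-Migration scaffolds for a newly added property in EF6. The migration name, the dbo.Customers table and the Email column are hypothetical, and the real generated file also comes with a designer file that is not shown here.

```csharp
using System.Data.Entity.Migrations;

// Sketch of a scaffolded EF6 migration for a new string property on a Customer class.
public partial class AddCustomerEmail : DbMigration
{
    // Applied by Update-Database when moving the schema forward.
    public override void Up()
    {
        AddColumn("dbo.Customers", "Email", c => c.String(maxLength: 200));
    }

    // Applied when rolling back to the previous migration.
    public override void Down()
    {
        DropColumn("dbo.Customers", "Email");
    }
}
```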
You can also enable auto-migrations which skips the Add-Migration step, but I always like to review the generated code. Call me old-fashioned ;)
It gets more complicated when you need support for views, SPROCS, and UDFs, but there are ways to do most anything you want to do. And, even though it's a 2 (3?) step process to get changes out to the DB, it's still much easier than changing the DB and code separately, by yourself.
Then, you can follow the steps here to set your deployment up so that once your EF is initialized on a connection to your production DB, it automatically applies the updates. Again, I would advise to do it yourself (via the package manager console) just to be safe but it's not necessary.
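One common way to get that behaviour in EF6 is the MigrateDatabaseToLatestVersion initializer. The sketch below uses a hypothetical ShopContext and assumes the Configuration class that Enable-Migrations normally generates; it is not necessarily what the linked steps describe.

```csharp
using System.Data.Entity;
using System.Data.Entity.Migrations;

// Hypothetical context; a real one would declare its DbSet properties here.
public class ShopContext : DbContext
{
}

// Normally generated by Enable-Migrations in the Migrations folder.
internal sealed class Configuration : DbMigrationsConfiguration<ShopContext>
{
    public Configuration()
    {
        // Only apply migrations that were scaffolded and reviewed explicitly.
        AutomaticMigrationsEnabled = false;
    }
}

public static class DatabaseStartup
{
    // Call once at application start, before the first ShopContext is used.
    public static void Configure()
    {
        Database.SetInitializer(
            new MigrateDatabaseToLatestVersion<ShopContext, Configuration>());
    }
}
```

With this in place, pending migrations are applied the first time the context touches the database in each application run.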
My database model (sometimes referred to as "context") is dynamically assembled at startup based on which services and/or plugins are installed. Plugins and services export their model definition fragments through my IoC container and the application core picks them up and runs them when the DbContext.OnModelCreating method is called.
The question is: Can I (and how do I) use Code First Migrations with this setup?
(below is more information on what I've tried and what particular problems are)
In my previous project, the database was inherited from some old code so we couldn't use any of the Code First database generation stuff anyway. We simply kept a long line of delta scripts and executed them manually on deploy (it was a single-host kind of project).
Now I'm starting a new project, and this time, the database is brand new, ready for Code First to play with. Initially, I was all excited about Code First Migrations, seemed like the way to go. Until I actually tried it. The initial attempt, quite obviously, failed due to the absence of an explicitly defined DbContext in my project.
So far, it looks like the only viable option is to manually code migrations, with which I am perfectly fine. However, it turns out that this is not as simple as just creating a few classes inherited from DbMigration.
After some experimentation on a small test project, I was able to find out that the migration autogenerator adds an implementation of IMigrationMetadata, which, among other things, contains a hash of my model as the values of the Source and Target properties. Presumably, this hash is then used to identify a path across migrations from the "current" state of the database (as recorded in the __MigrationHistory table) to the newest state as defined by the model in code. This totally makes sense, but...
Naturally, I have no idea where to get that hash for my model, which makes me unable to implement IMigrationMetadata on my migrations.
On the other hand, I see that the metadata interface is not included in the DbMigration class itself, which makes me think that it might be optional. It then follows that migrations can actually work without the hash values, but the question is - how?
All the information I could find on the internet is just simple, very basic tutorials. No information on how to create migrations manually (and whether it's even supported). No documentation on how it actually works and how to extend it. And it is not quite obvious from outside.
I am ready to resort to ILSpy at this point, but the whole EF is so complex that I fear I may not be able to find what I need in reasonable time.
Here are a few ideas that you could pull together to find a solution that works for you. I realize I mentioned some of these in our other thread, but I'm including them here for others reading this question.
Automatic migrations allow Code First to automatically calculate and apply changes to the database.
You can write your own code to generate and apply migrations. I've written a post about applying migrations, and the MigrationScaffolder class will help you create migrations.
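As a rough illustration of applying migrations from code, EF6 exposes the DbMigrator class. The sketch below assumes you can construct a DbMigrationsConfiguration for your dynamically assembled context; the MigrationRunner class and its ApplyPending method are hypothetical names.

```csharp
using System;
using System.Data.Entity.Migrations;

public static class MigrationRunner
{
    public static void ApplyPending(DbMigrationsConfiguration configuration)
    {
        var migrator = new DbMigrator(configuration);

        // List the code-based migrations that have not been applied to the database yet.
        foreach (var migrationId in migrator.GetPendingMigrations())
        {
            Console.WriteLine("Pending migration: " + migrationId);
        }

        // Bring the database up to the latest migration.
        migrator.Update();
    }
}
```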
When you run the project, an extra table is created in the database: the EdmMetadata table.
The hash is always created from the EdmMetadata entity and the current code first model. It is a SHA-256 hash stored in the EdmMetadata table of the database, and you can read it from that table.
The methodology to follow is:
1. Get the hash of the current model using
   var hash = GetModelHash(OldContext);
2. Check whether the model in the code (the new model) is compatible with the model in the database (the old model) using
   CompatibleWithModel(hash, CurrentContext, ObjectContext)
   This method returns a bool.
3. If it is not compatible, delete the existing tables in the database.
4. Create the new tables.
5. Save the current hash to the database.
6. Seed the data.
The code may look like:
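public void InitializeDatabase(T context)
{
    var objectContext = ((IObjectContextAdapter)context).ObjectContext;
    var modelHash = GetModelHash(objectContext);

    // Nothing to do when the database already matches the current model.
    if (CompatibleWithModel(modelHash, context, objectContext))
        return;

    // Otherwise rebuild the schema; note that this drops all existing data.
    DeleteExistingTables(objectContext);
    CreateTables(objectContext);
    SaveModelHashToDatabase(context, modelHash, objectContext);
    SeedData(context);
}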
Be sure that the class implements IDatabaseInitializer<T> where T : DbContext.
What is the best practice for upgrading the database using ORM (DevExpress XPO, NHibernate or MS Entity Framework)?
I'm starting a new project and have to pick an ORM. The development process requires releasing intermediate test builds quite often, and it is likely that each build will have changes in the database structure. Each new version has to upgrade the DB gently so that current data is kept.
For old solutions I would provide a set of SQL scripts for upgrading the database from v1 to v2, from v2 to v3, etc. and execute them sequentially.
But how is it going to work for ORM? Should I still write SQL scripts to upgrade the DB?
I understand that simply adding new fields wouldn't cause a problem (e.g. see the UpdateSchema() method for XPO), but what if I have to split a table and reallocate current records into 2 new tables?
I can't comment on the other ORMs, but I have used DevExpress XPO for a corporate treasury application since 2007. The schema changes a little with every release, but there have also been some big schema changes over the years. A somewhat extended version of the default XPO upgrade mechanism has comfortably catered for all the changes.
There is good basic information here about upgrading XPO applications.
DevExpress provide a DBUpdater tool to assist you with the task of upgrading production environments. You can extend this tool to cater for additional requirements. In my application, we have added some options for logging, preview with rollback, etc.
Each module has virtual UpdateDatabaseBeforeUpdateSchema() and UpdateDatabaseAfterUpdateSchema() methods. These give you significant control over the upgrade process.
As you mention, some of the upgrade will be handled automatically by XPO (e.g., adding a new column), but some things need additional control, such as initialising the new column with a default value for existing records.
For instance, let's say MyNewField has been added to the MyEntity XPO class in version 2.0 of your application, and it should default to a value of 3 for existing records. XPO will handle the creation of the new column, but existing records will be NULL. (If you specify a default value in the XPO class, it only applies to new records.) To correct the value for existing records, you would add something like the following to the entity module's overridden UpdateDatabaseAfterUpdateSchema():
public override void UpdateDatabaseAfterUpdateSchema()
{
    base.UpdateDatabaseAfterUpdateSchema();
    if (CurrentDBVersion < new Version(2, 0, 0, 0))
        ObjectSpace.GetSession().ExecuteNonQuery(
            "UPDATE [MyEntity] SET [MyNewField] = 3 WHERE [MyNewField] IS NULL");
}
(You could also use ObjectSpace.GetObjects<MyEntity>() and a foreach if you prefer to avoid the direct SQL.)
In your more extreme example of splitting a table in two, you can use the same method, but you would override UpdateDatabaseBeforeUpdateSchema() instead, run the SQL to split the table, let XPO perform any other schema updates and, if necessary, populate any default values in the UpdateDatabaseAfterUpdateSchema().
You will find that you bump into constraint problems, e.g. foreign key violations, so you might need to write some general routines such as DropAllForeignKeyConstraints() as part of UpdateDatabaseBeforeUpdateSchema(). Sometimes you find that XPO already provides something, sometimes not. Missing constraints and indexes will get regenerated in the schema update. (In my experience, switching a master data table's primary key turned out to be the hardest update routine to get right.)
By default the calls all happen in an SQL transaction so if anything fails it should all roll back.
The developers need to be aware of when a change to the domain model is likely to cause a problem with the underlying schema.
For testing, we keep a few old customer databases and run a bunch of before-and-after tests as part of the build process to make sure that existing customers are able to upgrade properly whatever version they are upgrading from. In production whenever we run into a problem upgrading, the problem data is added into this test library to prevent similar problems in the future.
We are dealing with major international companies and banks. The customers are quite happy with the result. In situations where a corporate's DBA needs to sign off on the changes, they don't seem to mind having a command line tool to do the upgrade rather than a script.
Most migration solutions can handle easy tasks, like adding a new column or relationship or removing one, but fail to work when you rename a column. (Is that an add, or a remove followed by an add that amounts to a rename? And what should happen to the data in that case?)
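Taking Entity Framework 6 code-based migrations as one example, the ambiguity can be resolved by stating the rename explicitly, so that the column's data is kept. If a generated migration contains a drop followed by an add for what was really a rename, it can be edited by hand into something like the sketch below; the migration name, table and column names are hypothetical.

```csharp
using System.Data.Entity.Migrations;

public partial class RenameCustomerSurname : DbMigration
{
    public override void Up()
    {
        // An explicit rename keeps the existing values; a drop plus an add would lose them.
        RenameColumn(table: "dbo.Customers", name: "Surname", newName: "LastName");
    }

    public override void Down()
    {
        RenameColumn(table: "dbo.Customers", name: "LastName", newName: "Surname");
    }
}
```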
All three solutions have basic migrations support; XPO even lets you run your own scripts as part of the process (to insert static/test/constant data, etc.).
There's also the MigratorDotNet project, which you can use so that you do not rely on any ORM-specific migration feature.
Personally, I would use auto migration only in dev/test environments and would keep a full set of upgrade scripts for running against a client-specific database, say to upgrade from v1 to v2.
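A hand-rolled set of upgrade scripts needs very little machinery. The sketch below only works out which scripts to run, in order, to get from the version recorded in the client database to the target version; executing the SQL and storing the new version are left out. UpgradeStep and UpgradePlanner are hypothetical names, and integer version numbers are an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class UpgradeStep
{
    public UpgradeStep(int fromVersion, int toVersion, string sql)
    {
        if (toVersion <= fromVersion)
            throw new ArgumentException("A step must move to a higher version.");
        FromVersion = fromVersion;
        ToVersion = toVersion;
        Sql = sql;
    }

    public int FromVersion { get; }
    public int ToVersion { get; }
    public string Sql { get; }
}

public static class UpgradePlanner
{
    // Returns the steps to run, in order, to go from currentVersion to targetVersion.
    public static IReadOnlyList<UpgradeStep> Plan(
        IEnumerable<UpgradeStep> steps, int currentVersion, int targetVersion)
    {
        var byFromVersion = steps.ToDictionary(s => s.FromVersion);
        var plan = new List<UpgradeStep>();
        var version = currentVersion;

        while (version < targetVersion)
        {
            if (!byFromVersion.TryGetValue(version, out var step))
                throw new InvalidOperationException(
                    "No upgrade script starts at version " + version + ".");
            plan.Add(step);
            version = step.ToVersion;
        }

        if (version != targetVersion)
            throw new InvalidOperationException(
                "The scripts do not lead exactly to version " + targetVersion + ".");
        return plan;
    }
}
```

Running each planned step inside a transaction and recording the new version after each one keeps a failed upgrade from leaving the database half-migrated.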
How is it going to work for ORM? Should I still write SQL scripts to upgrade the DB?
A clear answer to this question can be found in the Programmers Stack Exchange thread "What are the criteria for evaluating an ORM for .NET?". There I found a simple answer to your question, and it matches my experience with ORMs while developing projects with Entity Framework and CodeSmith ORM templates.
How does the ORM manage changes in the data model? What if I have to split a table and reallocate current records into 2 new tables?
Some can update the DB automatically within a certain measure, others don't do anything and you'll have to do the dirty work yourself; others provide a framework for handling change that lets you control database updates. That means every couple of days someone needs to spend an hour updating the model to add a table or change datatypes that are changing.
Ref:
https://softwareengineering.stackexchange.com/questions/6543/what-are-the-benefits-of-using-database-abstraction-by-orm
https://softwareengineering.stackexchange.com/questions/41739/best-arguments-for-against-introducing-orm-technology-into-a-companies-dev-proce/41833#41833
If you ask - what is the best practice for upgrading the db using ORM - my answer is: Don't use it if your application is more than a hobbyist app.
There are a lot of scenarios where many ORMs are unable to support your specific database needs, e.g. creating stored procedures, indices and views, or even indexed views/materialized tables, without writing SQL scripts. Problems like adding a new non-nullable column to an existing table are much harder to solve in ORM migration code than by writing SQL scripts.
Current tools like Visual Studio Data Tools handle these kinds of problems far better.