How does EF apply migrations to a database? - entity-framework-core

I have an app with 4 environments: DEV, QA, Stage and Production.
When I deployed to the Stage environment, the order of the migrations that were applied was not respected. What could cause this?
DEV + QA Migration History table:
20220607093209_UpdateEquipmentTypesNameInQuotes
20220607114458_UpdateEquipmentTypesNameInOrders
20220607121426_UpdateEquipmentTypesNameInQuotesError
20220617074932_ChangeQuoteStatusPriority
20220617133432_RevertUpdateEquipmentTypesNameInQuotes
20220617133916_RevertUpdateEquipmentTypesNameInOrders
20220617134335_RevertUpdateEquipmentTypesNameInQuotesError
20220620095114_UpdateNameEquipmentTypeNameInQuotes2
20220620103237_UpdateNameEquipmentTypeNameInQuotesError2
20220620103455_UpdateNameEquipmentTypeNameInOrders2
Stage + Production:
20220620095114_UpdateNameEquipmentTypeNameInQuotes2
20220620103237_UpdateNameEquipmentTypeNameInQuotesError2
20220620103455_UpdateNameEquipmentTypeNameInOrders2
20220607093209_UpdateEquipmentTypesNameInQuotes
20220607114458_UpdateEquipmentTypesNameInOrders
20220607121426_UpdateEquipmentTypesNameInQuotesError
20220617074932_ChangeQuoteStatusPriority
20220617133432_RevertUpdateEquipmentTypesNameInQuotes
20220617133916_RevertUpdateEquipmentTypesNameInOrders
20220617134335_RevertUpdateEquipmentTypesNameInQuotesError
As you can see, in the Stage + Production environments, the order of the migrations is somewhat mixed up.
Maybe one thing to note is that we did an upgrade of EF Core versions. We went from 3.1.25 to 6.0.6.

The order of the rows in the table doesn't necessarily represent the order in which the migrations were applied. If they were all applied in one operation, SQL Server doesn't guarantee the order the rows come back in when you query the table without an ORDER BY. EF Core itself always applies pending migrations sorted by their migration ID (the timestamp prefix), regardless of how the rows happen to be stored or returned.
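If you want to double-check what EF considers applied, you can list the migrations through the context instead of reading the table directly (ordering the history table by MigrationId gives the same result). A minimal sketch, assuming a MyDbContext you can construct directly:

using System;
using Microsoft.EntityFrameworkCore;

using var context = new MyDbContext();

// Migrations defined in the assembly, in migration ID (timestamp) order
foreach (var id in context.Database.GetMigrations())
    Console.WriteLine($"defined: {id}");

// Migrations recorded in __EFMigrationsHistory for this database
foreach (var id in context.Database.GetAppliedMigrations())
    Console.WriteLine($"applied: {id}");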

Related

EF db migrations with rolling/canary deployments

I have a .NET application that uses EF Core as its ORM, and all database modifications are done using EF DB migrations.
In production the application is hosted in the cloud on multiple VMs; after all testing is done, a rolling deployment is initiated that takes one VM at a time, deploys the new application, and so on.
The database itself is hosted on a managed DB service (like AWS RDS or Azure SQL) with a multi-AZ/replication setup.
The main goal is to make sure there is no downtime (zero downtime) and to roll back if any issue happens (or to manually distribute canary-weighted requests accordingly).
The main issue is: if the application is successfully deployed to one instance and that instance receives a connection, the database is migrated to the new version, causing requests on all the other instances to fail (as the old instances are still running against a model that no longer matches the migrated database).
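For context, the behaviour described above usually comes from calling Database.Migrate() at startup. A minimal sketch of gating that call so that only a designated pipeline step applies migrations, never a normal rolling instance (ASP.NET Core minimal hosting assumed; AppDbContext and the APPLY_MIGRATIONS setting are illustrative names):

using Microsoft.EntityFrameworkCore;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddDbContext<AppDbContext>(); // connection string configured elsewhere
var app = builder.Build();

// Apply pending migrations only when this process is explicitly told to,
// e.g. a one-off migration job in the pipeline, not every web instance.
if (app.Configuration.GetValue<bool>("APPLY_MIGRATIONS"))
{
    using var scope = app.Services.CreateScope();
    scope.ServiceProvider.GetRequiredService<AppDbContext>().Database.Migrate();
}

app.Run();

Another common option is to keep migrations out of the application entirely: generate SQL with dotnet ef migrations script --idempotent and apply it as its own deployment step before the rolling deployment starts.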

What are the limitations of using DACPAC in SQL Server deployments

I'm exploring the DACPAC feature for SQL Server database deployments.
I'm using the Extract action to get a DACPAC generated from the source and the Publish action to deploy it to the target.
Extract
sqlpackage.exe /Action:Extract /SourceDatabaseName:%sourcedatabaseName% /SourceServerName:%sourceserverName% /TargetFile:%FilePath%
Publish
sqlpackage.exe /Action:Publish /SourceFile:%SourceFile% /TargetServerName:%serverName% /TargetDatabaseName:%databaseName%
When new columns are introduced in the source table, the DACPAC deployment works fine and the new columns are reflected in the target.
But when I drop columns in the source and do the DACPAC deployment, the changes are not reflected: the column is not dropped in the target. Is it because I have data in that column?
In another scenario, I have some test tables and test stored procedures in the source; when I generate the DACPAC and deploy it, the same test tables and stored procedures end up deployed in the target. Is there a way to restrict this?
So I would like to understand: what are the limitations of using DACPAC?
Using SQL Server 2019.
Dropping a column from a non-empty table could lead to data loss, so the deployment is blocked by default.
This can be overridden with: /p:BlockOnPossibleDataLoss=false
DacDeployOptions.BlockOnPossibleDataLoss property:
Gets or sets a Boolean that specifies whether deployment should stop if the operation could cause data loss.
True to stop deployment if possible data loss is detected; otherwise, false. Default is true.
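For example, if the data loss is acceptable, the publish command from the question can be extended with that property:

sqlpackage.exe /Action:Publish /SourceFile:%SourceFile% /TargetServerName:%serverName% /TargetDatabaseName:%databaseName% /p:BlockOnPossibleDataLoss=false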

Best practice for running database schema migrations

Build servers are generally detached from the VPC running the instances, whether that's Cloud Build on GCP or one of the many CI tools out there (CircleCI, Codeship, etc.), so running DB schema updates is particularly challenging.
So it makes me wonder... when is the best place to run database schema migrations?
From my perspective, there are four opportunities to automatically run schema migrations or seeds within a CD pipeline:
1. Within the build phase
2. On instance startup
3. Via a warm-up script (synchronously or asynchronously)
4. Via an endpoint, either automatically or manually called post deployment
The primary issue with option 1 is security. With Google Cloud SQL and Google Cloud Build, it's been possible for me to run schema migrations/seeds (with much struggle) via a build step and a SQL proxy. To be honest, it was a total ball-ache to set up... but it works.
My latest project is using MongoDB, for which I've wired in migrate-mongo in case I ever need to move some data around or seed some data. Unfortunately there is no equivalent of the SQL proxy to securely connect MongoDB (Atlas) to Cloud Build (or any other CI tool), as the build doesn't run in the instance's VPC. So that's a dead end in my eyes.
I'm therefore warming (no pun intended) to the warm-up script concept.
With App Engine, the warm-up script is called prior to traffic being served, and on the host which would already have access via the VPC. The warmup script is meant to be used for opening up database connections to speed up connectivity, but assuming there are no outstanding migrations, it'd be doing exactly that - a very light-weight select statement.
Can anyone think of any issues with this approach?
Option 4 is also suitable (it's essentially the same thing). There may be a bit more protection required on these endpoints though - especially if a "down" migration script exists(!)
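For what it's worth, a sketch of what that warm-up handler could look like with a recent migrate-mongo version (Express and a configured migrate-mongo-config.js are assumed; the route matches App Engine's warm-up request, everything else is illustrative):

const express = require('express');
const { database, up } = require('migrate-mongo');

const app = express();

// App Engine calls /_ah/warmup before sending traffic to a new instance
app.get('/_ah/warmup', async (req, res) => {
  try {
    const { db, client } = await database.connect();
    const applied = await up(db, client); // applies any pending migrations
    await client.close();
    res.status(200).send(`migrations applied: ${applied.length}`);
  } catch (err) {
    console.error(err);
    res.status(500).send('migration failed');
  }
});

app.listen(process.env.PORT || 8080);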
It's hard to answer because it's an opinion-based question!
Here are my thoughts on your propositions:
1. Build phase: the best solution for me. Of course, you have to take care to only add fields and not to remove existing schema fields. That way you can update your schema during the build phase, then deploy; the new deployment will use the new schema and the obsolete fields will simply no longer be used. On the next schema update, you can delete these obsolete fields and clean up your schema.
2. Instance startup: this will hurt your cold-start performance. It's not a suitable solution.
3. Warm-up script: same remark as before, and in addition you're tied to App Engine's infrastructure and way of working.
4. Endpoint: no real advantage compared to solution 1.
About security, Cloud Build will be able to work with worker pools soon; it's still in alpha, but I expect a broader release in the coming months.

Manually marking flyway migration as completed

A couple of days ago we made a mistake. We have a Kubernetes cluster with a pipeline that times out after 25 minutes, meaning that if the deployment isn't done in 25 minutes, it fails. We deployed a Flyway migration that involves queries that run for more than an hour. Stupid, I know. We have now run the queries from the migration manually, and we want to manually mark the Flyway migration as done, otherwise redeployment won't work. Is there a way this can be done?
So we ended up manually inserting a migration row in the database. Flyway keeps a table called flyway_schema_history in your schema; if you manually insert a row there, it will skip the migration. The only tricky part is calculating the checksum. You can either run the migration locally, take the checksum from there and insert it into the live database, or recalculate the checksum on your own.
You will find how they calculate the checksum in the AbstractLoadableResource class.
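For reference, a hedged sketch of such an insert (column names as used by recent Flyway versions; check them against your own flyway_schema_history table, and every value below, in particular the checksum, is a placeholder you must replace):

INSERT INTO flyway_schema_history
    (installed_rank, version, description, type, script, checksum, installed_by, installed_on, execution_time, success)
VALUES
    (42,                              -- next free rank in the table
     '7.1',                           -- version from the migration file name
     'long running queries',          -- description from the file name
     'SQL',
     'V7.1__long_running_queries.sql',
     123456789,                       -- the real checksum, taken from a local run
     'deploy_user',
     CURRENT_TIMESTAMP,
     0,                               -- execution time in ms
     1);                              -- success flag (1/true depending on the database)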

Can an Entity Framework migration be run without performing the Seed

I am using Entity Framework (version 6.1.3) - Code First - for my application.
The application is hosted on the Azure platform, and uses Azure SQL Databases.
I have a database instance in two different regions, and I am using the Sync Preview to keep the data in sync.
Since the sync takes care of ensuring the data is kept synchronised, when I run a migration, I'd like the schema changes and seed to happen in only one database, and the schema changes only (with no seed) in the other.
Is this possible with the EF tooling, or do I need to move the seeding out to a manual script?
This is possible by spreading out your deployment:
Worker role 1 updates your database and runs the seed.
If, after the sync, worker role 2 connects to your other database, it will see that the migration has already taken place.
One way to trigger this is to disable automatic migrations on all but one worker role. The problem is that you potentially have to deal with downtime/issues while part of your application landscape has been updated/migrated but your database is still syncing.
(A worker role can also be replaced by a WebJob, a website, etc.)
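A hedged sketch of one way to wire that up in EF 6, using an explicit DbMigrator gated by a setting rather than automatic migrations (MyContext, the Migrations.Configuration class and the RunMigrations app setting are illustrative; DbMigrator.Update also runs the Seed method from your migrations configuration):

using System.Configuration;
using System.Data.Entity;
using System.Data.Entity.Migrations;

public static class MigrationRunner
{
    // Call this once during role/app startup.
    public static void Run()
    {
        var runMigrations = bool.Parse(ConfigurationManager.AppSettings["RunMigrations"] ?? "false");

        if (runMigrations)
        {
            // Applies pending migrations and then executes Configuration.Seed
            new DbMigrator(new Migrations.Configuration()).Update();
        }
        else
        {
            // This role never migrates or seeds; it uses the database as-is.
            Database.SetInitializer<MyContext>(null);
        }
    }
}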