What are the limitations of using DACPAC in SQL Server deployments? - sql-server-2019

I'm exploring the DACPAC feature for SQL Server database deployments.
I'm using the EXTRACT action to generate the DACPAC from the source database and the PUBLISH action to deploy it to the target.
Extract
sqlpackage.exe /Action:Extract /SourceDatabaseName:%sourcedatabaseName% /SourceServerName:%sourceserverName% /TargetFile:%FilePath%
Publish
sqlpackage.exe /Action:Publish /SourceFile:%SourceFile% /TargetServerName:%serverName% /TargetDatabaseName:%databaseName%
When new columns are introduced in the source table and I do the DACPAC deployment, it works fine: the new columns are reflected in the target.
But when I drop columns in the source and do the DACPAC deployment, the change is not reflected; the column is not dropped in the target. Is it because I have data in that column?
In another scenario, I have some test tables and test stored procedures in the source. When I generate the DACPAC and deploy it, those test tables and stored procedures are also deployed to the target. Is there a way to restrict this?
So I would like to understand: what are all the limitations of using DACPAC?
Using SQL Server 2019.

Dropping a column from a non-empty table could lead to data loss.
This check can be overridden with: /p:BlockOnPossibleDataLoss=false
DacDeployOptions.BlockOnPossibleDataLoss Property
Get or set boolean that specifies whether deployment should stop if the operation could cause data loss.
True to stop deployment if possible data loss is detected; otherwise, false. Default is true.
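For example, reusing the Publish command from the question, the override would look like this (same placeholders as above):
sqlpackage.exe /Action:Publish /SourceFile:%SourceFile% /TargetServerName:%serverName% /TargetDatabaseName:%databaseName% /p:BlockOnPossibleDataLoss=false
Be aware that with the block disabled, the publish will go ahead and drop the column together with its data on the target.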

Related

AWS DMS task fails to retrieve tables

I'm trying to migrate existing data and replicate ongoing changes. The source database is PostgreSQL, managed by AWS; the target is Kafka. I'm facing the issue below:
Last Error No tables were found at task initialization. Either the selected table(s) or schemas(s) no longer exist or no match was found for the table selection pattern(s). If you would like to start a Task that does not initially capture any tables, set Task Setting FailOnNoTablesCaptured to false and restart task. Stop Reason FATAL_ERROR Error Level FATAL
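If you go the FailOnNoTablesCaptured route that the error suggests, that is a task-settings change. A rough sketch with the AWS CLI, where %TASK_ARN% is a placeholder and I'm assuming the flag sits under the ErrorBehavior section of the task-settings JSON (if partial settings aren't accepted, pull the full settings with describe-replication-tasks, edit them, and pass them back):
rem Assumption: FailOnNoTablesCaptured lives under "ErrorBehavior"; stop the task first if it is running.
aws dms modify-replication-task --replication-task-arn %TASK_ARN% --replication-task-settings "{\"ErrorBehavior\": {\"FailOnNoTablesCaptured\": false}}"
rem Restart the task; reload-target re-runs the full load.
aws dms start-replication-task --replication-task-arn %TASK_ARN% --start-replication-task-type reload-target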

Azure DevOps YAML - Update dynamic # of DBs

Currently, I am using Azure DevOps YAML pipelines to run the Flyway CLI to perform migrations against a single database. When code gets pushed, the YAML pipeline triggers and runs Flyway, which updates the DB.
We are planning to create multiple instances of this DB. However, the DB instances are created dynamically. We plan to store connection strings in a master DB + Key Vault.
Is it possible to achieve the following?
Code gets committed
The YAML pipeline queries the master DB and gets X connection strings
The pipeline loops over the X entries and runs Flyway for each connection string
All DBs get updated
I do not think there is a built-in way. An alternative I can think of is to create a console application that does the querying and calls Flyway, and have the YAML pipeline just call this on the build server.
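For what it's worth, here is a rough sketch of what that query-and-loop step could look like as a plain script, whether run directly in a pipeline step or wrapped in such a console application. The master-DB table and column names are made up for illustration:
rem Hypothetical: pull one JDBC URL per tenant database from the master DB, then run Flyway against each.
rem dbo.TenantDatabases and JdbcUrl are placeholder names; credentials would come from Key Vault.
for /f "usebackq delims=" %%u in (`sqlcmd -S %masterServer% -d %masterDb% -Q "SET NOCOUNT ON; SELECT JdbcUrl FROM dbo.TenantDatabases" -h -1 -W`) do (
    flyway -url="%%u" -locations=filesystem:sql migrate
)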

Best practice for running database schema migrations

Build servers are generally detached from the VPC running the application instances, whether you use Cloud Build on GCP or one of the many CI tools out there (CircleCI, Codeship, etc.), so running DB schema updates is particularly challenging.
So it makes me wonder... when is the best place to run database schema migrations?
From my perspective, there are four opportunities to automatically run schema migrations or seeds within a CD pipeline:
1. Within the build phase
2. On instance startup
3. Via a warm-up script (synchronously or asynchronously)
4. Via an endpoint, either automatically or manually called post-deployment
The primary issue with option 1 is security. With Google Cloud SQL / Google Cloud Build, it's been possible for me to run (with much struggle) schema migrations/seeds via a build step and the Cloud SQL proxy. To be honest, it was a total ball-ache to set up... but it works.
My latest project uses MongoDB, for which I've wired in migrate-mongo in case I ever need to move some data around or seed some data. Unfortunately there is no equivalent of the SQL proxy to securely connect MongoDB (Atlas) to Cloud Build (or any other CI tool), as the build doesn't run in the instance's VPC. So it's a dead end in my eyes.
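For context, the migration step itself is trivial once connectivity exists; assuming a standard migrate-mongo setup whose config file already points at the Atlas cluster, it boils down to:
rem Assumes migrate-mongo is installed in the project and migrate-mongo-config.js points at the cluster.
npx migrate-mongo status
npx migrate-mongo up
It is purely the network path from the CI runner that is the blocker.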
I'm therefore warming (no pun intended) to the warm-up script concept.
With App Engine, the warm-up request is called prior to traffic being served, and on the host, which already has access via the VPC. The warm-up script is meant to be used for opening up database connections to speed up connectivity, but assuming there are no outstanding migrations, the migration check would be doing exactly that: a very lightweight select statement.
Can anyone think of any issues with this approach?
Option 4 is also suitable (it's essentially the same thing). There may be a bit more protection required on these endpoints though - especially if a "down" migration script exists(!)
It's hard to answer because it's an opinion-based question!
Here are my thoughts on your propositions:
1. It's the best solution for me. Of course, you have to take care to only add fields and not to delete or remove existing schema fields. This way you can update your schema during the build phase, then deploy. The new deployment will use the new schema, and the obsolete fields will no longer be used. On the next schema update, you will be able to delete these obsolete fields and clean up your schema.
2. This solution will hurt your cold-start performance. It's not a suitable solution.
3. Same remark as before, with the addition of being tied to App Engine's infrastructure and way of working.
4. No real advantage compared to solution 1.
About security, Cloud Build should be able to work with worker pools soon. It's not released yet, but I expect an alpha release of it in the next month.

Separate Spark AWS Glue Metastore entries by environment (test vs prod)

I plan to run my Spark SQL jobs on AWS EMR, and I plan to use the AWS Glue Metastore to persist tables' schema and file-location metadata. The problem I'm facing is that I'm not sure how to isolate our test and prod environments. There are times when I might add a new column to a table, and I want to test that logic in the test environment before making the change in production. It seems that the Glue Metastore only supports one entry per database-table pair, which means that test and prod would point to the same Glue Metastore record, so whatever change I make in the test environment would also immediately impact prod. How have others tackled this issue?

Can an Entity Framework migration be run without performing the Seed?

I am using Entity Framework (version 6.1.3) - Code First - for my application.
The application is hosted on the Azure platform, and uses Azure SQL Databases.
I have a database instance in two different regions, and I am using the Sync Preview to keep the data in sync.
Since the sync takes care of ensuring the data is kept synchronised, when I run a migration, I'd like the schema changes and seed to happen in only one database, and the schema changes only (with no seed) in the other.
Is this possible with the EF tooling, or do I need to move the seeding out to a manual script?
This is possible by spreading out your deployment:
Worker role 1 updates your database and runs the seed.
If, after the sync, worker role 2 connects to your other database, it will see that the migration has already taken place.
One way to arrange this is to disable automatic migrations on all but one worker role. The problem is that you potentially have to deal with downtime/issues while part of your application landscape has been updated/migrated but your database is still syncing.
(Worker role can also be replaced by a web job, website, etc.)
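A different technique from the one described above, offered only as a related option: instead of toggling automatic migrations per role, a deployment step can apply the migration, and therefore the Seed, explicitly against only the primary database using EF6's migrate.exe. The assembly name, path, and connection string below are placeholders:
rem Hypothetical: apply the migration (including its Seed) explicitly against the primary database only;
rem per the answer above, the other database then sees the migration as already applied after the sync.
migrate.exe MyApp.Data.dll /startUpDirectory:.\bin /connectionString:"%PrimaryDbConnectionString%" /connectionProviderName:"System.Data.SqlClient"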