JPA live db schema migration support tools?

There are quite a few Stack Overflow threads regarding JPA DB schema migration and assorted tools. However, none of them seems to even consider the service downtime, which can be too long when doing the suggested offline schema migrations on huge databases.
So here is my first thought:
Let's assume I want to refactor one JPA entity into two JPA entities, i.e. "Truck" into "Truck" and "Engine" (move the Engine attributes into a separate entity). The migration plan could look like this:
create the two new JPA entities "TruckNew" and "Engine".
adjust the DAO (or whatever) accessing "Truck", "TruckNew" and "Engine" to use "Truck" as a fallback for records that have not been migrated yet.
run a separate data-migration thread that converts entities from "Truck" to "TruckNew" and "Engine", doing the migration in the background, without downtime and transparently to the rest of the application(s).
clean up, then rename "TruckNew" to "Truck" (see the sketch below).
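Purely as an illustration of the fallback-plus-background-migration idea, a minimal sketch of the migration thread could look like the following (this is not from any tool; the entity constructors and accessors are assumptions):

import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;

public class TruckMigrator implements Runnable {

    private final EntityManagerFactory emf;

    public TruckMigrator(EntityManagerFactory emf) {
        this.emf = emf;
    }

    @Override
    public void run() {
        EntityManager em = emf.createEntityManager();
        List<Truck> batch;
        do {
            em.getTransaction().begin();
            // small batches keep transactions short, so the live application is not blocked
            batch = em.createQuery("SELECT t FROM Truck t", Truck.class)
                      .setMaxResults(100)
                      .getResultList();
            for (Truck old : batch) {
                Engine engine = new Engine(old.getEnginePower(), old.getEngineType());
                em.persist(engine);
                em.persist(new TruckNew(old.getId(), old.getName(), engine));
                em.remove(old); // migrated rows vanish from the fallback table
            }
            em.getTransaction().commit();
        } while (!batch.isEmpty());
        em.close();
    }
}

The DAO keeps falling back to "Truck" until this loop has drained the old table; only then does the cleanup/rename step run.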
Now the question: are there any support tools for such a task? Any JPA provider features that take care of at least part of that work?

I see the idea and it looks possible, but wouldn't it be a better approach to clone your database to a migration_db? Migrate your schema using tools/scripts, then test the modifications using unit tests with the new code. Lastly, point your environment configuration to the new installation of the migrated code base with the updated entities.
One downside is that transactions that were not yet replicated to the cloned database would be lost, but that can happen with just about any transactional database as well.

Related

How to keep a history of edits to entities in a JPA application

A Java EE and JPA application needs to keep a record of all the changes made by the user.
Currently, for all the entities, there are fields recording the createdBy and lastEditedBy properties. Yet the requirement of recording all edits cannot be met with those properties.
What is the best way to record the history of all edits for a particular entity?
I do not use Spring.
You can use JaVers, which is a database- and framework-agnostic tool for maintaining operation history.
There are two big differences between JaVers and Envers:
Envers is a Hibernate plugin. It has good integration with Hibernate, but you can use it only with traditional SQL databases. If you choose a NoSQL database, or SQL with another persistence framework (for example jOOQ), Envers is not an option.
By contrast, JaVers can be used with any kind of database and any kind of persistence framework. For now, JaVers comes with repository implementations for MongoDB and popular SQL databases. Other databases (like Cassandra or Elasticsearch) might be added in the future.
Envers' audit model is table-oriented. You can think of Envers as a tool for versioning database records.
JaVers' audit model is object-oriented. It is all about object Snapshots. JaVers saves them to a single table (or a collection in Mongo) as JSON documents with a unified structure.
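For reference, the core JaVers API is only a few calls. A minimal sketch, assuming a Truck class whose id field carries @Id so JaVers can identify instances (the class and values are illustrative):

import java.util.List;
import javax.persistence.Id;
import org.javers.core.Javers;
import org.javers.core.JaversBuilder;
import org.javers.core.metamodel.object.CdoSnapshot;
import org.javers.repository.jql.QueryBuilder;

public class AuditExample {

    static class Truck {
        @Id Long id;
        String model;
        Truck(Long id, String model) { this.id = id; this.model = model; }
    }

    public static void main(String[] args) {
        Javers javers = JaversBuilder.javers().build(); // in-memory repository by default
        Truck truck = new Truck(1L, "Volvo");
        javers.commit("someAuthor", truck);   // snapshot version 1
        truck.model = "Scania";
        javers.commit("someAuthor", truck);   // snapshot version 2
        List<CdoSnapshot> history = javers.findSnapshots(
                QueryBuilder.byInstanceId(1L, Truck.class).build());
        System.out.println(history.size());   // prints 2
    }
}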
You can also achieve this using triggers and storing object differences.
Edit:
JaversAuditableAspect works for any kind of repository. It defines a pointcut on any method annotated with the method-level @JaversAuditable annotation. Choose it if you have repositories that are not managed by Spring Data.

@Bean
public JaversAuditableAspect javersAuditableAspect() {
    return new JaversAuditableAspect(javers(), authorProvider(), commitPropertiesProvider());
}
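Wiring it up then comes down to annotating the repository method; a minimal sketch (TruckRepository and its persistence code are illustrative):

import org.javers.spring.annotation.JaversAuditable;

public class TruckRepository {

    // Each call to save() is intercepted by JaversAuditableAspect,
    // which commits the passed object to the JaVers repository.
    @JaversAuditable
    public void save(Truck truck) {
        // ... persist the truck with whatever persistence API you use
    }
}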
You can use Hibernate's Envers to audit your entities. It allows you to keep track of ALL changes made to entities, even deleted ones. Most probably you are already using Hibernate (as your JPA provider), so integration should be no problem.
https://hibernate.org/orm/envers/
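Getting started with Envers is mostly annotation-driven; a minimal sketch, assuming Hibernate is the JPA provider (the Truck entity is illustrative):

import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.envers.Audited;

@Entity
@Audited // Envers now writes a revision entry for every insert, update and delete
public class Truck {
    @Id
    private Long id;
    private String model;
    // getters and setters omitted
}

Reading the history back goes through an AuditReader, e.g. the state of a truck at revision 5:

AuditReader reader = AuditReaderFactory.get(entityManager);
Truck historic = reader.find(Truck.class, truckId, 5);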

What's the point of running an EF migration when you can run SQL directly in the database?

How to create View (SQL) from Entity Framework in ABP Framework
I'm not allowed to post comments because of reputation, so I'm just trying to get more information on connecting a database to Entity Framework without having to switch to a code-first development style. See the selected answer's response (he told the OP to basically do the same thing he was going to do in the DB, but with EF, and then added an extra step where EF "...ignores..." the previous instructions)...
I want to create tables and design the database directly in SQL, and have the C# library just read/write the table values (kind of like how Dapper functions, where it isn't replacing your database, just working alongside it).
The tutorials don't talk about how to integrate your database with your project. They either brush over the subject, ignore it completely, or discuss how to replace it.
I don't want to do any EF migrations (I don't want/need to destroy/create the database every time I decide to run, duplicate, or transfer the project). Any and all database back-tracking (backup/restore) should be done with and through SQL (within my work environment).
Just to be clear on exactly what I'm trying to learn:
How does somebody who specializes in database administration (building database schemas, managing and monitoring data, with an existing database full of established data) connect it to a project to fetch data (again, specifically referencing Dapper's Query functionality)?
I want to integrate and design microservices; some may share the same database connection or rely on another. But I simply want to read data into clean, strongly-typed entity classes, and maybe deal with insert/update somewhere else if I have to.
I would prefer to use Dapper instead of EF, but ABP is so heavily integrated with EF's design that it's more of a headache to avoid it than to just go along with it.
You should be able to map EF under ABP the same way as any other project using DB-first configuration.
The consistent approach I use for EF (DB-first):
Define entities to match the table/view structure.
Define configuration classes extending EntityTypeConfiguration<TEntity> with the associated ToTable(), HasKey(), and any HasMany/HasRequired/HasOptional for relationships as needed.
In DbContext.OnModelCreating, call modelBuilder.Configurations.AddFromAssembly(GetType().Assembly); to load all entity configurations (assuming the DbContext is in the same assembly as the models/configurations; otherwise substitute GetType().Assembly to point at the entity assembly).
Turn off migrations. In the DbContext constructor: Database.SetInitializer<MyDbContext>(null);
EF offers a lot more than simply mapping tables to classes. By mapping relationships between entities, EF can help generate optimized queries for retrieving data across those related entities. This can allow you to flatten data structures without returning unnecessary data, replace the need for views, and generally reduce the amount of data coming across the wire from the database to the application server.

Development process for Code First Entity Framework and SQL Server Data Tools Database Projects

I have been using Database First Entity Framework (EDMX) and SQL Server Data Tools Database Projects in combination very successfully: change the schema in the database and 'Update Model from Database' to get the changes into the EDMX. I see, though, that Entity Framework 7 will be dropping the EDMX format, and I am looking for a new process that will allow me to use Code First in combination with Database Projects.
Lots of my existing development and deployment processes rely on having a database project that contains the schema. This goes in source control, is deployed along with the code, and is used to update the production database, complete with data migration using pre- and post-deployment scripts. I would be reluctant to drop it.
I would be keen to split one big EDMX into many smaller models as part of this work. This will mean multiple Code First models referencing the same database.
Assuming that I have an existing database and a database project to go with it - I am thinking that I would start by using the following wizard to create an initial set of entity and context classes - I would do this for each of the models.
Add | New Item... | Visual C# Items | Data | ADO.NET Entity Data Model | Code first from database
My problem is - where do I go from there? How do I handle schema changes? As long as I can get the database schema updated, I can use a schema compare operation to get the changes into the project.
These are the options that I am considering.
Make changes in the database and use the wizard from above to regenerate. I guess that I would need to keep any modifications to the entity and/or context classes in partial classes so that they do not get overwritten. Automating this with a list of tables etc. to include would be handy; PowerShell or T4 templates, maybe? SqlSharpener (suggested by Keith in comments) looks like it might help here. I would also look at disabling all but the checks for database existence and schema compatibility, as suggested by Steve Green in the comments.
Make changes in code and use migrations to get those changes applied to the database. From what I understand, models that do not map cleanly to database schemas (mine don't) might pose problems. I also see some complaints on the net that migrations do not cover all database object types; this was also my experience when I played around with Code First a while back (unique constraints, I think, were not covered). Has this improved in Entity Framework 7?
Make changes in the database and then use migrations as a kind of comparison between code and the database. See what the differences are and adjust the code to suit. Keep going until there are no differences.
Make changes manually in both code and the database. Obviously, this is not very appealing.
Which of these would be best? Is there anything that I would need to know before trying to implement it? Are there any other, better options?
So the path that we ended up taking was to create some T4 templates that generate both a DbContext and our entities. We provide the entity T4 a list of tables from which to generate entities and have a syntax to indicate that the entity based on one table should inherit from the entity based on another. Custom code goes in partial classes. So our solution looks most like my option 1 from above.
Also, we started out generating fluent configuration in OnModelCreating in the DbContext, but have switched to using attributes on the entities (where attributes exist; HasPrecision was one we had to keep fluent configuration for). We found that it is more concise and easier to locate the configuration for a property when it is right there decorating that property.

Change Schema of Entity Framework

I'm using Entity Framework 5 on an ASP.NET MVC 4 web site I'm developing.
Because I am using shared hosting, which charges by the number of databases I use, I would like to run a test site alongside my production site.
I have two problems:
1) I use Code First and Database Migrations. The migration classes seem to embed the dbo schema inside the names of the tables.
How can I change the schema according to the test/production flag?
2) How can I change the schema from which EF selects data?
Thank you,
Ido.
Both migrations and EF take the schema from the mapping, so if you want to change the schema you must update your mapping to use:
modelBuilder.Entity<MyEntity>().ToTable("MyTable", "MySchema");
and control the value of MySchema from configuration, but this is a really bad idea. One day you will forget to change the value and break your production. Use a local database for development and testing.
As already said: use identical databases (structurally) for development, test and production.
The goal of schemas is to group database objects, as we do with namespaces in e.g. C#, or to simplify permissions for groups of database objects, not to identify database stages. By using them for the latter you also make it much harder, if not impossible, to use schemas appropriately. See for instance this MSDN white paper.
It is much easier to use some database name conventions to indicate their purpose.

DevExpress XPO vs NHibernate vs Entity Framework: database upgrading issue

What is the best practice for upgrading the database using an ORM (DevExpress XPO, NHibernate or MS Entity Framework)?
I'm starting a new project and have to pick an ORM. The development process requires releasing intermediate test builds quite often, and it is likely that each build will have changes in the database structure. Each new version has to upgrade the DB gently to keep the current data.
For old solutions I would provide a set of SQL scripts for upgrading the database from v1 to v2, from v2 to v3, etc. and execute them sequentially.
But how is that going to work with an ORM? Should I still write SQL scripts to upgrade the DB?
I understand that simply adding new fields wouldn't cause a problem (e.g. see the UpdateSchema() method for XPO), but what if I have to split a table and reallocate current records into two new tables?
I can't comment on the other ORMs, but I have used DevExpress XPO in a corporate treasury application since 2007. The schema changes a little with every release, and there have also been some big schema changes over the years. A somewhat extended version of the default XPO upgrade mechanism has comfortably catered for all the changes.
There is good basic information here about upgrading XPO applications.
DevExpress provides a DBUpdater tool to assist you with the task of upgrading production environments. You can extend this tool to cater for additional requirements. In my application, we have added some options for logging, preview with rollback, etc.
Each module has virtual UpdateDatabaseBeforeUpdateSchema() and UpdateDatabaseAfterUpdateSchema() methods. You can significantly control the upgrade process within these.
As you mention, some of the upgrade will be handled automatically by XPO (e.g., adding a new column), but some things need additional control such as initialising the new column with a default value for existing records.
For instance, let's say MyNewField has been added to the MyEntity XPO class in version 2.0 of your application. Let's say it should default to a value of 3 for existing records. XPO will handle the creation of the new column, but existing records will be NULL. (If you specify a default value in the XPO class, it would only pertain to new records.) In order to correct the value for existing records, you would add something like the following to the entity module's overridden UpdateDatabaseAfterUpdateSchema():
public override void UpdateDatabaseAfterUpdateSchema()
{
    base.UpdateDatabaseAfterUpdateSchema();
    if (CurrentDBVersion < new Version(2, 0, 0, 0))
    {
        ObjectSpace.GetSession().ExecuteNonQuery(
            "UPDATE [MyEntity] SET [MyNewField] = 3 WHERE [MyNewField] IS NULL");
    }
}
(You could also use ObjectSpace.GetObjects<MyEntity>() and a foreach if you prefer to avoid the direct SQL.)
In your more extreme example of splitting a table in two, you can use the same method, but you would override UpdateDatabaseBeforeUpdateSchema() instead, run the SQL to split the table, let XPO perform any other schema updates and, if necessary, populate any default values in the UpdateDatabaseAfterUpdateSchema().
You will find that you bump into constraint problems, e.g. foreign key violations, so you might find you need to write some general routines such as DropAllForeignKeyConstraints() as part of UpdateDatabaseBeforeUpdateSchema(). Sometimes you find that XPO already provides something, sometimes not. Missing constraints and indexes will get regenerated in the schema update. (In my experience, switching a master data table's primary key turned out to be the hardest update routine to get right.)
By default the calls all happen in an SQL transaction so if anything fails it should all roll back.
The developers need to be aware of when a change to the domain model is likely to cause a problem with the underlying schema.
For testing, we keep a few old customer databases and run a bunch of before-and-after tests as part of the build process to make sure that existing customers are able to upgrade properly whatever version they are upgrading from. In production whenever we run into a problem upgrading, the problem data is added into this test library to prevent similar problems in the future.
We are dealing with major international companies and banks. The customers are quite happy with the result. In situations where a corporate DBA needs to sign off on the changes, they don't seem to mind having a command-line tool to do the upgrade rather than a script.
Most migration solutions can handle easy tasks, like adding a new column or relationship, or removing one, but fail when you rename a column (is that an add? Or a remove followed by an add, which equals a rename? What should you do with the data in that case?).
All three solutions have basic migration support; XPO even lets you run your own scripts as part of the process (to insert static/test/constant data, etc.).
There's also the MigratorDotNet project that you can use so as not to rely on any ORM-specific migration feature.
Personally, I would use automatic migration only in dev/test environments and would keep a full set of upgrade scripts when running on client-specific databases, to upgrade from, say, v1 to v2 (see the sketch below).
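To illustrate that script-driven approach, here is a minimal sketch of a sequential upgrade runner in plain JDBC; the schema_version table and the script naming convention are assumptions, not part of any of the ORMs discussed:

import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class UpgradeRunner {

    // Runs upgrade-v2.sql, upgrade-v3.sql, ... in order, starting
    // just above the version recorded in the schema_version table.
    public static void upgrade(Connection con, Path scriptDir, int targetVersion) throws Exception {
        con.setAutoCommit(false);
        for (int v = currentVersion(con) + 1; v <= targetVersion; v++) {
            String sql = Files.readString(scriptDir.resolve("upgrade-v" + v + ".sql"));
            try (Statement st = con.createStatement()) {
                st.execute(sql); // assumes one statement per script, for simplicity
                st.executeUpdate("UPDATE schema_version SET version = " + v);
            }
            con.commit(); // each version step commits atomically
        }
    }

    private static int currentVersion(Connection con) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT version FROM schema_version")) {
            rs.next();
            return rs.getInt(1);
        }
    }
}

Because the version bump commits together with the script that produced it, a failed upgrade leaves the database at a well-defined version from which the run can be retried.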
How is it going to work for ORM? Should I still write SQL scripts to upgrade the DB?
A clear answer to this question can be found in the Programmers Stack Exchange thread What are the criteria for evaluating an ORM for .NET?, where I got a simple answer to the question you asked; it matches my experience with ORMs from developing projects with Entity Framework and CodeSmith ORM templates.
How does the ORM manage changes in the data model? What if I have to split a table and reallocate current records into two new tables?
Some can update the DB automatically within a certain measure, others don't do anything and you'll have to do the dirty work yourself; others provide a framework for handling change that lets you control database updates. That means every couple of days someone needs to spend an hour updating the model to add a table or change datatypes that are changing.
Ref:
https://softwareengineering.stackexchange.com/questions/6543/what-are-the-benefits-of-using-database-abstraction-by-orm
https://softwareengineering.stackexchange.com/questions/41739/best-arguments-for-against-introducing-orm-technology-into-a-companies-dev-proce/41833#41833
If you ask what the best practice is for upgrading the DB using an ORM, my answer is: don't use it if your application is more than a hobbyist app.
There are a lot of scenarios where many ORMs are unable to support your specific database needs, e.g. creating stored procedures, indices and views, or even indexed views/materialized tables, without writing SQL scripts. Problems like adding a new non-nullable column to an existing table are much harder to solve in ORM migration code than by writing SQL scripts.
Current tools like Visual Studio Data Tools handle these kinds of problems much better.