Version control of databases


I am curious whether there are any solutions out there, preferably free, that provide a central database to which data can be published in a versioned manner.
For example,
Client 1 decides to edit a person's profile, so it gets a local copy on its machine to make changes to. When they are happy with their edit, they publish the results to the central database, just like a submit in Perforce.
Client 2 tries to edit the same profile, but when they go to submit they have to resolve conflicts.
The central database must store compressed differences between versions of the data.
At any point someone can look at all versions of the data submitted.

Check out OffScale DataGrove.
This product tracks changes to the entire DB, both schema and data. You can tag versions at any point in time and return to older states of the DB with a simple command. It also allows you to create virtual, separate copies of the same database so each team member can have their own separate DB. All the virtual copies are tracked in the same repository, so it's easy to revert your DB to someone else's version (you simply check out their version, just like you do with your source control). This means all your DBs can always be synchronized.
Disclaimer - I work at OffScale :-)

"Version control of databases" is a bit ambiguous for a title, because you are actually asking for a VCS using a database as repository "data store".
Subversion has such a model (either Berkeley DB or filesystem-based).
It also uses a Copy-Modify-Merge model, which is close to the submit-and-resolve-conflicts workflow you are describing (it is an alternative to locking rather than a locking mechanism).
(source: red-bean.com)
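As a rough illustration of copy-modify-merge, here is a minimal in-memory sketch. The `Repository` class, its method names, and the profile data are all hypothetical and only model the idea; this is not how Subversion is implemented. Each client checks out a private copy together with the revision it was based on, and a submit is rejected if someone else has submitted in the meantime.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Raised when a submit is based on an outdated revision."""


@dataclass
class Repository:
    # Every submitted revision of the record, oldest first (hypothetical store).
    history: list = field(default_factory=lambda: [{}])

    def checkout(self):
        """Return (revision number, private copy of the latest data)."""
        rev = len(self.history) - 1
        return rev, dict(self.history[rev])

    def submit(self, base_rev, data):
        """Accept the change only if nobody else submitted since base_rev."""
        head = len(self.history) - 1
        if base_rev != head:
            raise ConflictError(f"based on r{base_rev}, but head is r{head}")
        self.history.append(dict(data))
        return head + 1


repo = Repository()
rev1, copy1 = repo.checkout()        # client 1
rev2, copy2 = repo.checkout()        # client 2, same starting revision

copy1["name"] = "Alice"
new_rev = repo.submit(rev1, copy1)   # accepted, becomes the new head

copy2["name"] = "Alicia"
try:
    repo.submit(rev2, copy2)
    conflict = False
except ConflictError:
    conflict = True                  # client 2 must check out again and merge
```

Because `history` keeps every accepted revision, any earlier version stays readable. A real system would store differences rather than full copies.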

The SQL tools from Redgate offer some of this functionality, though not implemented in the way you describe. For example, SQL Data Compare can show the differences between the data in two databases, and SQL Source Control can be used as well.
However, getting a copy of the database onto a local machine, making changes and resubmitting them would be more of a manual process.

What database server are you using? If you are using MySQL and PHP, Doctrine has 'Versionable' behavior which can be applied to a model.
The documentation on this behavior is here:
http://www.doctrine-project.org/projects/orm/1.2/docs/manual/behaviors/en#core-behaviors:versionable
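To show the general idea behind a "versionable" model, here is a small sketch using Python's built-in sqlite3 module. It is not Doctrine's API or implementation; the `profile` and `profile_version` tables and the helper functions are made up for the example. Every save bumps a version number on the live row and also copies the row into a history table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile (
        id INTEGER PRIMARY KEY, name TEXT, version INTEGER NOT NULL);
    CREATE TABLE profile_version (
        id INTEGER, name TEXT, version INTEGER, PRIMARY KEY (id, version));
""")


def save_profile(conn, profile_id, name):
    """Write the current row and append a copy of it to the history table."""
    row = conn.execute(
        "SELECT version FROM profile WHERE id = ?", (profile_id,)
    ).fetchone()
    version = 1 if row is None else row[0] + 1
    conn.execute(
        "INSERT OR REPLACE INTO profile (id, name, version) VALUES (?, ?, ?)",
        (profile_id, name, version),
    )
    conn.execute(
        "INSERT INTO profile_version (id, name, version) VALUES (?, ?, ?)",
        (profile_id, name, version),
    )
    conn.commit()
    return version


def get_version(conn, profile_id, version):
    """Return the name as it was at the given version, or None."""
    row = conn.execute(
        "SELECT name FROM profile_version WHERE id = ? AND version = ?",
        (profile_id, version),
    ).fetchone()
    return None if row is None else row[0]


save_profile(conn, 1, "Alice")
save_profile(conn, 1, "Alice Smith")
```

After the two saves, `get_version(conn, 1, 1)` still returns the original name while the `profile` table holds the latest one.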

This is exactly what my product (yes, I'm biased :)) DBmaestro Teamwork does.
It enforces and keeps track of changes to structure and content.
It prevents two users from making parallel changes to an object's structure or content (as long as they work on the same object, meaning the same database, same schema, ...).
It uses a baseline-aware analysis which understands the nature of the change and knows whether the change should be promoted, should be ignored (as it was made from another environment), or is a conflict.
And much more…
I would encourage you to read a comprehensive, unbiased review of the Database Enforced Management Solution by veteran database expert Ben Taylor, which he posted on LinkedIn: https://www.linkedin.com/pulse/article/20140907002729-287832-solve-database-change-mangement-with-dbmaestro

Related

Any suggestions on the latest trend in version control for SQL Server 2014 and above?

For example, when a developer makes changes to any database element in a business-critical database, the tooling should force them to commit the code before applying the changes to the database itself. I came across Redgate SQL Source Control, which somewhat matches my expectations. Are there any other tools or effective database practices that I am missing here?

If you use SQL Source Control or a tool like it (eg, ReadyRoll or VS Database Projects) I'd recommend also using DLM Dashboard.
The reason for this is that no tool can enforce changes to go through a process if people are given (too many) rights and are able to apply changes to production. It's then up to these people to correctly follow the process.
Although DLM Dashboard doesn't enforce a process for changes to your database, it will alert you to changes made in production, warning you about out-of-process changes (aka "drift").
DLM Dashboard is free, which is another reason to use it!
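The idea of drift detection can be sketched in a few lines. This is only an illustration with SQLite and a hypothetical `schema_fingerprint` helper, not a description of how DLM Dashboard works: record a fingerprint of the schema when a change is deployed through the process, then compare it with the live schema later.

```python
import hashlib
import sqlite3


def schema_fingerprint(conn):
    """Hash every object definition stored in SQLite's schema catalog."""
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master "
        "WHERE sql IS NOT NULL ORDER BY type, name"
    ).fetchall()
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


# Stand-in for a production database (assumption: in-memory SQLite).
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
approved = schema_fingerprint(prod)   # recorded at deploy time

# Someone applies a change directly, outside the process.
prod.execute("ALTER TABLE customer ADD COLUMN email TEXT")
drifted = schema_fingerprint(prod) != approved
```

A scheduled job could run the comparison and raise an alert whenever `drifted` is true.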

How to deploy/versioning database with Cruise Control Net?

Hi, I have configured the basics of CruiseControl.NET to make releases and run automated NUnit tests using just MSBuild. Now I'm wondering whether it is possible to deploy/version databases with it.
I'm a beginner at CCNet, so if it is possible, I'd appreciate some suggestions or tutorials (if there are any). Also, if someone knows a free tool for database deployment/versioning, let me know; I will be grateful.
Thanks in advance
Hugh

It isn't free, but SQL Source Control from RedGate can do what you're looking for, assuming it's a SQL Server database. It has a command-line interface that you can use in CCNet tasks. The easy approach of just migrating up is... easy: the changes are applied to your database schema / data. There was an issue with v2.x of the tool that they've overcome with v3: if you renamed a table column, it would delete the column and create a new one with the right name. Obviously that's quite a big problem if you've got data you want to keep, so v3 adds the concept of migrations, which lets you specify alter scripts so that, instead of dropping the column, you can script the change non-destructively.
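The column-rename problem is easy to reproduce. The sketch below uses SQLite and a made-up `person` table purely for illustration; it does not show the tool's actual output, and on SQL Server a hand-written migration would more likely call `sp_rename`. The first function mimics what a naive schema diff does (the old column disappears, a new empty one appears), and the second carries the data across.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, surname TEXT)")
    conn.execute("INSERT INTO person (surname) VALUES ('Taylor')")
    return conn


def rename_destructively(conn):
    # Naive diff: 'surname' is gone and 'last_name' is new, so no data is copied.
    conn.executescript("""
        CREATE TABLE person_new (id INTEGER PRIMARY KEY, last_name TEXT);
        INSERT INTO person_new (id) SELECT id FROM person;
        DROP TABLE person;
        ALTER TABLE person_new RENAME TO person;
    """)


def rename_with_migration(conn):
    # Hand-written migration: map the old column onto the new one.
    conn.executescript("""
        CREATE TABLE person_new (id INTEGER PRIMARY KEY, last_name TEXT);
        INSERT INTO person_new (id, last_name) SELECT id, surname FROM person;
        DROP TABLE person;
        ALTER TABLE person_new RENAME TO person;
    """)


lost = make_db()
rename_destructively(lost)
lost_value = lost.execute("SELECT last_name FROM person").fetchone()[0]

kept = make_db()
rename_with_migration(kept)
kept_value = kept.execute("SELECT last_name FROM person").fetchone()[0]
```

Here `lost_value` ends up as `None` while `kept_value` still holds the original surname.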
As far as I know, at this time, they don't have anything that allows you to roll back your version.
Otherwise you could take a look at database migration tools; there seemed to be some promise for these in .NET at least. There is also this post that lists some other tools (again for .NET), and then there's this https://stackoverflow.com/search?q=database+migration+tool which is not restricted to any language but covers database migrations in general.

If you're still looking for ways to version and migrate databases, one such tool is dbdeploy.net. I've hosted it on GitHub after forking it and doing some work. The latest version is fully up to date and has some interesting features (done by someone who also uses it and sent a pull request).
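Migration tools of this kind generally apply numbered change scripts in order and record which ones have already run. The following is a minimal sketch of that idea in Python with SQLite, not dbdeploy.net's code; the `changelog` table layout, the `migrate` function and the scripts themselves are assumptions made for the example.

```python
import sqlite3

# Hypothetical numbered change scripts, as they might sit in source control.
CHANGE_SCRIPTS = {
    1: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE customer ADD COLUMN email TEXT",
    3: "CREATE INDEX idx_customer_email ON customer (email)",
}


def migrate(conn, scripts):
    """Apply every script not yet recorded in the changelog, in numeric order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog ("
        "change_number INTEGER PRIMARY KEY, "
        "applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT change_number FROM changelog")}
    newly_applied = []
    for number in sorted(scripts):
        if number in applied:
            continue
        conn.execute(scripts[number])
        conn.execute("INSERT INTO changelog (change_number) VALUES (?)", (number,))
        newly_applied.append(number)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn, {1: CHANGE_SCRIPTS[1]})   # database at version 1
second_run = migrate(conn, CHANGE_SCRIPTS)          # catches up with 2 and 3
third_run = migrate(conn, CHANGE_SCRIPTS)           # nothing left to do
```

A build server step (for instance a CCNet task that runs a script) could call something like `migrate` after each successful build, since running it twice is harmless.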

SDLC: Managing changes in a 'Closed System' (M1 - ERP)

I am working with a client who has an ERP system in place, called M1, that they are looking to make custom changes to.
I have spent a little bit of time investigating the ERP system in terms of making customizations. Here is a list of what I have found with regards to custom changes:
Custom changes cannot be exported/imported. There are options for this in the M1 Design Studio; however, they always appear to be disabled. I tried everything, and I couldn't find a mention of them in the help documentation.
You can export a customizations change log (CSV, XML, Excel, HTML) that provides type, name, location and description. In essence, it is a read-only document that provides a list of changes you made. You cannot modify the contents of this log.
Custom form changes go into effect for all data sources (Test, Stage, LIVE). In other words, there does not appear to be a way to limit the scope of a form change.
Custom field changes must be made in each data source (Test, Stage, LIVE). What's odd here is that if I add a field in Test, adjust a grid to display it, and subsequently change to LIVE, it detects that the field doesn't exist and negates the grid changes.
I'm unable to find documentation indicating that this application supports version control.
sigh
....
So...
How do I manage changes from an SDLC: ALM methodology and tools standpoint?
I could start by bringing in a change request system to manage pending and completed customizations. But then what? How should changes be managed and released? Put backups of the application under source control and deploy when needed?
There might not be a good answer to this question since I'm unable to take advantage of version control and create a separation of environments, but I figured I'd ask in case anybody has had similar experience or worked with M1.

I take it from the lack of answers in two months that your question is unanswerable. SDLC is something you could write or read a textbook on, and we still would not know enough about your environment, other than that "SDLC" is probably a bullet point in the hiring qualifications at your shop.
I have no experience with M1, but I am assuming that you're going to have to ask your peers at work for their ideas, because it sounds like you're asking a vertically closed (your shop, your tools, your practices) question that has no exact technical answer.
As for best practices; I suggest you investigate best practices outside your M1 ERP silo and apply them as makes sense to you.

The company I work for also uses M1 ERP. We have similar issues regarding version control of the customisations. From what I can tell, all customisations are stored in the M1DD database. You could back up a copy of this database before any major development work as a basic revision control system.
I am familiar with the issue of all changes becoming immediately active in all datasets. This is particularly annoying when you are making changes to commonly used modules, as you don't know how live data will be affected during the development process. One technique I have found useful is to surround untested code with an If statement so it is only executed when I am logged in.
If App.UserID = "MYUSERNAME" Then
    'new code here
End If
I would be interested in hearing how you solved this problem.

do any source-control systems use a document database for storage?

One of those questions that's difficult to google.
We were running into issues the other day with the speed of our svn repository. The standard solution to this seems to be "more RAM! more CPU!" etc. Which got me wondering: are there any source-control systems that use a document/NoSQL database (MongoDB, CouchDB etc.) for storage? It seems like it might be a natural fit, but I'm no expert on source-control database theory. Perhaps there's a way to configure a more recent source-control system to use a document DB as storage?

None that I know of do, and they wouldn't want to. Given the difference in degrees of testing, it would likely hurt robustness (a really bad thing for a source code repository). It would probably also end up hurting performance, because of the inability to do delta storage.
Note that Subversion has two very different storage mechanisms, one backed by the embedded Berkeley DB, and the other backed by simple files. One or the other of these might be better suited to your usage.
Also, since you posed your question pretty broadly, I'll comment on Git and TFS.
Git uses very efficiently packed files in the filesystem to store the repository. Frequently, the entire history is smaller than a checkout. For one very old project that my lab has, the entire history is 57MiB, and a working tree (not counting history) is 56MiB.
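To give a feel for why Git needs no external database, here is a small sketch of content-addressed storage modelled on Git's loose blob objects: the content gets a short header, the SHA-1 of header plus content becomes the key, and the payload is stored zlib-compressed. The `ObjectStore` class is hypothetical and keeps everything in a dict; real Git writes files under `.git/objects` and additionally delta-compresses objects into packfiles, which this sketch does not attempt.

```python
import hashlib
import zlib


class ObjectStore:
    """Toy content-addressed store keyed by the SHA-1 of header + content."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        # Same header layout Git uses for blobs: b"blob <size>\0".
        payload = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
        key = hashlib.sha1(payload).hexdigest()
        # Identical content always maps to the same key, so it is stored once.
        self.objects.setdefault(key, zlib.compress(payload))
        return key

    def get(self, key: str) -> bytes:
        payload = zlib.decompress(self.objects[key])
        _header, _sep, content = payload.partition(b"\0")
        return content


store = ObjectStore()
key_a = store.put(b"SELECT 1;\n")
key_b = store.put(b"SELECT 1;\n")   # same content committed again
round_trip = store.get(key_a)
```

Storing the same file twice costs nothing extra, which is part of why a full history can stay small.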
TFS stores a lot (possibly all) of its data in a SQL database.

Git uses memory-mapped files just like MongoDB :)
Though Git doesn't actually use MongoDB and I don't think it would want to. If you look at Git, it doesn't really need a NoSQL DB, it basically is a DB.

As far as I know, none of the VCSs use NoSQL/document-based databases. The idea of using CouchDB or similar is not new, but no one has implemented such a thing so far.

Database Versioning - How does branch switching work?

This is a question for those of you developing on a team of devs where all of you have separate databases. You're versioning your database using source control and other tools which will automatically bring dev databases up to date to the latest version of the database (schema, data, SP's, functions, etc.).
OK Great! But wait! What if you are developing on version 4.0 of your software, but now you need to switch branches to the 3.2 branch to fix a bug? The schema could be (almost assuredly is) very different by now...
I suppose if you went through the extra effort to write rollback scripts along with your change scripts, this could work. But that seems like a lot of work - is it really worth it?

Much easier would be to create a new 3.2-branch database and work with that while working on the 3.2-branch code. It doesn't seem reasonable to me to require that each developer has exactly one database to work with.

I'm going out on a limb and assuming that you are versioning the database as a binary? If all your database assets were in the form of constructive code (e.g. SQL scripts and/or text data dumps), the solution would be simple, as suggested by Mark: store these assets as part of the development branch. To work on version 3.2, switch the branch, re-run the create scripts and presto, a 3.2 database. Merging would be just as easy as with regular code (or just as painful, depending on your version control system of choice).
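A minimal sketch of "switch the branch, re-run the create scripts", using in-memory SQLite databases. The `BRANCH_SCRIPTS` mapping stands in for the script files that would be checked out on each branch; the table definitions are invented for the example.

```python
import sqlite3

# Hypothetical create scripts as they would exist on each branch.
BRANCH_SCRIPTS = {
    "3.2": ["CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT)"],
    "4.0": [
        "CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, email TEXT)",
        "CREATE TABLE audit_log (id INTEGER PRIMARY KEY, message TEXT)",
    ],
}


def build_database(scripts):
    """Create a fresh database by running the branch's scripts in order."""
    conn = sqlite3.connect(":memory:")
    for script in scripts:
        conn.executescript(script)
    return conn


def table_names(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


db_32 = build_database(BRANCH_SCRIPTS["3.2"])   # for the bug fix
db_40 = build_database(BRANCH_SCRIPTS["4.0"])   # for ongoing development
```

Each branch gets its own database built from its own scripts, so no rollback scripts are needed to move between them.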
Here are some suggestions to work in this mode:
If creating the database instances from text is too slow, make a cache on a shared disk volume, keyed by the contents of all the schema / data files (or the MD5 sum thereof).
Write a pre-commit hook to ensure that the schema and data dumps in the developer's instance are the same as the ones under version control. This prevents people from making changes to their dev database with an interactive tool, and then forgetting to commit them.
You mention change scripts; treat them as a liability. While they may be required by your deployment scenario (eg for customers who want to upgrade in-place), they duplicate information from the version history of the database, and per Murphy's law duplication means desynchronization sooner or later. Try to auto-generate the change scripts from the versioned database assets using "diff"; or if this cannot be achieved, dedicate some serious unit tests to database upgrades.
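The first two suggestions can share one helper: a digest over all schema and data files. The sketch below is only one possible shape. The function names and the `/shared/db-cache` path are made up, and it uses SHA-256 where the text mentions MD5 (either works as a cache key). File contents are passed in as a dict to keep the example self-contained; a real hook would read them from disk and from a fresh dump of the developer's database.

```python
import hashlib


def assets_digest(assets):
    """Digest of a {relative_path: file_contents} mapping, independent of order."""
    digest = hashlib.sha256()
    for path in sorted(assets):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(assets[path].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


def cache_path(assets, cache_root="/shared/db-cache"):
    """Location of a prebuilt database for exactly this set of assets."""
    return f"{cache_root}/{assets_digest(assets)}.db"


def commit_allowed(versioned_assets, dumped_from_dev_db):
    """Pre-commit check: the dev database must match what is being committed."""
    return assets_digest(versioned_assets) == assets_digest(dumped_from_dev_db)


versioned = {"schema/account.sql": "CREATE TABLE account (id INTEGER);"}
dev_dump = {"schema/account.sql": "CREATE TABLE account (id INTEGER, email TEXT);"}
allowed = commit_allowed(versioned, dev_dump)
```

In this example `allowed` is false because the developer added a column interactively and did not regenerate the versioned script.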
