I'm leading a small software development team (4 people) and have just broken ground on a source-controlled SQL Server 2008 database project, with isolated development databases for each developer. I'm still implementing this one step at a time, but I'm envisioning each developer having their own database, with a naming scheme something like <ProjectName>_DEVELOPMENT_<TFSUserName>. This was all recommended by the MSDN articles I've been reading, but let me know if that sounds way off.
Anyway, we have a shared application solution that we've been developing for some time. In the past, we had no database version control and just modified our database directly from SQL Server Management Studio when new reference data needed to be populated or when we were testing functionality -- one change immediately affected everyone else.

So with this new setup, I'm wondering what the best way would be to have each person connect to their isolated development database from the application solution. Prior to isolated databases, our connection to the database was specified in our application's web.config as a connection string. If we're each going to have our own database, the only way I can see it working is for each developer to set their connection string in their local solution to point to their personal database. But changing the web.config will check out that file in the solution, so developers will always have to specifically uncheck that file when checking in application changes to the baseline. Is there a less clunky way for each developer to use their isolated database when doing application testing?
I recommend that you not make the database names username-specific. Instead make the database the same name for each developer and always reference it via localhost (localhost\<ProjectName>_DEVELOPMENT). Then the same connection string will work for every developer.
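For example (assuming a default local instance, Windows authentication, and a hypothetical project called MyProject), every developer's web.config could carry the identical string Data Source=localhost;Initial Catalog=MyProject_DEVELOPMENT;Integrated Security=True (for a named local instance you would use Data Source=localhost\MyInstance instead). Because nothing in the string is developer-specific, web.config never needs to be edited just to point at a personal database, so it stops getting checked out by accident.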
MSDN's suggestion to use username-specific database names makes more sense for a shared development server; it isn't ideal when each developer runs a local instance.
I have this challenge. I am a DevOps engineer and a software engineer on a team where, some months back, the developers moved from having a central Oracle DB to having the DB on a CentOS VM on their individual laptops. The move away from a central DB was intended to reduce dependency on the DBAs and also to eliminate issues that stemmed from inconsistent data.
The plan for sharing and keeping the database synchronized across the team was that each person would share change scripts with everyone. The problem is that we use Skype for communication (we just set up Slack but have yet to start using it fully), and although people sometimes post the text of DB change scripts, those posts can be missed. The other problem is that some developers forget to post their changes at all. Further, new releases are deployed to Production without first being deployed to the Test and Demo environments.
This has posed a serious challenge for us, especially for me, as I recently became responsible for ensuring that our Demo deployments stay in sync with the Production deployments.
Most of the synchronization issues come down to the databases being out of sync because of missing change scripts or missing DB objects. Oracle is our database of choice.
A typical deployment to the Demo environment is a very painful process: we test the application, and as issues occur due to missing table columns, functions, or stored procedures, we have to hunt down the missing DB objects, apply them to the DB, and continue until all issues are resolved.
How can I solve this problem to ensure smooth, painless, and less time-consuming deployments? Can migrating our applications to Docker help with the DB synchronization issues and the developers' lack of discipline around change scripts? What process can we put in place to improve in this area?
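For concreteness, the kind of mechanism I have in mind (purely a sketch, nothing we have today) is a small runner that applies numbered change scripts exactly once and records what it has applied; the python-oracledb driver, the connection details, and the CHANGE_LOG table below are all hypothetical:

    # Sketch only: apply numbered change scripts (001_add_col.sql, 002_new_proc.sql, ...)
    # exactly once, recording each applied script in a CHANGE_LOG table.
    # Assumes the python-oracledb driver and a CHANGE_LOG(script_name VARCHAR2(200)) table.
    from pathlib import Path

    import oracledb

    SCRIPT_DIR = Path("db/changes")   # change scripts kept under version control

    def applied_scripts(cur):
        cur.execute("SELECT script_name FROM change_log")
        return {row[0] for row in cur.fetchall()}

    def main():
        conn = oracledb.connect(user="app", password="secret", dsn="localhost/XEPDB1")
        cur = conn.cursor()
        done = applied_scripts(cur)
        for script in sorted(SCRIPT_DIR.glob("*.sql")):
            if script.name in done:
                continue
            # Naive: assumes one statement per file; real scripts need a proper splitter.
            cur.execute(script.read_text().strip().rstrip(";"))
            cur.execute(
                "INSERT INTO change_log (script_name) VALUES (:name)",
                {"name": script.name},
            )
            conn.commit()
            print(f"applied {script.name}")
        conn.close()

    if __name__ == "__main__":
        main()

Run on every developer machine and in every environment, something like this would make a missed Skype message irrelevant, because the database itself would record which scripts it has seen.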
Thank you very much in advance for your help.
Have a look at http://www.dbmaestro.com
I strongly recommend joining the live demo session.
DBmaestro TeamWork can help you merge changes from multiple DBs into a single shared DB and safely move changes from one environment to another.
Danny
For example, when a developer makes changes to any database object in a business-critical database, the process should force them to commit the code before the changes are applied to the database itself. I came across Redgate SQL Source Control, which somewhat matches my expectation. Are there any other tools or effective database practices that I am missing here?
If you use SQL Source Control or a tool like it (e.g. ReadyRoll or VS Database Projects), I'd recommend also using DLM Dashboard.
The reason for this is that no tool can force changes to go through a process if people are given (too many) rights and are able to apply changes directly to production. It's then up to those people to follow the process correctly.
Although DLM Dashboard doesn't enforce changes to your database, it will alert you to changes made to production, warning you when out-of-process changes (aka "drift") occur.
DLM Dashboard is free, which is another reason to use it!
I've recently decided to embark on a fun / educational personal project to create some data visualizations and power metrics for my fantasy football league. Since ESPN doesn't provide an API, I've decided to use a combination of elbow grease and the nfldb to pull relevant data (and am hoping to get familiar with Plotly for presenting the data). In setting up nfldb, I'm also getting my first exposure to databases, using postgresql in particular (as required by nfldb).
Since the installation guide provided by nfldb is Linux-centric and assumes a fair bit of previous database experience, I've looked to this guide for help and blindly followed its instructions in hopes of sidestepping postgresql (aka the "just make it work" "solution"). Of course, that didn't work, and I have no idea how to diagnose the problem(s), so I've decided to go ahead and use this opportunity to get a little familiar with databases / postgresql.
I've looked to the postgresql documentation for guidance. Since I've never worked in a client/server environment, the following text (from "18.1. The PostgreSQL User Account") has me particularly confused:
As with any server daemon that is accessible to the outside world, it is advisable to run PostgreSQL under a separate user account. This user account should only own the data that is managed by the server, and should not be shared with other daemons. (For example, using the user nobody is a bad idea.) It is not advisable to install executables owned by this user because compromised systems could then modify their own binaries.

To add a Unix user account to your system, look for a command useradd or adduser. The user name postgres is often used, and is assumed throughout this book, but you can use another name if you like.
I'd really appreciate a well-annotated version of these paragraphs. How does it apply to someone like me, storing and accessing data on the same machine? Do I need to create a new system user account? How do I make sure it "only owns the data that is managed by the server"? Where is the appropriate place to install postgresql? Am I exposed to some sort of security risk by downloading the nfldb database? Why is the user nobody a bad idea?
Relevant: I am using a Mac (v10.11.6) and plan to install (or re-install, if necessary) postgresql using Homebrew.
This is a question for those of you developing on a team where every developer has a separate database. You're versioning your database using source control and other tools that automatically bring dev databases up to the latest version (schema, data, stored procedures, functions, etc.).
OK Great! But wait! What if you are developing on version 4.0 of your software, but now you need to switch branches to the 3.2 branch to fix a bug? The schema could be (almost assuredly is) very different by now...
I suppose if you went through the extra effort to write rollback scripts along with your change scripts, this could work. But that seems like a lot of work - is it really worth it?
Much easier would be to create a new 3.2-branch database and work with that while working on the 3.2-branch code. It doesn't seem reasonable to me to require that each developer has exactly one database to work with.
I'm going out on a limb and assuming that you are versioning the database as a binary? If all your database assets were in the form of constructive code (e.g. SQL scripts and/or text data dumps), the solution would be simple, as suggested by Mark: store these assets as part of the development branch. To work on version 3.2, switch the branch, re-run the create scripts and presto, 3.2 database. Merging would be just as easy as with regular code (or just as painful, depending on your version control system of choice).
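For illustration, here is a rough sketch of the "re-run the create scripts" step, assuming PostgreSQL with the dropdb/createdb/psql command-line tools on the PATH and the scripts versioned under a db/ directory (substitute your own engine, names, and paths):

    # Sketch: rebuild the per-developer database from the scripts in the current branch.
    # Assumptions: PostgreSQL, dropdb/createdb/psql on PATH, scripts under db/*.sql.
    import subprocess
    from pathlib import Path

    DB_NAME = "myapp_dev"        # hypothetical per-developer database name
    SCRIPT_DIR = Path("db")      # versioned alongside the application code

    def run(*cmd):
        subprocess.run(cmd, check=True)

    def rebuild():
        run("dropdb", "--if-exists", DB_NAME)
        run("createdb", DB_NAME)
        # Apply every versioned script in a deterministic (sorted) order.
        for script in sorted(SCRIPT_DIR.glob("*.sql")):
            run("psql", "-v", "ON_ERROR_STOP=1", "-d", DB_NAME, "-f", str(script))

    if __name__ == "__main__":
        rebuild()

Switch to the 3.2 branch, run it again, and you have a 3.2 database.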
Here are some suggestions to work in this mode:
If creating the database instances from text is too slow, make a cache on a shared disk volume, keyed by the contents of all the schema / data files (or the MD5 sum thereof); see the first sketch after these suggestions.
Write a pre-commit hook to ensure that the schema and data dumps in the developer's instance are the same as the ones under version control (the second sketch below shows such a hook). This prevents people from making changes to their dev database with an interactive tool and then forgetting to commit them.
You mention change scripts: treat them as a liability. While they may be required by your deployment scenario (e.g. for customers who want to upgrade in place), they duplicate information from the version history of the database, and per Murphy's law duplication means desynchronization sooner or later. Try to auto-generate the change scripts from the versioned database assets using "diff"; or, if this cannot be achieved, dedicate some serious unit tests to database upgrades.
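To illustrate the first suggestion, the cache key can simply be a hash over the names and contents of the versioned schema/data files; the db/ directory is again a hypothetical location:

    # Sketch: compute a cache key for a pre-built database image from the schema/data files.
    import hashlib
    from pathlib import Path

    def schema_cache_key(script_dir="db"):
        # Hash file names and contents so that any change to any script changes the key.
        digest = hashlib.md5()
        for path in sorted(Path(script_dir).glob("*.sql")):
            digest.update(path.name.encode())
            digest.update(path.read_bytes())
        return digest.hexdigest()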
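And here is a rough sketch of the second suggestion, the pre-commit hook, assuming PostgreSQL, a local database called myapp_dev, and the canonical schema dump kept at db/schema.sql (all names hypothetical):

    # Sketch of a pre-commit hook (e.g. .git/hooks/pre-commit) that rejects the commit
    # when the developer's local schema has drifted from the dump under version control.
    import subprocess
    import sys
    from pathlib import Path

    VERSIONED_DUMP = Path("db/schema.sql")

    def current_schema():
        # pg_dump --schema-only gives a text representation of the live schema.
        result = subprocess.run(
            ["pg_dump", "--schema-only", "--no-owner", "myapp_dev"],
            check=True, capture_output=True, text=True,
        )
        return result.stdout

    def main():
        live = current_schema()
        committed = VERSIONED_DUMP.read_text() if VERSIONED_DUMP.exists() else ""
        if live != committed:
            sys.stderr.write(
                "Local database schema differs from db/schema.sql.\n"
                "Re-dump the schema (or revert your ad-hoc changes) before committing.\n"
            )
            return 1
        return 0

    if __name__ == "__main__":
        sys.exit(main())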
I am curious whether there are any solutions out there, preferably free, that provide a central database you can publish data to in a versioned manner.
For example,
Client 1 decides to edit a person's profile, so it gets a local copy on its machine to make changes to. When they are happy with their edit, they publish the results to the central database, just like you would do a submit in Perforce.
Client 2 edits their own local copy of the same profile, but when they go to submit they have to resolve conflicts.
The central database must store compressed differences between versions of the data.
At any point someone can look at all versions of the data submitted.
Check out OffScale DataGrove.
This product tracks changes to the entire DB - schema and data. You can tag versions at any point in time and return to older states of the DB with a simple command. It also allows you to create virtual, separate copies of the same database so each team member can have their own separate DB. All the virtual copies are tracked in the same repository, so it's super easy to revert your DB to someone else's version (you simply check out their version, just like you do with your source control). This means all your DBs can always be synchronized.
Disclaimer - I work at OffScale :-)
"Version control of databases" is a bit ambiguous for a title, because you are actually asking for a VCS using a database as repository "data store".
Subversion has such a model (either Berkeley DB or filesystem-based).
It also has a Copy-Modify-Merge model, which is similar to the edit/publish/resolve-conflicts workflow you are describing.
(two diagrams from the Subversion book, source: red-bean.com)
The SQL tools from Redgate offer some of this functionality, but not implemented in the way you describe. For example, SQL Data Compare can compare the data in two databases, and SQL Source Control can be used as well.
However, getting a copy of the database on a local machine, making changes and resubmitting would be more of a manual process.
What database server are you using? If you are using MySQL and PHP, Doctrine has 'Versionable' behavior which can be applied to a model.
The documentation on this behavior is here:
http://www.doctrine-project.org/projects/orm/1.2/docs/manual/behaviors/en#core-behaviors:versionable
This is exactly what my product (yes, I'm biased :)) DBmaestro TeamWork does.
It enforces and keeps track of changes to both structure and content.
It prevents two people from making parallel changes to an object's structure or content (as long as they work on the same object - meaning the same database, same schema, ...)
It uses baseline-aware analysis, which understands the nature of the change and knows whether the change should be promoted, should be ignored (because it was made from another environment), or represents a conflict.
And much more…
I would encourage you to read a comprehensive, unbiased review of Database Enforced Management Solutions by veteran database expert Ben Taylor, which he posted on LinkedIn: https://www.linkedin.com/pulse/article/20140907002729-287832-solve-database-change-mangement-with-dbmaestro