I'm sure this is a dumb question, but I'm wondering if somebody could please inform me of best practice here. I'm using Patroni and pgBackRest for cluster management and backup, and in case somebody screws something up I want them to be able to restore to a point in time. Then, if they don't like that restoration, they should be able to re-restore to another point in time, e.g. 5 minutes in the future or the past.
The problem is, the first restore happens and then the database promotes and starts a new timeline. All of a sudden, the user has to figure out the old timeline ID and enter it to keep moving around on that timeline. Is there a better way? E.g. could the database stay in read-only mode after restoration until they get to the point they want, and then they certify and promote the database manually?
Thanks!
I am not sure if I understand the question correctly, but the second restore can specify
recovery_target_timeline = 'current'
to remain on the timeline that was active when recovery started.
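For the stay-read-only-until-someone-signs-off part, a paused recovery target can help. A minimal sketch, assuming pgBackRest with a stanza named main, PostgreSQL 12 or later, and an example timestamp; note that Patroni normally manages recovery itself, so you may have to pause it (e.g. patronictl pause) while doing this by hand:

    # Restore to a point in time and pause there instead of promoting
    pgbackrest --stanza=main restore --type=time \
        "--target=2023-01-15 10:30:00" --target-action=pause

    # For a second attempt on the same timeline, set in postgresql.conf before starting:
    #   recovery_target_timeline = 'current'

    # When the data looks right, end recovery and promote:
    psql -c "SELECT pg_wal_replay_resume();"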
I've got a database on Google Cloud SQL that is used by our application running on Kubernetes in GKE.
The MySQL instance is running 5.6, and I need to update it to 5.7, so I tried using the new migration jobs.
I've set up the connection profile and all the required permissions for the source DB, then followed the instructions to set up a continuous migration.
The job says it's running, migrating the ~450GB database. After about a day, it's still running, the storage used seems to have stopped growing, and the replication delay is at 0. The source database is not currently in use (that's why I'm using it to try this out before doing the same with a more important DB).
According to this, if the dump phase is done, I should be able to promote the instance, but the promote button remains greyed out, and there's no way to check the running state (it only says "running", and I don't see any way to check if it's dumping, on CDC, or anything else).
The documentation seems a bit lacking, and I couldn't find anything by googling around. Has anyone been using this?
In short, my questions are:
Why can't I promote the instance?
And how can I check what phase the migration is in?
Here's a screencap of my job:
link because SO doesn't let me embed images yet
Thanks.
P.S.: the tag that the documentation says should be used on Stack Overflow is google-cloud-database-migration-service, which is too long and Stack Overflow doesn't allow it, so I used google-cloud-sql instead :/
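For what it's worth, the gcloud CLI might expose more than the console does. Something like the following should dump the job's state and phase fields, assuming the job name and region are adjusted to your setup (I haven't confirmed how reliably phase is populated for MySQL jobs):

    # Describe the migration job; the output should include state and phase fields
    gcloud database-migration migration-jobs describe my-migration-job \
        --region=us-central1 --format="yaml(state, phase)"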
I am seeing an issue like this, but possibly more frustrating. After a week for a 2TB database, storage resets to near-zero and the full dump restarts, without any errors or indication of what happened.
I have MongoDB running on three EC2 instances and I have created a replica set. Last time I had a problem with a space constraint that stopped my mongod process, thereby halting the application, and then in an incident a couple of days back some of my collections were gone from the database, so I set up logging on my database just to catch it if anything like that happens again. In a fresh incident this morning I was unable to log in to my system, and that's when I found out that the whole database was empty. I checked other SO questions like this one, which suggest setting up a TTL, which I haven't done at all.
Now how do I debug this situation and do a proper root cause analysis? I can't find anything in my debug logs either. The collections just vanished. How do I set up a proper logging mechanism, and how do I ensure that my collections are never deleted like this again?
Today I got an email from Amazon saying that I was probably running an unsecured version of MongoDB and that this may have caused the issue. So whoever is facing this issue, please go through the Security Checklist provided by MongoDB. There are some points in there that are absolutely necessary.
1. Enable Access Control and Enforce Authentication
2. Encrypt Communication
3. Limit Network Exposure
These three are the core steps; depending on how many people access your database, you can also configure Role-Based Access Control.
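As a rough illustration of the first and third points, this is roughly what starting mongod with authentication, restricted bind addresses, and TLS looks like on MongoDB 4.2+ (the addresses, key file, and certificate path below are just examples, and older versions use the --ssl* equivalents):

    # Require authentication, listen only on trusted addresses, and require TLS
    mongod --auth \
           --keyFile /etc/mongodb/replica.key \
           --bind_ip 127.0.0.1,10.0.0.12 \
           --tlsMode requireTLS --tlsCertificateKeyFile /etc/ssl/mongodb.pem \
           --replSet rs0 --dbpath /data/db \
           --logpath /var/log/mongodb/mongod.log --fork

    # The same settings can (and usually should) go in /etc/mongod.conf instead.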
These are all things I have done. Before this incident I had not taken security that seriously, but after being hit by it I made sure I had all the necessary precautions in place.
Hope this helps someone.
Our Main branch was apparently just deleted and there's no record of why. (The branch still appears in Source Control Explorer - When I view the history of the branch it's empty). When I get latest on the branch it deletes everything locally. We have numerous children branches that all appear to be fine, but Main is now empty with no record of how/why. Anybody have any idea how we can figure out what happened and recover it? We have a child branch that should be a duplicate so we should be OK, but we'd really like to figure out what happened!
What may have happened
There are a few things I can think of; the most logical in this case is that someone issued a tf destroy $/project/Branch/* /recursive, which would have the observed effect.
It could also be that someone has renamed the branch; that would not be visible in the history per se, unless you turn on the "Show Deleted Items" option in the Team Foundation Source Control options (Tools > Options > Source Control in Visual Studio).
Your Application Tier's version control cache may have become corrupted; the chance of this happening is very slim, but it may have caused this. Ensure you have a good backup of your databases even if this seems to be the case; if it isn't the cache, you're going to need the database backup, and the older it is, the less likely it is that data marked for deletion will still be there.
How can you find out what happened?
Check the tbl_command in the Project Collection Database or access the hidden _oi activity log page on the web access server. You may be able to find the command that caused the deletion.
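As a rough sketch of that lookup (the server and collection database names are examples, and the exact columns of tbl_Command differ between TFS versions, so treat this as a starting point only):

    REM Look for destroy commands in the collection database's command log
    REM (filter on whatever command names look suspicious in your version)
    sqlcmd -S TFSSQLSERVER -d Tfs_DefaultCollection -Q "SELECT TOP 100 StartTime, IdentityName, Command FROM tbl_Command WHERE Command = 'Destroy' ORDER BY StartTime DESC"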
If that doesn't tell you, analyze the transaction logs of the SQL Server (if your server is configured to keep these).
What to do now?!
Make a backup of your TFS server, or secure the backups you already have if you haven't done so.
If the version control cache is the culprit, clearing it (on the Application Tier machines) may solve your problem; the cache location is shown in the TFS Administration Console.
The best way to go about this is to stop the TFS services temporarily and then delete the contents of that folder.
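A sketch of that sequence (the install path and cache folder below are TFS 2013 defaults and may differ on your server; take the real cache path from the Administration Console):

    REM Quiesce TFS, clear the version control cache, then bring it back online
    "C:\Program Files\Microsoft Team Foundation Server 12.0\Tools\TFSServiceControl.exe" quiesce
    del /s /q "C:\Program Files\Microsoft Team Foundation Server 12.0\Application Tier\Web Services\_tfs_data\*"
    "C:\Program Files\Microsoft Team Foundation Server 12.0\Tools\TFSServiceControl.exe" unquiesce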
There seem to be a few ways out:
Forget about it, take the contents of the most up-to-date branch and use that to repopulate the missing data. Just add them to the empty folder, check them in and then re-merge all other branches and resolve all conflicts.
Pro: Fast
Con: you lose history, and resolving conflicts will be a horrible task.
Restore the project collection database to a previous point in time (warning: this may require restoring all project collections to a previous point in time).
Pro: You get all your history back
Con: You lose the changes made since the last known good backup, it takes a lot of work, and it will impact all projects in the same collection, possibly all projects on the same server.
Restore the whole server to a temporary server and restore the collection with the missing data to the last known good configuration. Use a tool like OpsHub or Team Foundation Migration Toolkit to replay the changes since the disaster.
Pro: You get back to the most up to date point in time
Con: Takes a lot of time and expertise in TFS Migration
Restore the collection database and use the transaction logs to replay as many of the changes to the collection as possible, then skip the transactions that perform the destroy (a sketch of the restore-and-roll-forward part follows this list). Be careful though: usually the destroy action only marks files as deleted, and a background job does the actual deletion.
Pro: You get back to the most up to date point in time
Con: Takes a lot of time and expertise in SQL
Contact Microsoft Support and get a field expert in the house. They may be able to recover from the deletion if it was done without immediately triggering the cleanup job.
Pro: You will get back into the best state possible
Con: it will be costly
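The restore-and-roll-forward part of options 2 and 4 is a standard point-in-time restore of the collection database; a minimal sketch with example names, paths, and timestamp (actually skipping individual transactions goes beyond plain RESTORE and needs specialist log tooling):

    REM Restore the collection DB and roll the logs forward to just before the destroy
    sqlcmd -S TFSSQLSERVER -Q "RESTORE DATABASE Tfs_DefaultCollection FROM DISK = 'D:\Backups\Tfs_DefaultCollection_full.bak' WITH NORECOVERY, REPLACE"
    sqlcmd -S TFSSQLSERVER -Q "RESTORE LOG Tfs_DefaultCollection FROM DISK = 'D:\Backups\Tfs_DefaultCollection_log.trn' WITH STOPAT = '2013-06-01 09:45:00', RECOVERY"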
Whatever you do, make sure you have a backup of your current situation, that allows you to try different tactics, should your first attempts fail.
Consider splitting the project collection to allow other projects to continue working. You will end up in a situation where this one project sits in an isolated project collection of its own, but it will allow you to move forward quickly.
OK - this is one for the record books, because inexplicably the project reappeared later in the day. All of its history is back as well. I would have thought that perhaps the DBAs here did a database restore, but that's not possible since all of the check-ins that have been happening all day are still there.
So if this happens to you in the future, just cross your fingers and wait a few hours!
p.s. I did look in the SQL logs but couldn't find anything. Bizarre!
We currently have one publisher and four subscribers using merge replication. Due to a change in the schema, somebody performed a “Reinitialize All subscriptions” action without checking the “Upload the changes at the subscriber before reinitializing” option. When the replication agent for the first server was started, the database was cleaned out (all tables dropped and recreated) and all of the changes since the last successful synchronization were lost. At this point we decided to disable the replication schedule completely. My question is: is there a way to undo the “Reinitialize All subscriptions” action? Preferably in such a way that the changes at the subscribers aren’t lost.
Thanks in advance,
David
We were able to restore a backup of the publisher database taken prior to the reinitialize action. (This was done after creating a separate backup of the current publisher database.) After this we manually re-applied the changes that had been made since the reinitialize action, copying them from the reinitialized database into the restored backup (we used Redgate SQL Data Compare). At that point we were able to start the replication process and everything worked as it should. So apparently the snapshot information is completely stored inside the database to which it applies.
A special thanks to Hilary Cotter for pointing this out.
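For reference, the backup-and-restore step boils down to something like this in plain T-SQL (database and file names here are examples; depending on your topology you may also need the KEEP_REPLICATION option on the restore):

    REM Keep a safety copy of the current publisher, then restore the pre-reinitialize backup
    sqlcmd -S PUBLISHER -Q "BACKUP DATABASE PubDB TO DISK = 'D:\Backups\PubDB_current.bak'"
    sqlcmd -S PUBLISHER -Q "RESTORE DATABASE PubDB FROM DISK = 'D:\Backups\PubDB_before_reinit.bak' WITH REPLACE"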
I know this is an often-asked question on these boards, and usually the question has been about how to manage the changes being made to the database before you even get around to deploying them. Mostly the answer has been to script the database and save it under source control, and then any additional updates are saved as scripts under version control too (e.g. Tool to upgrade SQL Express database after deployment).
My question is: when is it best to apply the database updates, in the installer or when the new version first runs and connects to the database? Note this is a WinApp that is deployed to customers, each of whom has their own database.
One thing to add to the script: Back up the database (or at least the tables you're changing!) before applying the changes.
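For example, a minimal sketch assuming SQL Server and hypothetical names (including the %DBSERVER% variable); the SELECT INTO copy is just a cheap way to snapshot a single table before touching it:

    REM Full backup before the upgrade, plus a quick copy of one table about to change
    sqlcmd -S %DBSERVER% -d AppDb -Q "BACKUP DATABASE AppDb TO DISK = 'C:\Backups\AppDb_pre_upgrade.bak'"
    sqlcmd -S %DBSERVER% -d AppDb -Q "SELECT * INTO dbo.Customers_backup_v41 FROM dbo.Customers"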
As a user I think I'd prefer it happens during the install, and going a little further that the installer can roll itself back in the event of a failure. My thinking here is that if I am installing an update, I'd like to know when the update is done that it actually is done and has succeeded. I don't want a message coming up the next time I run it informing me that something failed and I've potentially lost all my data. I would assume that a system admin would probably also appreciate install time feedback (of course, that doesn't matter if your web app isn't something that will be installed on a network). Also, as ראובן said, backing up the database would be a nice convenience.
You haven't said much about the architecture of the application, but since an installer is involved I assume it's a client/server application.
If you have a server installer, that's where you want to put it, since the database structure is only going to change once. Since the client installers are going to need to know about the change, it would be nice to have a way to detect the database version change, and for the old client to be able to download the client update from the server automatically and apply it.
If you only have a client installer, I still think it's better to put it there (maybe as a custom action that fires off the executable for updating the database). But it really isn't going to matter, because conceptually one installer or first-time user of the new version is going to have to fire off the changes to the database anyway. The database changes are going to put structural locks on the database so, in practical terms, everyone is going to have to be kicked off the system at that time for the database update to be applied.
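As a sketch of what such a custom action could fire off, with every name, version number, and script file below being hypothetical: check the schema version stored in the database, and only run the matching upgrade script if it is behind.

    REM Hypothetical upgrade step launched by the installer's custom action (batch file)
    for /f %%v in ('sqlcmd -S %DBSERVER% -d AppDb -h -1 -W -Q "SET NOCOUNT ON; SELECT MAX(Version) FROM dbo.SchemaVersion"') do set DBVERSION=%%v
    if %DBVERSION% LSS 42 (
        sqlcmd -S %DBSERVER% -d AppDb -i upgrade_41_to_42.sql
    )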
Of course, this is all BS if it's not client-server.