I have a big backup set with 50 databases. Migrating to a new server is common. Restoring manually through Management Studio is too tedious and slow. Using RESTORE in T-SQL with the FILE option is faster, but the position of databases in the list isn't always sequential, so looking up the numbers is, again, a drag. I need a script to speed things up.
I need to get the names and positions of the databases in the backup set, put them in a cursor, then go through it and restore using those same names (and file positions). But how do I get the names and positions?
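Something along these lines could be a starting point; it is only a sketch, not a tested script. The path N'D:\Backups\AllDatabases.bak' is made up, and it assumes the script runs on the instance where the backups were taken, so msdb still holds the names and positions. If only the .bak file itself is available, RESTORE HEADERONLY returns the same DatabaseName and Position columns, but capturing its output needs a temp table that matches your version's column list.

    -- Sketch only: @BackupFile is a made-up path; tighten the WHERE clause to pick
    -- the backup sets you actually want (e.g. the most recent full per database).
    DECLARE @BackupFile nvarchar(260) = N'D:\Backups\AllDatabases.bak';
    DECLARE @DbName sysname, @Position int;

    DECLARE restore_cur CURSOR LOCAL FAST_FORWARD FOR
        SELECT bs.database_name, bs.position
        FROM msdb.dbo.backupset AS bs
        JOIN msdb.dbo.backupmediafamily AS bmf
          ON bmf.media_set_id = bs.media_set_id
        WHERE bs.type = 'D'                          -- full database backups only
          AND bmf.physical_device_name = @BackupFile
        ORDER BY bs.position;

    OPEN restore_cur;
    FETCH NEXT FROM restore_cur INTO @DbName, @Position;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        RESTORE DATABASE @DbName
            FROM DISK = @BackupFile
            WITH FILE = @Position,
                 REPLACE,      -- overwrite if the database already exists
                 RECOVERY;
        FETCH NEXT FROM restore_cur INTO @DbName, @Position;
    END

    CLOSE restore_cur;
    DEALLOCATE restore_cur;

Add MOVE clauses (logical names from RESTORE FILELISTONLY) inside the WITH list if the data and log file layout on the new server differs from the old one.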
I am using RESTORE DB to import a backup into a test environment database, and that works fine, but I need to extend this import process to several backups from several dates into a unique test environment db... What is the command to append backups into a unique database...
thanks
Phil
You write "I get only full offline database backup files from several time date (2 weeks of production) that I need to restore in a test database for analysis... example 4 backups files of 2 weeks of data = 2 months of data ...."
and you also write "What is the command to append backups into a unique database..."
While there is no explicit method for combining full-offline-backup images in Db2-LUW, there's always another way to get what you need... given the right skills and tools.
If you have a full backup image, it can either be restored into a new database or fully overwrite an existing database. If you have 4 full backup images, each can be restored into its own (uniquely named) database, or they can overwrite 4 existing databases.
You can also restore specific tablespaces from a backup image, if properly configured. Some sites design discrete tablespaces for specific time periods (one per day/week/month) to help with such activities. Some sites design their tables to be range partitioned, with each time period having its own partition (and sometimes dedicated tablespaces as well), which makes subsequent merging of content easier, given the right skills.
If you are competent with scripting, you can restore the first (earliest) image, export the relevant table contents to flat-files, restore the next backup image and export the relevant tables to new flat-files (repeat as needed), then load these flat-files into a table for analysis. If your database size is small then this can be considered a keep-it-simple approach.
You can also do clever things with federation if you restore to discrete databases.
Separately purchasable tools exist that let you extract selected content from a backup image (which can then be loaded into a Db2 database) without needing to do a restore action. These are not included with the Db2 product, so you could extract specific table contents from a backup image if you pay for the right tools and learn how to use them. Speak with your IBM salesperson. Such tools may, however, require currently supported versions of Db2.
We are using pgbackrest to back up our database to Amazon S3. We do full backups once a week and an incremental backup every other day.
The size of our database is around 1 TB; a full backup is around 600 GB, and even an incremental backup is around 400 GB!
We found out that even read access (pure SELECT statements) on the database causes the underlying data files (in /usr/local/pgsql/data/base/xxxxxx) to change. This results in large incremental backups and also in very large storage costs on Amazon S3.
Usually the files with low index names (e.g. 391089.1) change on read access.
On an update, we see changes in one or more files - the index could correlate to the age of the row in the table.
Some more facts:
Postgres version 13.1
Database is running in docker container (docker version 20.10.0)
OS is CentOS 7
We see the phenomenon on multiple servers.
Can someone explain why PostgreSQL changes data files on pure read access?
We tested on a dedicated database with nothing else accessing it.
This is normal. Some cases I can think of right away are:
a SELECT or other SQL statement setting a hint bit
This is a shortcut for subsequent statements that access the data, so they don't have to consult the commit log any more (a quick demonstration follows this list).
a SELECT ... FOR UPDATE writing a row lock
autovacuum removing dead row versions
These are leftovers from DELETE or UPDATE.
autovacuum freezing old visible row versions
This is necessary to prevent data corruption if the transaction ID counter wraps around.
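The hint-bit case in particular is easy to reproduce. A quick demonstration, assuming the contrib extension pg_buffercache is installed on the server and using a throwaway, made-up table name:

    -- Load a table, flush everything to disk, then watch a pure read dirty buffers again.
    CREATE EXTENSION IF NOT EXISTS pg_buffercache;

    CREATE TABLE hint_demo AS
    SELECT g AS id FROM generate_series(1, 100000) AS g;

    CHECKPOINT;                                          -- flush all dirty buffers (superuser)
    SELECT count(*) FROM pg_buffercache WHERE isdirty;   -- expect (close to) 0

    SELECT count(*) FROM hint_demo;                      -- a pure read ...

    SELECT count(*) FROM pg_buffercache WHERE isdirty;   -- ... yet many buffers are dirty again:
                                                         -- the scan set hint bits, and those pages
                                                         -- get written back under base/<oid>/
    DROP TABLE hint_demo;

Those rewritten pages are exactly what shows up as changed files in the next incremental backup.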
The only way to fairly reliably prevent PostgreSQL from modifying a table in the future is:
never perform an INSERT, UPDATE or DELETE on it
run VACUUM (FREEZE) on the table and make sure that there are no concurrent transactions
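A minimal sketch of that recipe, with a made-up table name:

    -- After the final INSERT/UPDATE/DELETE, and with no concurrent transactions open,
    -- freeze all row versions and set the hint bits in one pass.
    -- DISABLE_PAGE_SKIPPING forces vacuum to visit every page.
    VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) my_archive_table;

After that, the table's files should stay unchanged as long as nothing writes to the table, which also keeps them out of subsequent incremental backups.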
So, I have a PostgreSQL DB. For some chosen tables in that DB I want to maintain a plain dump of the rows when they are modified. Note this dump is not a recovery or backup dump. It is just a file which will have the incremental rows. That is, whenever a row is inserted or updated, I want it appended to this file, or to a file in a folder. The idea is to load that folder into, say, something like Hive periodically so that I can run queries to check previous states of certain rows and columns. Now, these are very high-transaction tables and the dump does not need to be real time. It can be in batches, every hour. I want to avoid a trigger firing hundreds of times every minute. I am looking for something off the shelf - already available in PostgreSQL. I did some research but everything is related to PostgreSQL backup - which is not the exact use case.
I have read some links like https://clarkdave.net/2015/02/historical-records-with-postgresql-and-temporal-tables-and-sql-2011/ and Implementing history of PostgreSQL table, etc., but these are based on insert/update triggers and create the history table in PostgreSQL itself. I want to avoid both. I cannot keep the history in PostgreSQL, as it will become huge soon, and I do not want to keep writing to files through a trigger firing constantly.
Trying to find an example or a starting point for a project where I have to restore databases into a test environment. I have a list of 40+ SQL instances, databases, and backup locations, and I'd like to use the cmdlet Restore-SqlDatabase but only allow 3 restores to occur at a time. To minimize the impact on our network/storage I don't want to initiate all 40+ restores at one time. The list of what needs to be restored is contained in a CSV, and when testing I can get the restores to go, but I'm not sure what options I'd have to run only 3 at a time.
I used the RunspaceFactory example and modified it to use a script-block to execute Restore-SqlDatabase. I'm sure there may be cleaner or simpler ways of doing this but so far it seems to work.
I've used the Data Compare tool to update schema between the same DBs on different servers, but what if so many things have changed (including data) that I simply want to REPLACE the target database?
In the past I've just used T-SQL: taken a backup, then restored onto the target with the REPLACE option, and/or MOVE if the data & log files are on different drives. I'd rather have an easier way to do this.
You can use Schema Compare (also by Red Gate) to compare the schema of your source database to a blank target database (and update), then use Data Compare to compare the data in them (and update). This should leave you with the target the same as the source. However, it may well be easier to use the backup/restore method in that instance.
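For completeness, the backup/restore route mentioned in the question might look roughly like this. The database names, paths, and logical file names below are made up; the logical names for the MOVE clauses can be checked with RESTORE FILELISTONLY.

    -- On the source server: a copy-only full backup, so the regular backup chain is untouched.
    BACKUP DATABASE SourceDb
        TO DISK = N'\\share\Backups\SourceDb.bak'
        WITH COPY_ONLY, INIT;

    -- On the target server: overwrite the target database, relocating the data and log
    -- files if the drive layout differs from the source.
    RESTORE DATABASE TargetDb
        FROM DISK = N'\\share\Backups\SourceDb.bak'
        WITH REPLACE,
             MOVE N'SourceDb'     TO N'D:\Data\TargetDb.mdf',
             MOVE N'SourceDb_log' TO N'L:\Logs\TargetDb_log.ldf',
             RECOVERY;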