I read these docs:
Description
pg_restore is a utility for restoring a PostgreSQL database from an
archive created by pg_dump in one of the non-plain-text formats. It
will issue the commands necessary to reconstruct the database to the
state it was in at the time it was saved. The archive files also allow
pg_restore to be selective about what is restored, or even to reorder
the items prior to being restored. The archive files are designed to
be portable across architectures.
pg_restore can operate in two modes. If a database name is specified,
pg_restore connects to that database and restores archive contents
directly into the database. Otherwise, a script containing the SQL
commands necessary to rebuild the database is created and written to a
file or standard output. This script output is equivalent to the plain
text output format of pg_dump. Some of the options controlling the
output are therefore analogous to pg_dump options.
Obviously, pg_restore cannot restore information that is not present
in the archive file. For instance, if the archive was made using the
"dump data as INSERT commands" option, pg_restore will not be able to
load the data using COPY statements.
but it's still unclear to me whether pg_restore just loads the database data or whether it also creates the structure of the database.
It depends on the options you pass and, obviously, on the information stored in the dump. If you keep reading through the documentation, you will see this option:
--data-only
Restore only the data, not the schema (data definitions). Table data, large objects, and sequence values are restored, if
present in the archive.
This option is similar to, but for historical reasons not identical to, specifying --section=data.
That option allows you to restore only the data, not the schema; without it, pg_restore also recreates the database structure stored in the archive.
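For example (a minimal sketch, assuming a custom-format archive mydb.dump and an existing target database mydb; both names are placeholders):
# Restore schema and data (the default behaviour):
pg_restore -d mydb mydb.dump
# Restore only the data into a database whose schema already exists:
pg_restore --data-only -d mydb mydb.dump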
Related
I'm used to working with SQL Server, and SQL Server Management Studio has the option to automatically generate a script to drop and recreate everything in a database (tables/views/procedures/etc.). I find that when developing a new application and writing a bunch of junk into a local database for basic testing, it's very helpful to have the option to just nuke the whole thing and recreate it as a clean slate, so I'm looking for similar functionality within postgres/pgAdmin.
PGAdmin has an option to generate a create script for a specific table but right clicking each table would be very tedious and I'm wondering if there's another way to do it.
To recreate a clean, schema-only database you can use the pg_dump client included with a Postgres server install. The options to use are:
-c
--clean
Output commands to clean (drop) database objects prior to outputting the commands for creating them. (Unless --if-exists is also specified, restore might generate some harmless error messages, if any objects were not present in the destination database.)
This option is ignored when emitting an archive (non-text) output file. For the archive formats, you can specify the option when you call pg_restore.
and:
-s
--schema-only
Dump only the object definitions (schema), not data.
This option is the inverse of --data-only. It is similar to, but for historical reasons not identical to, specifying --section=pre-data --section=post-data.
(Do not confuse this with the --schema option, which uses the word “schema” in a different meaning.)
To exclude table data for only a subset of tables in the database, see --exclude-table-data.
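Combining the two (a minimal sketch, assuming a database named mydb and a superuser named postgres, both placeholders; --if-exists suppresses the harmless DROP errors mentioned above):
# Dump DROP + CREATE statements for every object, no data:
pg_dump -U postgres --clean --if-exists --schema-only mydb > schema.sql
# Replay the script to wipe and recreate the schema:
psql -U postgres -d mydb -f schema.sql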
clean in Flyway
The database migration tool Flyway offers a clean command that drops all objects in the configured schemas.
To quote the documentation:
Clean is a great help in development and test. It will effectively give you a fresh start, by wiping your configured schemas completely clean. All objects (tables, views, procedures, …) will be dropped.
Needless to say: do not use against your production DB!
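On the command line that looks roughly like this (a sketch; the JDBC URL and credentials are placeholders):
flyway -url=jdbc:postgresql://localhost:5432/mydb -user=dev -password=dev clean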
A new Docker image was recently stood up to replace an existing Postgres database. A dump was taken of the database before the old instance was shut down, using the following command:
pg_dump -h localhost -p 5432 -d *dbname* -U postgres > *dbname*.pgdump
We'd like to concatenate or append this data to the new database in order to "backfill" some older historical data. The database name and schema of the two databases are identical. What is the easiest, safest way to do this? Secondly, does Postgres need to be shut down during the process?
If overlapping primary keys or unique columns have been assigned to the new data, then there will be no clean way to merge them without putting in some work to clean that up. Assuming that hasn't happened...
The current dump file will have CREATE statements for all the objects that already exist. If you replay that file into the current database, you will get a bunch of errors for all those objects. If you don't run it all in one transaction, you could simply ignore those errors. But you might also load data in the wrong order and get foreign key violations. Those errors will be mixed in with all the other ones about existing objects, so they might be easy to overlook.
So what I would do is stand up an empty database server and replay your current dump into that. Then retake the pg_dump, but with either -a or --section=data. You should then be able to load that dump into your new database. This has two advantages: it will not dump out CREATE statements that are not needed and would throw errors that need to be ignored, and it should dump the tables in an order that will not cause foreign key violations.
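Roughly (a sketch, assuming the scratch server listens on port 5433 and the dump from above is the plain-SQL file dbname.pgdump; all names are placeholders):
# Replay the existing dump into a scratch server
createdb -p 5433 -U postgres scratchdb
psql -p 5433 -U postgres -d scratchdb -f dbname.pgdump
# Re-dump the data only
pg_dump -p 5433 -U postgres -a scratchdb > data_only.sql
# Load it into the new database
psql -p 5432 -U postgres -d dbname -f data_only.sql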
I have a bit of a weird issue. We were trying to create a database baseline for our local environment that has very specific data pre-seeded into it. Our hopes were to make sure that everyone was operating with the same data, making collaboration and reviewing code a bit simpler.
My idea for this was to run a command to dump the database whenever we run a migration or decide a new account is necessary for local dev. The issue with this is that the database dump is around 17 MB, and I'm trying to avoid us having to add a 17 MB file to GitHub every time we update the database.
So the best solution I could think of was to set up a script to dump each individual table in the database. This way, if a single table is updated, we'd only be pushing that backup to GitHub, and it would be more like a ~200 KB file as opposed to 17 MB.
The main issue I'm running into with this is trying to restore the database. With a full dump, handling the foreign keys is relatively simple as it's all done in a single restore command. But with multiple restores, it gets a bit more complicated.
I'm looking for a way to restore all tables to a database, ignoring triggers and constraints, and then enabling them again once the data has been populated (or a way to export the tables in the order the foreign keys are defined). There are a lot of tables to work with, so doing this manually would be a bit of a task.
I'm also concerned about the relational integrity of the database if I disabled/re-enable constraints. Any help or advice would be appreciated.
Right now I'm running the following on every single table:
pg_dump postgres://user:password@pg:5432/database -t table_name -Fc -Z9 -f /data/www/database/data/table_name.bak
And then this command to restore all backups to the DB.
$data_command = "pg_restore --disable-triggers -d $dbUrl -Fc \"%s\"";
$backups = glob("$directory*.bak");
foreach ($backups as $data_file) {
    // glob() returns full paths, so compare against the file name only
    if (basename($data_file) != 'data_roles.bak') {
        exec(sprintf($data_command, $data_file));
    }
}
This obviously doesn't work, as I hit a ton of "relation does not exist" errors. I guess I'm just looking for a better way to accomplish this.
I would separate the table data and the database metadata.
Create pre-data and post-data scripts with:
pg_dump --section=pre-data -f pre.sql mydb
pg_dump --section=post-data -f post.sql mydb
Then dump just the data for each table:
pg_dump --section=data --table=tab1 -f tab1.sql mydb
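To cover every table without listing them by hand, a loop like this could work (a sketch, assuming all tables live in the public schema):
for t in $(psql -At -d mydb -c "select tablename from pg_tables where schemaname = 'public'"); do
    pg_dump --section=data --table="$t" -f "$t.sql" mydb
done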
To restore the database, first restore pre.sql, then all the table data, then post.sql.
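For example (a sketch, reusing the file names from above):
psql -d mydb -f pre.sql
for f in tab*.sql; do
    psql -d mydb -f "$f"
done
psql -d mydb -f post.sql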
The pre- and post-data will change often, but they are not large, so that shouldn't be a problem.
I'm using pgAdmin to restore PostgreSQL database. To restore the database I need to delete, drop and remake it. How to restore the database without deleting and remaking it?
This cannot be done in pgAdmin or with any other database tool. Regular backup files cannot be restored without deleting the data first, because they consist of normal COPY statements which will fail if you already have rows in the database (primary keys collide, etc.).
For a simple way to get back to an earlier snapshot in a testing environment take a look at PostgreSQL documentation - 24.2. File System Level Backup:
For backup:
Shut down your database
Copy all the files from your data directory
For restore:
Shut down your database
Replace your data directory with the backup directory
Note:
the size of the data might be significantly larger than with a regular backup, especially if you have a lot of indexes
this is a server wide backup so you can't do this on individual databases
don't attempt to use it on a different version of PostgreSQL
this really deletes the data too - by replacing it with the backup
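Concretely, the procedure might look like this (a sketch; the data directory path and the backup location are placeholders, and pg_ctl is assumed to be on the PATH):
# Backup: stop the server, copy the data directory, start it again
pg_ctl stop -D /var/lib/postgresql/data
cp -a /var/lib/postgresql/data /backups/data-snapshot
pg_ctl start -D /var/lib/postgresql/data

# Restore: stop the server, swap the data directory for the snapshot, start it again
pg_ctl stop -D /var/lib/postgresql/data
rm -rf /var/lib/postgresql/data
cp -a /backups/data-snapshot /var/lib/postgresql/data
pg_ctl start -D /var/lib/postgresql/data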
Also, with regular backups you don't have to do a DROP TABLE if you do a data-only restore (pg_restore --data-only, for example); you still have to delete the existing rows first, though.
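For instance (a sketch; the database, table names, and archive file are placeholders):
# Empty the tables, then reload just their data from the archive
psql -d mydb -c "TRUNCATE table1, table2 CASCADE"
pg_restore --data-only -d mydb mydb.dump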
I want to make a copy of a live Firebird .fdb database.
I understand that simply copying it could result in a corrupted database and I looked into using the gbak command because it is able to perform a backup while the database is running.
So that will give me a database backup but I would need to restore this before I can use it. My database is nearly 1GB and takes 10 minutes to restore which is too long. Is there any other method to simply copy a Firebird live database from one location to another safely?
So far I have used the following to backup (which works):
gbak -v -t -user SYSDBA -password "masterkey" 127.0.0.1:"C:/Files/Live/Database.fdb" "C:\Test\Test.fbk"
I also tried using the following to backup and restore at the same time:
gbak -c [options] <source database> stdout | gbak -r [options] stdin <target database>
but this kept giving the error:
Done with volume #1, "new.gbak"
Press return to reopen that file, or type a new
name followed by return to open a different file.
The risk of corruption is due to the way Firebird writes its file. Your copy might contain inconsistent data if it is made at the same time that Firebird (re)writes a data page. As far as I know, the only real risk of corruption is during writes of index pages (and then only for index page splits); otherwise it just leads to inconsistent data and dangling transactions (which wouldn't be visible anyway, as the transaction is not committed).
If you really don't want to use a backup, you can set the Firebird database in a backup state. This freezes the database and writes the changes to a delta file instead. You enable this state using ALTER DATABASE BEGIN BACKUP and end this state using ALTER DATABASE END BACKUP. See the documentation for ALTER DATABASE. This command was added in Firebird 2.0.
For the second part of your question (which really should have been posted as a separate question):
The command gbak -c doesn't create a backup; it creates a database from a backup (it is the sibling of -r (replace), but doesn't overwrite an existing database). See Restore Switches for more information.
See Create a Database Clone Without a Dump File for an example of how you can make a backup like this:
gbak -backup emptest stdout | gbak -replace stdin emptest_2
You can do
alter database begin backup;
then copy the file using your OS's standard file copy, and
alter database end backup;
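Put together, that could look like this (a sketch, reusing the connection details from the question; begin_backup.sql and end_backup.sql are hypothetical helper scripts containing the respective ALTER DATABASE statements):
isql -user SYSDBA -password masterkey -i begin_backup.sql 127.0.0.1:"C:/Files/Live/Database.fdb"
copy "C:\Files\Live\Database.fdb" "C:\Test\Database-copy.fdb"
isql -user SYSDBA -password masterkey -i end_backup.sql 127.0.0.1:"C:/Files/Live/Database.fdb"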
Also I strongly recommend reading these pages [1] [2] about nbackup.