I am trying to:
1. create a snapshot of a PostgreSQL database (using pg_dump),
2. do some random tests, and
3. restore to the exact same state as the snapshot, and do some other random tests.
These can happen over many/different days. Also, I am in a multi-user environment where I am not the DB admin. In particular, I cannot create a new DB.
However, when I restore db using
gunzip -c dump_file.gz | psql my_db
changes in step 2 above remain.
For example, if I make a copy of a table:
create table foo1 as (select * from foo);
and then restore, the copied table foo1 remains there.
Could someone explain how I can restore to the exact same state, as if step 2 never happened?
-- Update --
Following the comments from @a_horse_with_no_name, I tried to use
DROP OWNED BY my_db_user
to drop all my objects before the restore, but I got an error associated with an extension that I cannot control, and my tables remained intact:
ERROR: cannot drop sequence bg_gid_seq because extension postgis_tiger_geocoder requires it
HINT: You can drop extension postgis_tiger_geocoder instead.
Any suggestions?
You have to remove everything that's there, e.g. by dropping and recreating the database. pg_dump basically just produces an SQL script that, when applied, ensures all the tables, stored procs, etc. exist and have their data. It doesn't remove anything.
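One partial mitigation (a sketch, not a full fix): pg_dump's --clean option makes the generated script drop each object before recreating it. Note that it only drops objects that are present in the dump, so a table like foo1 that was created after the dump was taken would still survive:
pg_dump --clean --if-exists my_db | gzip > dump_file.gz
gunzip -c dump_file.gz | psql my_db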
You can use PostgreSQL Schemas.
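For example (a sketch, assuming you are allowed to create a schema and keep all your test objects in it; the name sandbox is illustrative):
CREATE SCHEMA sandbox;
SET search_path TO sandbox;  -- lasts for the current session only
-- ... run the tests, create tables, etc. ...
DROP SCHEMA sandbox CASCADE;  -- removes every object the tests created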
I have a database on AWS RDS, and I populate it by taking a pg_dump from a local version of the database and then running psql dbname < pg_dump_file with the proper arguments for the remote connection.
I'd like to know what is expected to happen if that RDS database already contains data. More specifically:
Data present in the local dump, but absent on RDS
Data present on RDS, but absent in the local dump
Data present in both, but that has been modified
My current understanding:
New data will be added and be present in both after upload
Data present only on RDS should be unaffected?
The version from the pg_dump will end up in the database (assuming the same PK, but different values in the other fields)?
Is that about correct? I've been reading this, but it's a little thin on how the restore is actually performed, so I'm having a hard time figuring it out. Thanks.
EDIT: following @wildplasser's comment, looking at the pg_dump file it appears that the following happens:
CREATE TABLE [....]
ALTER TABLE [setting table owner]
ALTER SEQUENCE [....]
For each table in the db. Then, again one table at a time:
COPY [tablename] (list of cols) FROM stdin;
[data to be copied]
Finally, more ALTER statements to set constraints, foreign keys, etc.
So I guess the ultimate answer is "it depends". One could, I suppose, remove the CREATE TABLE [...], ALTER TABLE, and ALTER SEQUENCE statements if those objects already exist as they should. I am not positive yet what happens if one tries CREATE TABLE with an existing table (an error is thrown, perhaps?).
Then I guess the COPY statements would overwrite whatever already exists. Or perhaps throw an error. I'll have to test that. I'll write up an answer once I figure it out.
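(A quick way to test the CREATE TABLE case, for reference; the table name t is just an illustration:
create table t (id int primary key);
create table t (id int primary key);  -- ERROR: relation "t" already exists
With psql's default settings, the script simply reports the error and carries on with the next statement.)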
So the answer is a bit dull. It turns out that even if one removes the initial statements before the COPY, if the table has a primary key (and thus a uniqueness constraint), it won't work:
ERROR: duplicate key value violates unique constraint
So one gets shut down pretty quickly there. One would, I guess, have to rewrite the dump as a list of UPDATE statements instead, but at that point one might as well write a script to do so. I'm not sure pg_dump is all that useful in that case.
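(One option worth knowing about, assuming PostgreSQL 12 or later and assuming that keeping the existing RDS rows on conflict is acceptable: pg_dump can emit INSERT statements that skip duplicate keys instead of COPY:
pg_dump --data-only --inserts --on-conflict-do-nothing dbname > data.sql
psql rds_dbname < data.sql
This adds new rows and leaves conflicting rows as they are; it does not update them.)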
I am new to PostgreSQL. I have a database with an employee table (id, name, address, phonenumber, salary). I would like to make a backup of the employee details if any one of phonenumber, address, or salary is changed.
Is there any way of doing it using pg_dump, or should I be satisfied with the trigger method that outputs the original tuples into another table, say backup, if any changes are made?
Please, could someone elaborate in a detailed manner how to get started with this using pg_dump?
pg_dump scripts out the current state of the database. That's all it does, with some fine-tuning to let you get at individual tables, schemas, etc. It does not watch for changes, it does not work at the row level (barring some zany row-level security setup), and it is not an audit log.
What you're describing -- backing up individual rows when they're modified -- is an audit log, so pg_dump is the wrong tool for the job. An update trigger which inserts the original row into an audit table is the canonical way to accomplish this, so you're on the right track there. If you need to generate scripts of the audit table, that's where pg_dump comes in.
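A minimal sketch of such a trigger, assuming the employee table from the question; the employee_backup table, function, and trigger names are illustrative (EXECUTE FUNCTION needs PostgreSQL 11+; on older versions spell it EXECUTE PROCEDURE):
-- empty copy of employee's columns, plus a change timestamp
create table employee_backup as
  select now() as changed_at, e.* from employee e where false;

create or replace function backup_employee_row() returns trigger as $$
begin
  -- OLD is the row as it looked before the UPDATE
  if old.phonenumber is distinct from new.phonenumber
     or old.address is distinct from new.address
     or old.salary is distinct from new.salary then
    insert into employee_backup select now(), old.*;
  end if;
  return new;
end;
$$ language plpgsql;

create trigger employee_audit
  before update on employee
  for each row execute function backup_employee_row();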
We have many PostgreSQL databases with the same structure, each using only the public schema.
How can I group all of them in a single database using separate schemas?
You can dump the database definition and data out, edit the output by setting the default schema to whatever you choose, and run the script back into the database.
Remember to make the dump in plain SQL format; pg_dump's custom format won't work for this kind of editing. The schema change only needs a change to a line like
SET search_path TO whateverschema
If you don't want to edit the dumps (maybe they're very large), you can of course also restore them one by one to the public schema, alter the tables into the desired schema and then repeat for the next one.
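The per-table moves can be generated instead of typed by hand; a sketch, assuming the target schema customer_x already exists and all the restored tables currently sit in public (this moves tables only, not views or other objects):
do $$
declare t text;
begin
  for t in select tablename from pg_tables where schemaname = 'public'
  loop
    execute format('alter table public.%I set schema customer_x', t);
  end loop;
end $$;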
Unfortunately, there is no special way to convert an existing database into a schema in another database.
I forgot to post the answer. After all, klin's comment was the answer; these steps were the solution.
Inside the customer_x database:
alter schema public rename to customer_x;
And then take a pg_dump of customer_x:
pg_dump "customer_x" --schema "customer_x" -f customer_x.sql
Inside the new conglomerated database:
DROP schema customer_x CASCADE;
create schema customer_x;
Then load the dump of customer_x:
psql "conglomerated_database" -f customer_x.sql
I have a backup created like this:
pg_dump dbname > file
I am trying to restore the database (after drop database and create database) like this:
psql dbname < file
What I get is a database full of tables that are created with dbname.tablename instead of just tablename.
How do I restore a postgres database making sure the tables it creates has just tablename and not dbname.tablename?
Thanks to @Craig Ringer for pointing me in the right direction.
Yes, there was a SET search_path on the original database. This caused the dump to create the tables with the schema name prefixed.
Removing or commenting those out of the backup script created tables without a schema prefix, which was desirable. But the restore didn't result in a complete restore, and many tables got left out.
So I did the restore by the usual means. Tables are created with the schema name prefixed. The SQL query scripts broke because they were not specifying the schema name every time they queried a table. To fix this, I followed this answer - https://stackoverflow.com/a/2875705/1945517
ALTER ROLE <your_login_role> SET search_path TO dbname;
This fixed the broken queries.
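(A variant of the same fix, assuming you'd rather set the default per database than per role:
ALTER DATABASE dbname SET search_path TO dbname;
New sessions connecting to that database then resolve unqualified table names against the dbname schema.)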
We have a large PostgreSQL dump with hundreds of tables that I can successfully import with pg_restore. We are developing software that inserts into a lot of these tables (~100), and for every run we need to return these tables to their original state (that is, to the content that was in the dump). Restoring the original dump again takes a lot of time, and we just can't wait half an hour before every debugging session. So I need a relatively fast way to revert these tables to the state they are in right after restoring from the dump.
I've tried using pg_restore with the -L switch and selecting these tables, but I get either a duplicate key error when using both --data-only and --clean, or a "cannot drop table X because other objects depend on it" error when using only --clean. Issuing a SET CONSTRAINTS ALL DEFERRED command before pg_restore did not work either. Maybe I have the rows in the table list all wrong; right now it's
491; 1259 39623998 TABLE public some_table some_user
8021; 0 0 COMMENT public TABLE some_table some_user
8022; 0 0 ACL public some_table some_user
for every table and then
6700; 0 39624062 TABLE DATA public some_table postgres
8419; 0 0 SEQUENCE SET public some_table_pk_id_seq some_user
for every table.
We only insert data and don't update existing rows, so deleting all rows above a saved index and resetting the sequences might work. But I really don't want to have to manually create these commands for all hundred tables, and I'm not even sure it would work even if I set cascade to delete other objects that depend on a given row.
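(The per-table commands could at least be generated rather than written by hand. A sketch that emits one TRUNCATE per table in public, which would empty the tables for a --data-only restore instead of deleting above a saved index:
select format('truncate table only %I.%I restart identity cascade;', schemaname, tablename)
from pg_tables
where schemaname = 'public';
)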
Does anyone have any better idea how to handle this?
So you are looking for something like a snapshot in order to be able to revert quickly to a certain state.
I am not aware of a possibility in PostgreSQL to roll back to a certain timestamp.
While searching for a solution, I've found two ideas here
Use create database with the template option (see the sketch after this list)
Virtualize your PostgreSQL installation using VMware or VirtualBox, and use the snapshot feature of the virtual machines.
Again, both ideas are copied from the above source (I searched for "postgresql db snapshots").
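A sketch of the first idea, assuming you have the CREATEDB privilege and that nobody else is connected to the source database while it is being copied (names are illustrative):
create database restore_point template my_db;   -- snapshot before the tests
-- ... run the tests against my_db ...
drop database my_db;
create database my_db template restore_point;   -- revert to the snapshot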
You can use PITR (point-in-time recovery) to create a snapshot before loads, and use the PITR snapshot to take you back to any point that you have the WAL logs for.
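Roughly (a sketch of the moving parts on PostgreSQL 12+, assuming WAL archiving to /backups/wal is already configured; the paths and the timestamp are illustrative):
pg_basebackup -D /backups/base -X stream
# ... run the load/tests ...
# to revert: stop the server, restore /backups/base as the data directory,
# set in postgresql.conf:
#   restore_command = 'cp /backups/wal/%f %p'
#   recovery_target_time = '2024-01-01 12:00:00'
# then create recovery.signal in the data directory and start the server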