How to restore a database at runtime - PostgreSQL

I have a test database connected to a test server. I want to run a set of Selenium tests, and I have to restore the database after every test.
I made a backup with the CLI command "createdb", and I just drop the main table every time. But how can I restore the database without turning the whole server off and on (createdb can't run with any open connections)? Restarting for every test would make a full test run take hours or days.
I probably won't be given permanent admin access to the server, unless it's necessary.

You can kill all connections via SQL (see https://stackoverflow.com/a/5109190/2352344). Instead of dropping the whole database, you can just remove the schema:
DROP SCHEMA public CASCADE;
CREATE SCHEMA public;
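A minimal sketch of the connection-killing step, assuming a PostgreSQL 9.2+ server and a test database named dbname (both placeholders):
# terminate every other session connected to dbname
psql -d postgres -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'dbname' AND pid <> pg_backend_pid();"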

Instead of dropping the table, how about deleting the rows in the table? When you run the test, you know what entries will be made in the table. With this information, just before the test terminates, invoke a script that deletes the rows created by that test.
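As a rough sketch, assuming a hypothetical orders table whose test rows are tagged in a created_by column (both names are placeholders):
# remove only the rows that this test run created
psql -d test_db -c "DELETE FROM orders WHERE created_by = 'selenium';"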

You can use a real backup/restore tool (WAL-E, Barman or pgBackRest). In particular, pgBackRest can do a delta restore, where it restores only the files that have changed.
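A rough sketch of what that might look like with pgBackRest, assuming an already-configured repository and a stanza named main (both assumptions), and with the cluster stopped first:
pgbackrest --stanza=main --delta restore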

I solved the problem by making a bash script that I run from Java code.
String[] args = new String[]{"./script.sh"};
Process proc = new ProcessBuilder(args).start();
proc.waitFor();
script.sh:
#!/bin/bash
psql dbname -c "drop schema \"public\" cascade;"
psql dbname -c "create schema \"public\";"
psql dbname < "path/backupname"
I had to use a script rather than putting the commands directly into args, probably because of the "<" redirection; I found no flag replacement for it.
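(A possibly equivalent sketch for that last line, using psql's -f option instead of shell redirection; I haven't verified it behaves identically in this setup:)
psql dbname -f "path/backupname"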

Related

How to set default database in Postgresql database dump script?

I need to initialize a Postgres instance in a Docker container from a dump SQL file. Otherwise it works fine, but the problem is that I cannot set the database to be something other than "postgres". Creating a new database works fine, but the schema clauses, e.g. CREATE TABLE, end up going nowhere.
I tried to set the default database with the --env option in the docker run command, but it returns the error --env requires a value.
Is there any way to set the default database? Hopefully in an SQL clause.
Apparently you need to use \connect "dbname=[database name]" before the schema clauses in order to point the script towards the correct database.
This wasn't (quite understandably) included in the script, because the dump was generated only for a single database instead of the whole cluster.
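A minimal sketch of what the top of the edited dump might then look like (target_db and example_table are placeholder names):
\connect "dbname=target_db"

CREATE TABLE example_table (
    id integer PRIMARY KEY
);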

I have loaded wrong psql dump into my database, anyway to revert?

Ok, I screwed up.
I dumped one of my psql (9.6.18) staging databases with the following command
pg_dump -U postgres -d <dbname> > db.out
And after doing some testing, I "restored" the data using the following command.
psql -f db.out postgres
Notice the absence of the -d option? Yup. That trailing "postgres" was supposed to be the username.
And as the database happened to have the same name as its user, it overwrote the 'default' database (postgres), which had data that other QAs are using.
I cancelled the operation as soon as I realised my mistake, but the damage was still done. Around 1/3 ~ 1/2 of the database is roughly identical to the staging database - at least in terms of the schema.
Is there any way to revert this? I am still looking for any other dumps if any of these guys made one. But I don't think there is any past two to three months. Seems like I got no choice but to own up and apologise to them in the morning.
Without a recent dump or some sort of PITR/replication setup, you can't easily undo this. The only option is to manually go through the log of what was restored and remove/alter it in the postgres database. This will work for the schema; the data is another matter.
FYI, the postgres database should not really be used as a 'working' database. It is there to be a database to connect to for doing other operations, such as CREATE DATABASE, or to bootstrap your way into a cluster. If it had been left empty, the above would not have been a problem: you could have done, from another database, DROP DATABASE postgres; and then CREATE DATABASE postgres;.
Do you have a capture of the output of the psql -f db.out postgres run?
Since the pg_dump didn't specify --clean or -c, it should not have overwritten anything, just appended. And if your tables have unique or primary keys, most of the data copy operations should have failed with unique key violations and rolled back. Even one overlapping row (per table) would roll back the entire dataset for that table.
Without having the output, it will be hard to figure out what damage has actually been done.
You should also immediately copy the pg_xlog data someplace safe. If it comes down to it, you might be able to use pg_xlogdump to figure out what changes committed and what did not.
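A rough sketch of that safety copy and a later inspection, assuming a 9.6 cluster with a default data-directory layout (all paths and the segment name are placeholders):
# preserve the WAL segments before the server recycles them
cp -a "$PGDATA/pg_xlog" /safe/location/pg_xlog_copy

# later, inspect which transactions committed in a given segment
pg_xlogdump /safe/location/pg_xlog_copy/000000010000000000000001 | less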

Copy an entire data table (not entire database) from local machine to heroku postgres

I have a relatively large data table (~4m rows) that has been imported into a locally hosted PostgreSQL database. (As it happens it's a Ruby on Rails app database, but that shouldn't be important for the purposes of the question - unless it helps.)
I want to take that table and add it into an identical table in a heroku postgresql database (the table is currently empty).
How would I do that quickly and efficiently?
I found this: Copy a table from one database to another in Postgres
but I'm struggling with the syntax for the Heroku end, i.e. how do I connect to both at the same time? Which database am I connecting to originally?
In that answer, you are originally connected to the database "source_db" or "my_db" (depending on which line in the answer you are looking at). Presumably that database is on the instance running locally on port 5432, unless unshown environment variables (or non-default compilation) have changed that. And the destination database is named "target_db", running in the same instance.
The pg_dump and psql are independent commands and each takes all the connection options that they would take if run in isolation. So you would probably want something like:
pg_dump -t table_to_copy source_db | psql target_db -h you.heroku.hostname_or_ip
A problem could arise if both commands prompt for a password; that might make a mess: which password do you need to enter first, and whichever the order, will they be read correctly? If both need passwords, it is best to arrange for at least one of them to be supplied by ~/.pgpass.
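A minimal sketch of such a ~/.pgpass entry for the destination side (hostname, database, user and password are placeholders; the file must not be group- or world-readable):
echo "you.heroku.hostname_or_ip:5432:target_db:heroku_user:heroku_password" >> ~/.pgpass
chmod 600 ~/.pgpass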

Managing foreign keys when using pg_restore with multiple dumps

I have a bit of a weird issue. We were trying to create a database baseline for our local environment that has very specific data pre-seeded into it. Our hopes were to make sure that everyone was operating with the same data, making collaboration and reviewing code a bit simpler.
My idea for this was to run a command to dump the database whenever we run a migration or decide a new account is necessary for local dev. The issue with this is the database dump is around 17MB. I'm trying to avoid us having to add a 17MB file to GitHub every time we update the database.
So the best solution I could think of was to set up a script to dump each individual table in the database. This way, if a single table is updated, we'd only be pushing that backup to GitHub, and it would be more like a ~200 KB file as opposed to 17 MB.
The main issue I'm running into with this is trying to restore the database. With a full dump, handling the foreign keys is relatively simple as it's all done in a single restore command. But with multiple restores, it gets a bit more complicated.
I'm looking for a way to restore all tables to a database, ignoring triggers and constraints, and then enabling them again once the data has been populated (or a way to export the tables in the order their foreign keys are defined). There are a lot of tables to work with, so doing this manually would be a bit of a task.
I'm also concerned about the relational integrity of the database if I disabled/re-enable constraints. Any help or advice would be appreciated.
Right now I'm running the following on every single table:
pg_dump postgres://user:password@pg:5432/database -t table_name -Fc -Z9 -f /data/www/database/data/table_name.bak
And then this command to restore all backups to the DB.
$data_command = "pg_restore --disable-triggers -d $dbUrl -Fc \"%s\"";
$backups = glob("$directory*.bak");
foreach ($backups as $data_file) {
    if ($data_file != 'data_roles.bak') {
        exec(sprintf($data_command, $data_file));
    }
}
This obviously doesn't work, as I hit a ton of "relation does not exist" errors. I guess I'm just looking for a better way to accomplish this.
I would separate the table data and the database metadata.
Create pre- and post-data scripts with
pg_dump --section=pre-data -f pre.sql mydb
pg_dump --section=post-data -f post.sql mydb
Then dump just the data for each table:
pg_dump --section=data --table=tab1 -f tab1.sql mydb
To restore the database, first restore pre.sql, then all the table data, then post.sql.
The pre- and post-data will change often, but they are not large, so that shouldn't be a problem.
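A rough sketch of the whole cycle, assuming a source database named mydb, a restore target named mydb_restored, and placeholder table names:
# dump: schema skeleton, per-table data, then constraints/indexes/triggers
pg_dump --section=pre-data -f pre.sql mydb
pg_dump --section=data --table=tab1 -f tab1.sql mydb
pg_dump --section=data --table=tab2 -f tab2.sql mydb
pg_dump --section=post-data -f post.sql mydb

# restore: schema first, then the data files (order no longer matters), then constraints
psql -d mydb_restored -f pre.sql
for f in tab*.sql; do psql -d mydb_restored -f "$f"; done
psql -d mydb_restored -f post.sql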

mysqldump by query, then use to update remote database

I have a database containing a very large table including binary data which I want to update on a remote machine, once a day. Rather than dumping the entire table, transferring and recreating it on the remote machine, I want to dump only the updates, then use that dump to update the remote machine.
I already understand that I can produce the dump file as such.
mysqldump -u user --password=pass --quick --result-file=dump_file \
--where "Updated >= one_day_ago" \
database_name table_name
1) Does the resulting "restore" on the remote machine
mysql -u user --password=pass database_name < dump_file
only update the necessary rows? Or are there other adverse effects?
2) Is this the best way to do this? (I am unable to pipe to the server directly using the --host option.)
If you only dump rows where the WHERE clause is true, you will get a .sql file that contains only the values you want to update, so you will never get duplicate values with the current export options. However, inserting these into a database will not work as-is: you have to use the command-line parameter --replace. Otherwise, if your dump contains a row with id 6 in table table1 and you try to import it into your other database, you'll get a duplicate-key error if a row with id = 6 already exists. With the --replace parameter, the import overwrites the older values, which can only happen if there is a newer one (according to your WHERE clause).
So to quickly answer:
Yes, this will restore on the remote machine, but only if you dumped using --replace (it will restore the latest version of the rows you have).
I am not entirely sure if you can pipe backups. According to this website, you can, but I have never tried it before.
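A rough sketch of the full daily cycle under these assumptions (credentials, the one_day_ago expression and the transfer step are placeholders):
# on the source machine: dump only recently updated rows as REPLACE statements
mysqldump -u user --password=pass --quick --replace \
    --where "Updated >= one_day_ago" \
    --result-file=dump_file database_name table_name

# copy dump_file to the remote machine (scp, rsync, ...), then apply it there:
mysql -u user --password=pass database_name < dump_file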