pg_dump vs pg_dumpall? which one to use to database backups? - postgresql

I tried pg_dump and then on a separate machine I tried to import the sql and populate the database, I see
CREATE TABLE
ERROR: role "prod" does not exist
CREATE TABLE
ERROR: role "prod" does not exist
CREATE TABLE
ERROR: role "prod" does not exist
CREATE TABLE
ERROR: role "prod" does not exist
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
WARNING: no privileges could be revoked for "public"
REVOKE
ERROR: role "postgres" does not exist
ERROR: role "postgres" does not exist
WARNING: no privileges were granted for "public"
GRANT
which means my user and roles and grant information is not in pg_dump
On the other hand we have pg_dumpall, I read conversation, and this does not lead me anywhere?
Question
- Which one should I be using for database backups? pg_dump or pg_dumpall?
- the requirement is that I can take the backup and should be able to import to any machine and it should work just fine.

The usual process is:
pg_dumpall --globals-only to get users/roles/etc
pg_dump -Fc for each database to get a nice compressed dump suitable for use with pg_restore.
Yes, this kind of sucks. I'd really like to teach pg_dump to embed pg_dumpall output into -Fc dumps, but right now unfortunately it doesn't know how so you have to do it yourself.
Up until PostgreSQL 11 there was also a nasty caveat with this approach: Neither pg_dump, nor pg_dumpall in --globals-only mode would dump user access GRANTs on DATABASEs. So you pretty much had to extract them from the catalogs or filter a pg_dumpall. This is fixed in PostgreSQL 11; see the release notes.
Make pg_dump dump the properties of a database, not just its contents (Haribabu Kommi)
Previously, attributes of the database itself, such as database-level GRANT/REVOKE permissions and ALTER DATABASE SET variable settings, were only dumped by pg_dumpall. Now pg_dump --create and pg_restore --create will restore these database properties in addition to the objects within the database. pg_dumpall -g now only dumps role- and tablespace-related attributes. pg_dumpall's complete output (without -g) is unchanged.
You should also know about physical backups - pg_basebackup, PgBarman and WAL archiving, PITR, etc. These offer much "finer grained" recovery, down to the minute or individual transaction. The downside is that they take up more space, are only restoreable to the same PostgreSQL version on the same platform, and back up all tables in all databases with no ability to exclude anything.

Related

pg_dump and friends: backup and restore using tablespace

Given a database dump how to specify a tablespace to be used by all tables during restore? The database has multiple tablespaces used by its table. Old tablespaces should be ignored (they are not relevant on new computer) and all tablespaces my by replaced by a new one.
dump with "--no-tablespaces" parameter to have tablespaces-free dump - but you can also use the same parameter on pg_restore if you cannot change dump commands
set global parameter "default_tablespace" on target DB to what is needed for restore (for example by using alter database xxxxx set DEFAULT_TABLESPACE='xxx')
run all pg_restore tasks
if necessary reset default_tablespace to original value

How do I run and automate making a database with PSQL

I'm linking PSQL to a djangoREST api and I want it all to be automated, however I'm having trouble automating the creation of a database. Currently in my code I have this
initdb /usr/local/var/postgres, and while this creates a database I don't know how to change the owner and db name like you would when you would normally run
CREATE DATABASE <databasename>
I'm not sure if I understood your question fully. Two actions(rename/change owenr), you could do following way.
Rename database to new_databasename
ALTER DATABASE <databasename> RENAME TO <new_databasename>;
Change database owner to new_user
ALTER DATABASE <databasename> OWNER TO <new_databasename>;

pg_restore --clean is not dropping and clearing the database

I am having an issue with pg_restore --clean not clearing the database.
Or do I misunderstand what the --clean does, I am expecting it to truncate the database tables and reinitialize the indexes/primary keys.
I am using 9.5 on rds
This is the full command we use
pg_restore --verbose --clean --no-acl --no-owner -h localhost -U superuser -d mydatabase backup.dump
Basically what is happening is this.
I do a nightly backup of my production db, and restore it to an analytics db for the analyst to churn and run their reports.
I found out recently that the rails application used to view the reports was complaining that the primary keys were missing from the restored analytics database.
So I started investigating the production db, the analytics db etc. Which was when I realized that multiple rows with the same primary key existed in the analytics database.
I ran a few short experiments and realized that every time the pg_restore script is run it inserts duplicate data into the tables, this leads me to think that the --clean is not dropping and restoring the data. Because if I were to drop the schema beforehand, I don't get duplicate data.
To remove all tables from a database (but keep the database itself), you have two options.
Option 1: Drop the entire schema
You will need to re-create the schema and its permissions. This is usually good enough for development machines only.
DROP SCHEMA public CASCADE;
CREATE SCHEMA public;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO public;
Applications usually use the "public" schema. You may encounter other schema names when working with a (legacy) application's database.
Note that for Rails applications, dropping and recreating the database itself is usually fine in development. You can use bin/rake db:drop db:create for that.
Option 2: Drop each table individually
Prefer this for production or staging servers. Permissions may be managed by your operations team, and you do not want to be the one who messed up permissions on a shared database cluster.
The following SQL code will find all table names and execute a DROP TABLE statement for each.
DO $$ DECLARE
r RECORD;
BEGIN
FOR r IN (SELECT tablename FROM pg_tables WHERE schemaname = current_schema()) LOOP
EXECUTE 'DROP TABLE IF EXISTS ' || quote_ident(r.tablename) || ' CASCADE'; -- DROP TABLE IF EXISTS instead DROP TABLE - thanks for the clarification Yaroslav Schekin
END LOOP;
END $$;
Original:
https://makandracards.com/makandra/62111-how-to-drop-all-tables-in-postgresql

Postgres generate user grant statements for all objects

We have a dev Postgres DB that one of the developers has created an application in. Is there an existing query that will pull information from the role_table_grants table and generate all the correct statements to move into production? PGAdmin will create all the generate scripts for certain things but I haven't found a less manual way rather than just writing all the statements by hand based on the role_table_grants table. Not asking anyone to dump time into creating it, just thought I would ask if there are some existing migration scripts out there that would help.
Thanks.
Dump the schema to a file; use pg_dump or pg_dumpall with the --schema-only option.
Then use grep to get all the GRANT and REVOKE statements.
On my dev machine, I might do something like this.
$ pg_dump -h localhost -p 5435 -U postgres --schema-only sandbox > sandbox.sql
$ grep "^GRANT\|^REVOKE" sandbox.sql
REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
[snip]
Perhaps pg_dumpall is what you need. Probably with --schema-only option in order to dump just schema, not development data.
If you need to move not all databases, you can use pg_dumpall --globals-only to dump roles (which don't belong to any particular database), and then use pg_dump to dump one certain databases.

Database named "postgres"

I've just set up Postgres for use by different users on my network. Every user has his own username/password/database, but when I connect to Pg I can also see a 'postgres' database (and even create tables etc). I tried to REVOKE access to that database from public but then it won't let me connect. What exactly is the postgres database and why is it needed? Can I disable it so that users only see the database(s) I've created for them?
The postgres database is created by default when you run initdb.
Quote from the manual:
Creating a database cluster consists of creating the directories in which the database data will live (...) creating the template1 and postgres databases. When you later create a new database, everything in the template1 database is copied. (...) The postgres database is a default database meant for use by users, utilities and third party applications.
There is nothing special about it, and if you don't need it, you can drop it:
drop database postgres;
You need to do that as a superuser of course. The only downside of this is that when you run psql as the postgres operating system user, you need to explicitly provide a database name to connect to
If you drop the postgres database you'll find a few things to be confusing. Most tools default to using it as the default database to connect to, for one thing. Also, anything run under the postgres user will by default expect to connect to the postgres database.
Rather than dropping it, REVOKE the default connect right to it.
REVOKE connect ON DATABASE postgres FROM public;
The superuser (usually postgres), and any users you explicitly grant rights to access the database can still use it as a convenience DB to connect to. But others can't.
To grant connect rights to a user, simply:
GRANT connect ON DATABASE postgres TO myuser;