pg_basebackup fails with message: could not create directory - postgresql

I'm trying to create a hot standby server using PostgreSQL 9.3.5 on Red Hat 6.5.
I receive the following error when running pg_basebackup:
$ pg_basebackup -h 172.28.250.10 -D /var/lib/pgsql/9.3/data -U replicador -v -P
pg_basebackup: could not create directory "/var/lib/pgsql/9.3/data/osm_indices": File exists
/var/lib/pgsql/9.3/data exists and is empty when I launch the tool. When it fails, there is data at /var/lib/pgsql/9.3/data/osm_indices. The DB has 5 tablespaces and 4 are completely copied.
Both servers are running the same OS and DB server version.
I've tried the same with 2 different masters and 3 slaves with the same result, but it is not always the same tablespace that fails to copy.
Thanks,
Luis.

It looks like you might have tablespaces inside the data directory.
You should not do that. Tablespaces are meant to be separate paths, and some of the tools assume that they will be.
Move the tablespaces outside the datadir and pg_basebackup should behave, so long as you have corresponding paths on the destination server.
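To see which tablespaces are affected, a check along these lines may help. It is only a sketch: is_inside_datadir and check_tablespaces are hypothetical helper names, and the input is assumed to be "name|location" lines such as psql -At prints for SELECT spcname, pg_tablespace_location(oid) FROM pg_tablespace (that function should be available on 9.2 and later).

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not PostgreSQL tools.

# Succeeds when the location is the data directory itself or lies below it.
is_inside_datadir() {
    _loc=${1%/}   # strip one trailing slash, if any
    _dir=${2%/}
    case "$_loc/" in
        "$_dir"/*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Reads "name|location" lines on stdin and reports tablespaces that sit
# inside the data directory given as $1.
check_tablespaces() {
    datadir=$1
    while IFS='|' read -r name location; do
        # Built-in tablespaces (pg_default, pg_global) report an empty location.
        [ -n "$location" ] || continue
        if is_inside_datadir "$location" "$datadir"; then
            echo "$name: $location is inside $datadir"
        fi
    done
}
```

On the master this could be fed with something like psql -At -c "SELECT spcname, pg_tablespace_location(oid) FROM pg_tablespace" | check_tablespaces /var/lib/pgsql/9.3/data. Any line it prints is a tablespace that would need moving.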

Related

database backup in jelastic can't be done from the app node

My goal is to have an automatic database backup that is sent to my S3 bucket.
Jelastic has good documentation on how to run pg_dump inside the database node/container, but to obtain the backup file you have to fetch it manually using an FTP add-on!
As I said, my goal is to send the backup file to my S3 bucket automatically, so I tried running pg_dump from my app node instead of the PostgreSQL node (hoping to have some control from the app side). The command I run basically looks like this:
PGPASSWORD="my_database_password" pg_dump --host "nodeXXXX-XXXXX.jelastic.XXXXX.net" \
  -U my_db_username -p "5432" -f sql_backup.sql "database_name" 2> "$LOG_FILE"
The output of my log file is:
pg_dump: server version: 10.3; pg_dump version: 9.4.10
pg_dump: aborting because of server version mismatch
The issue here is that the database node has a different pg_dump version than the nginx/app node, so the backup can't be performed! I looked around but can't find an easy way to solve this. I am open to any alternative that helps to achieve my initial goal.
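The log above shows a 9.4.10 client talking to a 10.3 server, and pg_dump aborts when the server's major version is newer than its own. A small pre-flight check could catch that before the backup job runs. This is a sketch under assumptions: pg_major_num and client_can_dump are hypothetical helper names, and the version strings are assumed to look like "9.4.10" or "10.3".

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for comparing PostgreSQL major versions.

# Print a comparable number for the major version.
# Before 10 the major version is the first two components (9.4.10 -> 904);
# from 10 onward it is the first component only (10.3 -> 1000).
pg_major_num() {
    first=${1%%.*}
    if [ "$first" -ge 10 ]; then
        echo $((first * 100))
    else
        rest=${1#*.}
        echo $((first * 100 + ${rest%%.*}))
    fi
}

# Succeeds when a pg_dump of version $1 is at least as new as server version $2.
client_can_dump() {
    [ "$(pg_major_num "$1")" -ge "$(pg_major_num "$2")" ]
}
```

The client version could be taken from the third field of pg_dump --version, and the server version from SHOW server_version. If client_can_dump fails, the app node needs a pg_dump at least as new as the server.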

Backup and restore postgresql data folder directly

Till now I've been backing up my postgresql data using pg_dump, which exports the data to an sql file mydb.sql, and then restoring from that sql file using psql -U user -d db < mydb.sql.
For one reason or another it would be more convenient to restore the database content more directly, in an environment where psql does not exist... specifically on a host server where postgresql is installed in a docker container running on the host, but not on the host itself.
My plan is to back up the content of /var/lib/postgresql/data/ to a tar file, and when required (e.g. when a new server is created that hosts the postgresql container) just restore that to the same path. The folder /var/lib/postgresql/data/ in the docker container is mapped to a folder on the host server, so I would create this backup on the host, not inside the postgres container.
Is this a valid approach? Any "gotchas"? And are there any subfolders within /var/lib/postgresql/data/ that I can exclude from the tar file? I don't want to back up mere 'housekeeping' information.
You can do that, but you have to do it properly if you don't want your database to become corrupted.
Either stop PostgreSQL before copying the data directory or follow the instructions from the documentation for an online backup.
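The "stop first, then copy" option could look roughly like this when run on the host. It is a sketch, not a tested procedure: backup_datadir is a hypothetical helper, and it assumes the container has already been stopped (for example with docker stop) before it is called.

```shell
#!/bin/sh
# Sketch only: backup_datadir is a hypothetical helper for a cold (offline) backup.
# $1 = data directory on the host, $2 = archive to create (outside the data directory)
backup_datadir() {
    datadir=$1
    archive=$2
    # postmaster.pid is present while the server is running (a crash can also
    # leave it behind), so treat it as a reason to stop and check.
    if [ -e "$datadir/postmaster.pid" ]; then
        echo "postmaster.pid found in $datadir: stop PostgreSQL first" >&2
        return 1
    fi
    # -C keeps the paths in the archive relative to the data directory.
    tar -czf "$archive" -C "$datadir" .
}
```

Restoring is then a matter of unpacking the archive into an empty directory at the same path, with the same PostgreSQL major version, before starting the container.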

Migrating 200GB of Postgres data from 9.0 to 9.6

We have a simple database with just 5 tables. But 1 table is huge, around 100GB of data by itself, and the indices together are nearly double that size. The server is an old CentOS 5 server with PG 9.0. I'm moving to a more modern setup with SSD hard disks, CentOS 7, and PG 9.6.
Question: what's the best way to migrate the data simply? pg_dump it on the old server, move it via rsync or something to the new server, and pg_restore? I could do the pg_dump with the -Fc option, so that we can pg_restore it easily (otherwise it's a text format and we have to use psql -f instead). But a trial run suggested that while the pg_dump is OK, the pg_restore on the destination server, which is the much faster machine, goes on and on. We ran pg_restore --verbose, but it printed nothing at all. Perhaps the server was stuck doing IO?
Our postgresql.conf settings for the pg_restore are as follows:
maintenance_work_mem = 1500MB
fsync = off
synchronous_commit = off
wal_level = minimal
full_page_writes = off
wal_buffers = 64MB
max_wal_senders = 0
wal_keep_segments = 0
archive_mode = off
autovacuum = off
What should we do to ensure that the pg_restore works? Right now both servers are offline, so I can do pretty much anything needed -- any settings can be changed.
Some more background info--
Old server: CentOS 5, SCSI RAID 1 disks, 4GB RAM (not much), PG 9.0
New server: CentOS 7 (latest), SSD disk, 16GB RAM, PG 9.6
Thank you for any pointers on moving large tables in the best way possible. The usual PG documentation doesn't seem to be helping. We've tried both the text dump way and the -Fc way.
I strongly suggest you use pg_upgrade:
Install 9.0.23 on the new server. From source if necessary.
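Set up a streaming replica on the new server with a suitable recovery.conf. Note that pg_basebackup only exists from 9.1 onward, so for a 9.0 master take the base backup with pg_start_backup(), a file-level copy such as rsync, and pg_stop_backup(). Enable WAL archiving and restore_command too, in case it becomes desynchronised for any reason.
Also install 9.6 on the new server.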
Do an upgrade test by stopping the replica and attempting a pg_upgrade to 9.6. Restart the replica, fix any issues and repeat until you succeed.
When you're confident pg_upgrade will succeed, plan a cut-over time. Stop the 9.0 master and stop the replica. pg_upgrade the replica. Start the new 9.6 server.
See the pg_upgrade documentation for more info.
Remember: KEEP BACKUPS.
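For the test and cut-over steps, the pg_upgrade invocation might be assembled like this. It is a sketch: pg_upgrade_cmd is a hypothetical helper that only prints the command, and the directory names in the usage note are placeholders, not paths from the question. The new 9.6 cluster is assumed to have been created with initdb already.

```shell
#!/bin/sh
# Sketch only: prints a pg_upgrade command line instead of running it.
# $1 = "check" for a dry run (adds --check), anything else for the real upgrade
# $2 = old bin dir, $3 = new bin dir, $4 = old data dir, $5 = new data dir
pg_upgrade_cmd() {
    mode=$1 old_bin=$2 new_bin=$3 old_data=$4 new_data=$5
    check_flag=
    # --check only inspects the clusters and changes no data.
    [ "$mode" = check ] && check_flag=' --check'
    echo "$new_bin/pg_upgrade -b $old_bin -B $new_bin -d $old_data -D $new_data$check_flag"
}
```

For example, pg_upgrade_cmd check /opt/pg90/bin /opt/pg96/bin /srv/pg90/data /srv/pg96/data prints the dry-run command. Run the printed command as the postgres user with the stopped replica as the old cluster, and drop "check" only once the dry run is clean.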
If you want simple, just pg_dumpall and then pipe to psql. But that'll be slow, and it'll cause problems if your restore fails partway through and you then try to resume, etc.
Better:
If you don't want to use replication, then use parallel-mode pg_dump and pg_restore with directory format input/output if you want to get things done quickly.
Configure your 9.0 database to accept connections from the 9.6 host and make sure there's a high-performance network connection (gigabit or better).
Using the 9.6 host, running the 9.6 versions of pg_dump and pg_dumpall:
Dump your global objects with pg_dumpall --globals-only -f globals.sql
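Dump your database(s) with pg_dump -Fd -j4 -d dbname -f dbname.dumpdir or similar. -j is the number of parallel jobs. Because the 9.0 server predates synchronized snapshots (added in 9.2), a parallel dump from it also needs --no-synchronized-snapshots, so make sure nothing writes to the database while the dump runs. You'll need to dump each database separately if there are multiple ones.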
Cleanly initdb a new PostgreSQL 9.6 install, removing whatever attempts you have previously made (since I don't know what is/isn't there). Alternately, DROP any created roles, databases, etc, returning it to a clean state.
Use psql to run the globals script: psql -v ON_ERROR_STOP=1 --single-transaction -f globals.sql -d postgres
Use pg_restore to load the database dumps: pg_restore --create -d template1 -j4 dbname.dumpdir, repeating for each dumped DB. You can restore multiple DBs concurrently.
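Put together, the sequence for one database might look like the following. It is a sketch: migration_plan is a hypothetical helper that only prints the commands in order, and the -h host argument is an assumption about how the 9.6 host reaches the old server. The --no-synchronized-snapshots flag is included because a 9.0 source has no synchronized snapshots, which means nothing else should write to the database during the dump.

```shell
#!/bin/sh
# Sketch only: prints the dump/restore commands for one database; runs nothing.
# $1 = hostname of the old 9.0 server, $2 = database name, $3 = parallel jobs
migration_plan() {
    old_host=$1 dbname=$2 jobs=$3
    # 1. Roles and tablespaces, which pg_dump does not include.
    echo "pg_dumpall -h $old_host --globals-only -f globals.sql"
    # 2. Parallel directory-format dump, taken with the 9.6 client.
    echo "pg_dump -h $old_host -Fd -j$jobs --no-synchronized-snapshots -d $dbname -f $dbname.dumpdir"
    # 3. Load the globals into the freshly initialised 9.6 cluster.
    echo "psql -v ON_ERROR_STOP=1 --single-transaction -f globals.sql -d postgres"
    # 4. Parallel restore; --create makes the database before loading it.
    echo "pg_restore --create -d template1 -j$jobs $dbname.dumpdir"
}
```

Calling migration_plan with the old server's hostname, the database name and a job count prints the four commands for review before anything is run.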
Yes, I know the handling of global objects sucks. And yes, it'd be nice if all this were wrapped up in a simple command. But it isn't. Designs and well thought out patches are welcome if you want to try to improve this. So far nobody's wanted to enough to do the work.

Postgres Recovery Failure

What I am trying to accomplish is a recovery using a continuous archive backup.
I am running a VM with CentOS 6.8 and Postgres 9.1. Postgres 9.1 is the same version as the DB that I am pulling from.
I installed Postgres and initialized the DB, started up fine.
Then, following these directions: https://www.postgresql.org/docs/9.3/static/continuous-archiving.html
Stopped the destination PostgreSQL server (as root: service postgresql-9.1 stop)
Copied the destination cluster data folder to the side (as postgres)
Removed the cluster data files (as postgres)
Copied in my source data folder (as postgres)
Copied WAL files into a clean pg_xlog folder under the data folder (as postgres)
Created a recovery.conf file which contained:
restore_command = 'cp /var/lib/pgsql/database_sample_backup/wal_archives/0A/%f %p'
This is a second location for the WAL files, separate from the copy I placed in pg_xlog (I was not sure whether I needed both).
But when I attempt to restart my server, it fails. (as root: service postgresql-9.1 start)
My pgstartup.log at one point spit out "runuser: cannot set groups: Operation not permitted" but it doesn't consistently do this with every attempt to start.
I've also tried turning off the archiving and replication directives in postgresql.conf (so that it can run standalone) and tried copying over the pg_hba.conf from the new DB I had created to see if that would resolve the issue. Neither did.
I've also done a netstat -ntap | grep 5432 which confirmed that I don't have anything else running on the port.
What else can I provide in the form of details, and what else may I attempt in this restoration process?
Thank you for your help!

How to migrate data to remote server with PostgreSQL?

How can I dump my database schema and data in such a way that the usernames, database names and schema names of the dumped data match these variables on the servers I deploy to?
My current process entails moving the data in two steps. First, I dump the schema of the database (pg_dump --schema-only -C -c) then I dump out the data with pg_dump --data-only -C and restore these on the remote server in tandem using the psql command. But there has to be a better way than this.
We use the following to replicate databases.
pg_basebackup -x -P -D /var/lib/pgsql/9.2/data -h OTHER_DB_IP_ADDR -U postgres
It requires the "master" server at OTHER_DB_IP_ADDR to be running the replication service, and pg_hba.conf must allow replication connections. You do not have to run the "slave" service as a hot/warm standby in order to replicate. One downside of this method compared with a dump/restore: the restore operation effectively vacuums, re-indexes and resets EVERYTHING, while replication doesn't, so replicating can use a bit more disk space if your database has been heavily edited. On the other hand, replicating is MUCH faster (15 mins vs 3 hours in our case) since indexes do not have to be rebuilt.
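The pg_hba.conf entry that allows replication connections could be generated like this. It is a sketch: replication_hba_line is a hypothetical helper, and the address and auth method in the usage note are placeholders. On 9.2 the master is also assumed to need max_wal_senders above 0 and wal_level set to archive or hot_standby.

```shell
#!/bin/sh
# Sketch only: prints one pg_hba.conf line for replication connections.
# $1 = role used by pg_basebackup, $2 = client address (CIDR), $3 = auth method
replication_hba_line() {
    # pg_hba.conf fields: TYPE DATABASE USER ADDRESS METHOD
    # "replication" in the database field matches replication connections.
    printf 'host  replication  %s  %s  %s\n' "$1" "$2" "$3"
}
```

For instance, replication_hba_line postgres 192.0.2.20/32 md5 prints a line to append to pg_hba.conf on the master, followed by a configuration reload.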
Some useful references:
http://opensourcedbms.com/dbms/setup-replication-with-postgres-9-2-on-centos-6redhat-el6fedora/
http://www.postgresql.org/docs/9.2/static/high-availability.html
http://www.rassoc.com/gregr/weblog/2013/02/16/zero-to-postgresql-streaming-replication-in-10-mins/