How can I move postgresql data to another directory on Ubuntu over Amazon EC2?

We've been running postgresql 8.4 for quite some time. As with any database, we are slowly reaching our threshold for space. I added another 8 GB EBS drive and mounted it to our instance and configured it to work properly on a directory called /files
Within /files, I manually created a postgresql/main directory.
Correct me if I'm wrong, but I believe all postgresql data is stored in /var/lib/postgresql/8.4/main
I backed up the database and ran sudo /etc/init.d/postgresql stop, which stops the postgresql server. I tried to copy and paste the contents of /var/lib/postgresql/8.4/main into the /files directory, but that turned out to be a huge mess due to file permissions. I had to go in and chmod the contents of that folder just so that I could copy and paste them, and some files did not copy fully because of root permissions. I modified the data_directory parameter in postgresql.conf to point to the /files directory:
data_directory = '/files/postgresql/main'
and I ran sudo /etc/init.d/postgresql restart, but the server failed to start, again probably due to permission issues. Amazon EC2 only lets you log in as the ubuntu user by default; you can only become root from within the terminal, which makes everything a lot more complicated.
Is there a much cleaner and more efficient step by step way of doing this?

Stop the server.
Copy the datadir while retaining permissions - use cp -aRv.
Then (easiest, as it avoids the need to modify initscripts) just move the old datadir aside and symlink the old path to the new location.
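A minimal sketch of those steps, using the paths from the question (with the symlink in place, data_directory in postgresql.conf can stay at its default):
sudo /etc/init.d/postgresql stop
sudo mkdir -p /files/postgresql
# copy the data directory, preserving ownership and permissions
sudo cp -aRv /var/lib/postgresql/8.4/main /files/postgresql/
# postgres insists on owning the data directory with mode 0700
sudo chown -R postgres:postgres /files/postgresql/main
sudo chmod 700 /files/postgresql/main
# move the old datadir aside and point the old path at the new location
sudo mv /var/lib/postgresql/8.4/main /var/lib/postgresql/8.4/main.old
sudo ln -s /files/postgresql/main /var/lib/postgresql/8.4/main
sudo /etc/init.d/postgresql start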

Thanks for the accepted answer. Instead of the symlink you can also use a bind mount; that way it is independent of the file system. If you want to use a dedicated hard drive for the database, you can also mount it directly on the data directory (a sketch of the bind-mount variant follows the script below).
I did the latter. Here are my steps if someone needs a reference. I ran this as a script on many AWS instances.
# stop postgres server
sudo service postgresql stop
# create new filesystem in empty hard drive
sudo mkfs.ext4 /dev/xvdb
# mount it
mkdir /tmp/pg
sudo mount /dev/xvdb /tmp/pg/
# copy the entire postgres home dir content
sudo cp -a /var/lib/postgresql/. /tmp/pg
# mount it to the correct directory
sudo umount /tmp/pg
sudo mount /dev/xvdb /var/lib/postgresql/
# see if it is mounted
mount | grep postgres
# add the mount point to fstab
echo "/dev/xvdb /var/lib/postgresql ext4 rw 0 0" | sudo tee -a /etc/fstab
# when database is in use, observe that the correct disk is being used
watch -d grep xvd /proc/diskstats
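For the bind-mount variant mentioned above, a rough sketch, assuming the data has already been copied to /files/postgresql/main as in the question:
# bind-mount the new location over the original data directory path
sudo mount --bind /files/postgresql/main /var/lib/postgresql/8.4/main
# make the bind mount permanent
echo "/files/postgresql/main /var/lib/postgresql/8.4/main none bind 0 0" | sudo tee -a /etc/fstab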

A clarification: it is the particular AMI that you used that sets ubuntu as the default user; this may not apply to other AMIs.
In essence, if you are trying to move data manually, you will probably need to do so as the root user, and then make sure it is available to whatever user postgres is running as.
You also have the option of snapshotting the volume and creating a larger volume from the snapshot. Then you could replace the volume on your instance with the new one (you will probably have to resize the partition and filesystem to take advantage of the extra space).
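A rough sketch of that final resize step, assuming the enlarged volume shows up as /dev/xvdf with a single ext4 partition (device and partition names will vary):
# grow the partition to fill the larger volume (growpart is in cloud-guest-utils)
sudo growpart /dev/xvdf 1
# grow the ext4 filesystem to fill the partition
sudo resize2fs /dev/xvdf1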

Related

Postgresql path and LVM path issues while mapping to directory

I have created an EC2 instance, attached three EBS volumes (gp3=3, io1=4gb, io2=4), and mounted them.
I have installed PostgreSQL 8.4.18 from source on it and created a database with 2 million sample rows.
The PostgreSQL data directory is /usr/local/pgsql/data.
Now that my root volume is 95% full, I created an LVM volume spanning these three EBS volumes with pvcreate, vgcreate, and lvcreate, formatted it with ext4, and mounted it on /usr/local/pgsql.
Now when I try to start PostgreSQL by doing su - postgres and then /usr/local/pgsql/data/bin/pg_ctl -D usr/local/pgsql/data -l logfile start, it does not start and I get the error bash: /usr/local/pgsql/data/bin/pg_ctl: No such file or directory.
And if I do cd /usr/local/pgsql/data, all the data is gone (e.g. the config files, logs, hba files, etc.). Also, when I open /home/postgres/usr/local/pgsql/data/postgresql.conf I see a blank page.
If I mount the LVM volume on other directories it works and I can see the conf files etc. I want to mount it on the same directory so that, once my root volume is full, the further sample tables I create are stored on the LVM volume when there is no space left on the root volume.
I tried checking the conf files and uninstalling PostgreSQL, but got no results. I tried the same on PostgreSQL v12 and faced the same error; then I just uninstalled Postgres and re-installed it with its default directory and it worked there, but not on v8.4.18.
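Note that mounting a new filesystem on /usr/local/pgsql hides whatever was already stored in that directory until it is unmounted, which is why the data seems to disappear. The usual order is the one in the script further up: mount temporarily, copy, then remount on the real path. A rough sketch for this layout (the volume group/logical volume names vg_pg/lv_pg are hypothetical):
# mount the new LVM filesystem somewhere temporary and copy the data over
sudo mount /dev/vg_pg/lv_pg /mnt
sudo cp -a /usr/local/pgsql/. /mnt
# then remount it on the real path
sudo umount /mnt
sudo mount /dev/vg_pg/lv_pg /usr/local/pgsql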

How to Mount Disk for Google Cloud Compute Engine to use with /home?

I have a VM Instance with a small 10GB boot disk running CentOS 7 and would like to mount a larger 200GB Persistent Disk to contain data relating to the /home directory from a previous dedicated server (likely via scp).
Here's what I tried:
Attempt #1: Symlinks. Might work, but I have some questions.
mounted the disk to /mnt/disks/my-persistent-disk
created folders on the persistent disk that mirror the folders in the old server's /home directory.
created a symlink in the /home directory for each folder, pointing to the persistent disk.
scp'd from the old server to the VM's /home/example_account for the first account. Realized scp does not follow symlinks (oops), and therefore the files went to the boot drive instead of the disk.
I suppose I could scp to /mnt/disks/my-persistent-disk and manage the symlinks and folders. Would this pose a problem? Would making an image of the VM with this configuration carry over to new instances (with autoscaling etc)?
Attempt #2: Mounting into /home.
Looking for a more 'natural' configuration that works with ftp, scp etc, I mounted the disk at /home/example_account:
$ sudo mkdir -p /home/example_account
$ sudo mount -o discard,defaults /dev/sdc /home/example_account
$ sudo chmod a+w /home/example_account
# set the UUID for mounting at startup
$ sudo blkid /dev/sdc
$ sudo nano /etc/fstab
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 10G 0 disk
└─sda1 8:1 0 10G 0 part /
sdc 8:32 0 200G 0 disk /home/example_account
scp from the old server to /home/example_account on the VM works fine. Yay. However, I would like to have more than just one folder in the /home directory. I suppose I could partition the disk, but this feels a bit cumbersome and I'm not exactly sure how many accounts I will use in the future.
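For reference, the /etc/fstab entry from attempt #2 would look something like this (the UUID is a placeholder for the value blkid reports):
UUID=<uuid-from-blkid> /home/example_account ext4 discard,defaults,nofail 0 2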
Attempt #3: Mount as /home
I felt the best solution was to have the persistent disk mount as the /home directory. This would allow for easily adding new accounts within /home without symlinks or disk partitions.
Attempted to move the /home directory to /home.old but realized Google Cloud Compute Engine would not allow it since I was logged into the system.
Changed to the root user, but it still said myusername#instance was logged in and using the /home directory. As root, I issued pkill -KILL -u myusername and the SSH session terminated - apparently how Google Cloud Compute Engine works with their SSH windows.
As I cannot change the /home directory, this method does not seem viable unless there is a workaround.
My thoughts:
Ideally, I think #3 is the best solution, but perhaps there is something I'm missing (a #4 solution), or one of the above approaches is preferable with better execution.
My question:
In short, how do I move an old server's data to a Google Cloud VM with a persistent disk?

Docker mongodb - add database on disk to container

I am running Docker on windows and I have a database with some entries on disk at C:\data\db.
I want to add this database to my container. I have tried numerous ways to do this but failed.
I tried: docker run -p 27017:27017 -v //c/data/db:/data/db --name mongodb devops-mongodb
In my dockerfile I have:
RUN mkdir -p /data/db
VOLUME /data/db
But this doesn't add my current database on disk to the container. It creates a fresh /data/db directory and persists the data I add to it.
The docs here https://docs.docker.com/userguide/dockervolumes/ under 'Mount a host directory as a data volume' specifically told me to execute the -v //c/data/db:/data/db but this isn't working.
Any ideas?
You're using Boot2Docker (which runs inside a Virtual Machine). Boot2Docker uses VirtualBox guest additions to make directories on your Windows machine available to Docker running inside the Virtual Machine.
By default, only the C:\Users directory (on Windows), or /Users/ directory (on OS X) is shared with the virtual machine. Anything outside those directories is not shared with the Virtual Machine, which results in Docker creating an empty directory at the specified location for the volume.
To share directories outside C:\Users\ with the Virtual Machine, you have to manually configure Boot2Docker to share them. You can find the steps needed in the VirtualBox guest additions section of the README:
If some other path or share is desired, it can be mounted at run time by doing something like:
$ mount -t vboxsf -o uid=1000,gid=50 your-other-share-name /some/mount/location
It is also important to note that, in the future, the plan is to have any share created in VirtualBox with the "automount" flag turned on be mounted during boot at the directory of the share name (i.e., a share named home/jsmith would be automounted at /home/jsmith).
Please be aware that using VirtualBox guest additions has a really bad impact on performance (reading/writing to the volume will be really slow). That may be fine for development, but it should be used with caution.
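As a sketch, sharing C:\data with the Boot2Docker VM could look roughly like this (run on the Windows host; boot2docker-vm is the default VM name, and the share name c/data is just an example):
boot2docker stop
VBoxManage sharedfolder add boot2docker-vm --name "c/data" --hostpath "C:\data" --automount
boot2docker up
# then, inside the VM (boot2docker ssh):
sudo mkdir -p /c/data
sudo mount -t vboxsf -o uid=1000,gid=50 c/data /c/data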

Where is the postgresql wal located? How can I specify a different path?

I want to create a database with the data files and the wal on different filesystems. I want the wal on a separate server over NFS, to avoid a loss of data in case of a fs/disk crash.
Where is the wal written?
Can I force it to a different location than the default via the configuration?
I'm on 9.1 if that matters.
Thanks.
The WAL files are written to the directory pg_xlog inside the data directory. Starting with Postgres 10, this directory was renamed to pg_wal.
E.g. /var/lib/postgresql/10/main/pg_wal
See the manual for details:
http://www.postgresql.org/docs/9.1/static/wal-configuration.html
http://www.postgresql.org/docs/current/static/wal-configuration.html
If I'm not mistaken, this directory name cannot be changed. But it can be a symbolic link that points to a different disk.
In fact, this is recommended for tuning WAL performance (see here: http://wiki.postgresql.org/wiki/Installation_and_Administration_Best_practices#WAL_Directory)
To copy the WAL directory to another file path/disk drive, follow the steps below:
Descriptive Steps
Turn off Postgres to protect against corruption
Copy the WAL directory (by default on Ubuntu, /var/lib/postgresql/<version>/main/pg_wal) to the new file path using rsync. The -a flag preserves file/folder permissions and folder structure. Leave off the trailing slash so rsync copies the directory itself.
Verify the contents copied correctly
Rename pg_wal to pg_wal-backup in the Postgres data directory ($PG_DATA)
Create a symbolic link named pg_wal in the Postgres data directory ($PG_DATA) pointing to the new path, and change the ownership of the symbolic link to the postgres user
Start Postgres and verify that you can connect to the database
Optionally, delete the pg_wal-backup directory in the Postgres data directory ($PG_DATA)
Matching Commands
sudo service postgresql stop
sudo rsync -av /var/lib/postgresql/12/main/pg_wal /<new_path>
ls -la /<new_path>
sudo mv /var/lib/postgresql/12/main/pg_wal /var/lib/postgresql/12/main/pg_wal-backup
sudo ln -s /<new_path>/pg_wal /var/lib/postgresql/12/main/pg_wal
sudo chown -h postgres:postgres /var/lib/postgresql/12/main/pg_wal
sudo service postgresql start && sudo service postgresql status
# Verify DB connection using your db credentials/information
psql -h localhost -U postgres -p 5432
sudo rm -rf /var/lib/postgresql/12/main/pg_wal-backup

Postgresql COPY command giving Permissions denied error

I am trying to COPY a file into a table in PostgreSQL. The table owner is postgres and the file owner is postgres.
The file is in /tmp.
Still I am getting the error message:
could not open file "/tmp/file" for reading: Permission denied
I don't understand what I am doing wrong as all the posts I've found say that if I have the file in /tmp and owner is postgres then the COPY command should work.
A guess: You are using Fedora, Red Hat Enterprise Linux, CentOS, Scientific Linux, or one of the other distros that enable SELinux by default.
Either, on your particular OS/version, the SELinux policies for PostgreSQL do not permit the server to read files outside the PostgreSQL data directory, or the file was created by a service covered by a targeted policy, so it has a label that PostgreSQL isn't allowed to read.
You can confirm whether or not this is the problem by running, as root:
setenforce 0
then re-testing. Run:
setenforce 1
to re-enable SELinux after testing. setenforce isn't permanent; SELinux will be automatically re-enabled on reboot anyway. Disabling SELinux permanently is not usually a good solution for issues like this; if you confirm the issue is SELinux it can be explored further.
Since you have not specified the OS or version you are using, the PostgreSQL version, the exact command you're running, ls -al on the file, \d+ on the table, etc, it's hard to give any more detail, or to know if this is more than a guess. Try updating your question to include all that and an ls --lcontext of the file too.
COPY with a file name instructs the PostgreSQL server to directly read from or write to a file. The file must be accessible by the PostgreSQL user (the user ID the server runs as) and the name must be specified from the viewpoint of the server. (source: postgresql documentation)
So the file should be readable (or writable) by the unix user the postgresql server is running as (i.e. not your user!). To be absolutely sure, you can try running sudo -u postgres head /tmp/test.csv (assuming you are allowed to use sudo and assuming the server runs as the postgres user).
If that fails, it might be an issue related to SELinux (as mentioned by Craig Ringer). Under the most common SELinux policy (the "targeted" reference policy), used by Red Hat/Fedora/CentOS, Scientific Linux, Debian and others, the postgresql server process is confined: it can only read/write a few file types.
The denial might not be logged in auditd's log file (/var/log/audit/audit.log) due to a dontaudit rule. So the usual SELinux quick test applies, e.g.: stop SELinux from confining any process by running getenforce; setenforce 0; getenforce, then test postgresql's COPY. Then re-activate SELinux by running setenforce 1 (this command modifies the running state, not the configuration file, so SELinux will be active (Enforcing) again after a reboot).
The proper way to fix that is to change the SELinux context of the file to load. A quick hack is to run:
chcon -t postgresql_tmp_t /tmp/a.csv
But this file labelling will not survive if the filesystem is relabelled or if you create a new file. You will need to create a directory with an SELinux file context mapping:
which semanage || yum install policycoreutils-python
semanage fcontext -a -t postgresql_tmp_t '/srv/psql_copydir(/.*)?'
mkdir /srv/psql_copydir
chmod 750 /srv/psql_copydir
chgrp postgres /srv/psql_copydir
restorecon -Rv /srv/psql_copydir
ls -Zd /srv/psql_copydir
Any file created in that directory should have the proper file context automatically so postgresql server can read/write it.
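As a usage sketch (the database and table names mydb/mytable here are hypothetical):
# copy the CSV into the labelled directory; files created there get the right context
sudo cp /tmp/a.csv /srv/psql_copydir/
sudo -u postgres psql -d mydb -c "COPY mytable FROM '/srv/psql_copydir/a.csv' WITH (FORMAT csv)"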
(To check the SELinux context under which postgres is running, run ps xaZ | grep "postmaste[r]" | grep -o "[a-z_]*_t", which should print postgresql_t. To list the context types to which postgresql_t can write, use sesearch -s postgresql_t -A | grep ': file.*write'. The sesearch command belongs to the setools-console RPM package.)