Postgresql path and LVM path issues while mapping to directory - postgresql

I have created an EC2 instance, attached 3 EBS volumes (gp3=3, io1=4gb, io2=4), and mounted them.
I installed PostgreSQL v8.4.18 from source on it and created a database with 2 million sample entries.
The PostgreSQL data directory is /usr/local/pgsql/data.
Now my root volume is 95% full. I created an LVM volume from these 3 EBS volumes with pvcreate, vgcreate and lvcreate, formatted it with an ext4 file system, and mounted it on the directory /usr/local/pgsql.
Now when I try to start PostgreSQL by running su - postgres and then /usr/local/pgsql/data/bin/pg_ctl -D usr/local/pgsql/data -l logfile start, it does not start and I get the error bash: /usr/local/pgsql/data/bin/pg_ctl: No such file or directory.
And if I do cd /usr/local/pgsql/data, all the data is gone (e.g. pg config files, logs, hba files, etc.). Also, when I open /home/postgres/usr/local/pgsql/data/postgresql.conf I see a blank page.
If I mount the LVM volume on other directories it works and I can see the conf files etc. I want to mount it on the same directory so that once my root volume is full, the further sample tables I am creating are stored on the LVM volume.
I tried checking the conf files and uninstalling PostgreSQL, but got no results. I tried the same on PostgreSQL v12 and faced the same error; then I uninstalled Postgres and re-installed it with its default directory and it worked there, but not on v8.4.18.
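One likely explanation, offered as an assumption because the question does not show the state of the directory before and after the mount: mounting a filesystem on a directory hides whatever that directory already held until the filesystem is unmounted. A freshly formatted volume mounted on /usr/local/pgsql would therefore show none of the existing files. The sketch below only checks whether a target directory is empty before it is used as a mount point; `dir_is_empty` is a hypothetical helper name.

```shell
# Return 0 if the directory exists and holds no entries (dotfiles included).
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Example: warn before mounting over a populated directory.
target="${1:-/usr/local/pgsql}"
if dir_is_empty "$target"; then
    echo "$target is empty; mounting here hides nothing"
else
    echo "$target is missing or not empty; copy its contents to the new volume first"
fi
```

If the directory is not empty, the existing contents would need to be copied onto the new volume (while it is mounted somewhere else) before it is mounted at the final path.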

Related

Why postgresql docker compose up volume setting must be /var/lib/postgresql/data?

I am a beginner Docker user.
I installed Docker and PostgreSQL on macOS.
Why do most documents mention the directory
/var/lib/postgresql/data as a volume setting?
In my local file system, /var/lib/postgresql does not exist.
Is it a default option, or am I missing something?
Yes, correct: /var/lib/postgresql does not exist on your local computer, but it does exist in the created container. The volumes parameter associates local data with container data, so that the data is preserved if the container crashes.
For example:
volumes:
- ./../database/main:/var/lib/postgresql/data
Above, the local directory on the left side is linked to the container directory on the right side.
If you are using official PostgreSQL image from Docker Hub, then you can check the contents of its Dockerfile. E.g. here is a fragment of postgres:15 image responsible for data directory:
ENV PGDATA /var/lib/postgresql/data
# this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA"
VOLUME /var/lib/postgresql/data
As you can see, Postgres is configured to keep its data in that directory. To persist the data even if the container is stopped and removed, a volume is created. Volumes have a lifetime independent of the container, which allows them to "survive".

Postgres volume mounting on WSL2 and Docker desktop: Permission Denied on PGDATA folder

There are some similar posts, but this one is specifically about running Postgres with the WSL2 backend on Docker Desktop. WSL2 brings the full Linux experience to Windows. Volumes can be mounted to both Windows and Linux file systems, but the best practice is to use the Linux file system for performance reasons; see the Docker documentation:
Performance is much higher when files are bind-mounted from the Linux filesystem, rather than remoted from the Windows host. Therefore avoid docker run -v /mnt/c/users:/users (where /mnt/c is mounted from Windows).
Instead, from a Linux shell use a command like docker run -v ~/my-project:/sources where ~ is expanded by the Linux shell to $HOME.
My WSL distro is Ubuntu 20.04 LTS. I'm bind mounting Postgres data directory to a directory on Linux filesystem and I'm also configuring the Postgres PGDATA to use a sub-directory because this is instructed on the official Docker image docs:
PGDATA
This optional variable can be used to define another location - like a subdirectory - for the database files. The default is /var/lib/postgresql/data. If the data volume you're using is a filesystem mountpoint (like with GCE persistent disks) or remote folder that cannot be chowned to the postgres user (like some NFS mounts), Postgres initdb recommends a subdirectory be created to contain the data.
So this is how I start Postgres with the volume mounting to WSL2 Ubuntu file system:
docker run -d \
--name some-postgres -e POSTGRES_PASSWORD=root \
-e PGDATA=/var/lib/postgresql/data/pgdata \
-v ~/custom/mount:/var/lib/postgresql/data \
postgres
I can exec into the running container and verify that the data folder exists and is configured correctly.
But from the host machine (WSL2 Linux), if I try to access that folder I get permission denied.
I would appreciate it if anyone can provide a solution. None of the existing posts worked to resolve the issue.
This has got nothing to do with PostgreSQL. Docker containers run as root and so any directory created by Docker will also belong to root.
When you attach to the container and list the directory under /var/lib/postgresql/data it shows postgres as the owner.
Check the "Arbitrary --user Notes" section in the official image documentation.
The second option "bind-mount /etc/passwd read-only from the host" worked for me.
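That second option can be sketched as follows. The --user and /etc/passwd arguments follow the shape shown in the official image documentation; the container name, password, PGDATA value, and host path are reused from the question, and `postgres_run_cmd` is a hypothetical helper that only prints the command so it can be inspected before it is run. It assumes the host directory already exists and belongs to the invoking user.

```shell
# Print a docker run command that starts Postgres as the invoking host user.
# Nothing is executed here; run the printed command once it looks right.
postgres_run_cmd() {
    host_dir="$1"
    printf 'docker run -d --name some-postgres'
    # Run the server with the host user's UID and GID instead of the image default.
    printf ' --user %s:%s' "$(id -u)" "$(id -g)"
    # Let the container resolve that UID to a user name (read-only mount).
    printf ' -v /etc/passwd:/etc/passwd:ro'
    printf ' -v "%s:/var/lib/postgresql/data"' "$host_dir"
    printf ' -e PGDATA=/var/lib/postgresql/data/pgdata'
    printf ' -e POSTGRES_PASSWORD=root postgres\n'
}

postgres_run_cmd "$HOME/custom/mount"
```

With this setup the files under the bind mount are created with the host user's UID, so they should be readable from the WSL2 shell without sudo.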
Two things that were blocking us working with WSL2 on Windows were:
Folder c:\Program files\WindowsApps didn't have admin account listed as owner
McAfee was blocking the WSL. In order to disable blocking we had to remove following rule: Open McAfee -> Threat Prevention -> Show Advanced (button in Right upper corner) -> scroll down to Rules -> name of the rule is "Executing Subsystem for Linux"

How to Mount Disk for Google Cloud Compute Engine to use with /home?

I have a VM Instance with a small 10GB boot disk running CentOS 7 and would like to mount a larger 200GB Persistent Disk to contain data relating to the /home directory from a previous dedicated server (likely via scp).
Here's what I tried:
Attempt #1, Symlinks. Might work, but some questions.
mounted the disk to /mnt/disks/my-persistent-disk
created folders on the persistent disk that mirror the folders in the old server's /home directory.
created a symlink in the /home directory for each folder, pointing to the persistent disk.
scp from old server to the VM /home/example_account for the first account. Realized scp does not follow symlinks (oops) and therefore the files went to the boot drive instead of the disk.
I suppose I could scp to /mnt/disks/my-persistent-disk and manage the symlinks and folders. Would this pose a problem? Would making an image of the VM with this configuration carry over to new instances (with autoscaling etc)?
Attempt #2, Mounting into /home.
Looking for a more 'natural' configuration that works with ftp, scp etc, I mounted the disk in /home/example_account
$ sudo mkdir -p /home/example_account
$ sudo mount -o discard,defaults /dev/sdc /home/example_account
$ sudo chmod a+w /home/example_account
#set the UUID for mounting at startup
$ sudo blkid /dev/sdc
$ sudo nano /etc/fstab
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 10G 0 disk
└─sda1 8:1 0 10G 0 part /
sdc 8:32 0 200G 0 disk /home/example_account
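The /etc/fstab entry mentioned in the steps above could look like the line this sketch prints. The ext4 filesystem type and the nofail option are assumptions (the question states neither), `fstab_entry` is a hypothetical helper, and the UUID is a placeholder for the value that blkid reports.

```shell
# Build an /etc/fstab line that mounts a disk by UUID.
# Arguments: UUID, mount point, filesystem type (ext4 assumed if omitted).
fstab_entry() {
    uuid="$1"; mount_point="$2"; fs_type="${3:-ext4}"
    # nofail lets the VM finish booting even if the disk is detached.
    printf 'UUID=%s %s %s discard,defaults,nofail 0 2\n' "$uuid" "$mount_point" "$fs_type"
}

# Placeholder UUID; substitute the one printed by: sudo blkid /dev/sdc
fstab_entry "00000000-0000-0000-0000-000000000000" /home/example_account
```

Mounting by UUID rather than by device name avoids surprises if the disk shows up under a different /dev/sdX name after a reboot.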
scp from old server to the VM in the /home/example_account works fine. Yay. However, I would like to have more than just 1 folder in the /home directory. I suppose I could partition the disk but this feels a bit cumbersome and I'm not exactly sure how many accounts I will use in the future.
Attempt #3, Mount as /home
I felt the best solution was to have the persistent disk mount as the /home directory. This would allow for easily adding new accounts within /home without symlinks or disk partitions.
Attempted to move /home directory to /home.old but realized the Google Cloud Compute Engine would not allow it since I was logged into the system.
Changed to the root user, but it still said myusername@instance was logged in and using the /home directory. As root, I issued pkill -KILL -u myusername and the SSH session terminated, which is apparently how the Google Cloud Compute Engine works with its SSH windows.
As I cannot change the /home directory, this method does not seem viable unless there is a workaround.
My thoughts:
Ideally, I think #3 is the best solution but perhaps there is something I'm missing (#4 solution) or one of the above situations is the preferable idea but perhaps with better execution.
My question:
In short, how do I move an old server's data to a Google Cloud VM with a persistent disk?

Where is the postgresql wal located? How can I specify a different path?

I want to create a database with the data files and the wal on different filesystems. I want the wal on a separate server over NFS, to avoid a loss of data in case of a fs/disk crash.
Where is the wal written?
Can I force it to a different location than the default via the configuration?
I'm on 9.1 if that matters.
Thanks.
The WAL files are written to the directory pg_xlog inside the data directory. Starting with Postgres 10, this directory was renamed to pg_wal.
E.g. /var/lib/postgresql/10/main/pg_wal
See the manual for details:
http://www.postgresql.org/docs/9.1/static/wal-configuration.html
http://www.postgresql.org/docs/current/static/wal-configuration.html
If I'm not mistaken, this directory name can not be changed. But it can be a symbolic link that points to a different disk.
As a matter of fact this is actually recommended to tune WAL performance (See here: http://wiki.postgresql.org/wiki/Installation_and_Administration_Best_practices#WAL_Directory)
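The naming rule from this answer (pg_xlog before version 10, pg_wal from 10 onward) can be written as a small helper. This is only a sketch; `wal_dir_name` is a hypothetical function name, not something Postgres ships.

```shell
# Print the WAL directory name for a given PostgreSQL version string:
# pg_xlog before version 10, pg_wal from version 10 onward.
wal_dir_name() {
    major="${1%%.*}"   # keep the part before the first dot: "9.1" -> "9"
    if [ "$major" -ge 10 ]; then
        echo pg_wal
    else
        echo pg_xlog
    fi
}

wal_dir_name 9.1   # prints pg_xlog
wal_dir_name 12    # prints pg_wal
```

Running ls -ld on that directory inside the data directory then shows whether it is a real directory or already a symbolic link to another disk.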
To copy the WAL directory to another file path or disk drive, follow the steps below:
Descriptive Steps
Turn off Postgres to protect against corruption
Copy the WAL directory (by default on Ubuntu, /var/lib/postgresql/<version>/main/pg_wal) to the new file path using rsync. The -a flag preserves file/folder permissions and the folder structure. Leave off the trailing slash so that the directory itself is copied.
Verify the contents copied correctly
Rename pg_wal to pg_wal-backup in the Postgres data directory ($PGDATA)
Create a symbolic link named pg_wal in the Postgres data directory ($PGDATA) that points to the new path, and change the owner of the symbolic link to the postgres user
Start Postgres and verify that you can connect to the database
Optionally, delete the pg_wal-backup directory in the Postgres data directory ($PGDATA)
Matching Commands
# Stop Postgres to protect against corruption
sudo service postgresql stop
# Copy the directory itself (no trailing slash), giving /<new_path>/pg_wal
sudo rsync -av /var/lib/postgresql/12/main/pg_wal /<new_path>
sudo ls -la /<new_path>/pg_wal
sudo mv /var/lib/postgresql/12/main/pg_wal /var/lib/postgresql/12/main/pg_wal-backup
# Link to the copied directory, not to its parent
sudo ln -s /<new_path>/pg_wal /var/lib/postgresql/12/main/pg_wal
sudo chown -h postgres:postgres /var/lib/postgresql/12/main/pg_wal
sudo service postgresql start && sudo service postgresql status
# Verify DB connection using your db credentials/information
psql -h localhost -U postgres -p 5432
# Optional: remove the backup once everything works
sudo rm -rf /var/lib/postgresql/12/main/pg_wal-backup
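Before touching a live cluster, the copy, rename, and symlink steps can be rehearsed on scratch directories. This is only a sketch under stated assumptions: it uses cp -a in place of rsync, the directory and file names are made up for the rehearsal, `rehearse_wal_move` is a hypothetical helper, and no Postgres server is involved.

```shell
# Rehearse the move inside a scratch directory (no real cluster involved).
rehearse_wal_move() {
    scratch="$1"
    data_dir="$scratch/main"        # stands in for the Postgres data directory
    new_disk="$scratch/new_disk"    # stands in for the new mount point
    mkdir -p "$data_dir/pg_wal" "$new_disk"
    echo "segment" > "$data_dir/pg_wal/000000010000000000000001"

    # Copy the directory itself, so the result is $new_disk/pg_wal.
    cp -a "$data_dir/pg_wal" "$new_disk/"

    # Keep the original as a backup and point the old path at the copy.
    mv "$data_dir/pg_wal" "$data_dir/pg_wal-backup"
    ln -s "$new_disk/pg_wal" "$data_dir/pg_wal"

    # Print where the old path now points.
    readlink "$data_dir/pg_wal"
}

scratch="$(mktemp -d)"
rehearse_wal_move "$scratch"
rm -rf "$scratch"
```

The point of the rehearsal is the link target: the symlink has to point at the copied pg_wal directory itself, not at the directory that contains it.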

How can I move postgresql data to another directory on Ubuntu over Amazon EC2?

We've been running postgresql 8.4 for quite some time. As with any database, we are slowly reaching our threshold for space. I added another 8 GB EBS drive and mounted it to our instance and configured it to work properly on a directory called /files
Within /files, I manually created
Correct me if I'm wrong, but I believe all postgresql data is stored in /var/lib/postgresql/8.4/main
I backed up the database and ran sudo /etc/init.d/postgresql stop, which stops the postgresql server. I tried to copy the contents of /var/lib/postgresql/8.4/main into the /files directory, but that turned out to be a HUGE MESS due to file permissions. I had to go in and chmod the contents of that folder just so that I could copy them. Some files did not copy fully because of root permissions. I modified the data_directory parameter in postgresql.conf to point to the files directory:
data_directory = '/files/postgresql/main'
and I ran sudo /etc/init.d/postgresql restart and the server failed to start. Again probably due to permission issues. Amazon EC2 only allows you to access the service as ubuntu by default. You can only access root from within the terminal which makes everything a lot more complicated.
Is there a much cleaner and more efficient step by step way of doing this?
Stop the server.
Copy the datadir while retaining permissions - use cp -aRv.
Then (easiest, as it avoids the need to modify initscripts) just move the old datadir aside and symlink the old path to the new location.
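The copy step matters because Postgres checks the ownership and mode of its data directory at startup, which is why a copy followed by manual chmod tends to fail. The sketch below shows on scratch paths that cp -a carries the directory mode over; `copy_keeps_mode` and the paths are hypothetical, and preserving the owner additionally requires running the copy as root.

```shell
# Show that cp -a keeps the mode of a directory, using scratch paths only.
copy_keeps_mode() {
    work="$1"
    mkdir -p "$work/old/main"
    chmod 700 "$work/old/main"            # a private directory, as a data directory would be
    cp -a "$work/old/main" "$work/new"    # archive mode: keeps mode and timestamps
    # Compare the permission strings of source and copy.
    src_mode="$(ls -ld "$work/old/main" | cut -c1-10)"
    dst_mode="$(ls -ld "$work/new" | cut -c1-10)"
    [ "$src_mode" = "$dst_mode" ]
}

work="$(mktemp -d)"
if copy_keeps_mode "$work"; then echo "mode preserved"; else echo "mode changed"; fi
rm -rf "$work"
```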
Thanks for the accepted answer. Instead of the symlink you can also use a bind mount; that way it is independent of the file system. If you want to use a dedicated hard drive for the database, you can also mount it directly on the data directory.
I did the latter. Here are my steps if someone needs a reference. I ran this as a script on many AWS instances.
# stop postgres server
sudo service postgresql stop
# create new filesystem in empty hard drive
sudo mkfs.ext4 /dev/xvdb
# mount it
mkdir /tmp/pg
sudo mount /dev/xvdb /tmp/pg/
# copy the entire postgres home dir content
sudo cp -a /var/lib/postgresql/. /tmp/pg
# mount it to the correct directory
sudo umount /tmp/pg
sudo mount /dev/xvdb /var/lib/postgresql/
# see if it is mounted
mount | grep postgres
# add the mount point to fstab
echo "/dev/xvdb /var/lib/postgresql ext4 rw 0 0" | sudo tee -a /etc/fstab
# when database is in use, observe that the correct disk is being used
watch -d grep xvd /proc/diskstats
A clarification. It is the particular AMI that you used that sets ubuntu as the default user, this may not apply to other AMIs.
In essence, if you are trying to move data manually, you will probably need to do so as the root user and then make sure it's available to whatever user postgres is running as.
You also have the option of snapshotting the volume and creating a larger volume from the snapshot. Then you could replace the volume on your instance with the new volume (you will probably have to resize the partition and file system to take advantage of all the space).