Postgres with Docker: Postgres fails to load when persisting data

Postgres with Docker: Postgres fails to load when persisting data - postgresql

I'm new to Postgres.
I updated the Dockerfile I use and successfully installed Postgresql on it. (My image runs Ubuntu 16.04 and I'm using Postgres 9.6.)
Everything worked fine until I tried to move the database to a Volume with docker-compose (that was after making a copy of the container's folder with cp -R /var/lib/postgresql /somevolume/.)
The issue is that Postgres just keeps crashing, as witnessed by supervisord:
2017-07-26 18:55:38,346 INFO exited: postgresql (exit status 1; not expected)
2017-07-26 18:55:39,355 INFO spawned: 'postgresql' with pid 195
2017-07-26 18:55:40,430 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-07-26 18:55:40,763 INFO exited: postgresql (exit status 1; not expected)
2017-07-26 18:55:41,767 INFO spawned: 'postgresql' with pid 197
2017-07-26 18:55:42,841 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-07-26 18:55:43,179 INFO exited: postgresql (exit status 1; not expected)
(and so on…)
Logs
It's not clear to me what's happening as /var/log/postgresql remains empty.
chown?
I suspect it has to do with the user. If I compare the data folder inside the container and the copy I made of it to the volume, the only difference is that the original is owned by postgres while the copy is owned by root.
I tried running chown -R postgres:postgres on the copy. The operation was performed successfully, however postmaster.pid remains owned by root and I think that would be the issue.
Questions
How can I get more information about the cause of the crash?
How can I make it so that postmaster.id be owned by postgres ?
Should I consider running postgres with root instead?
Any hint welcome.
EDIT: links to the Dockerfile and the docker-compose.xml.

I'll answer my own question:
Logs & errors
What made matters more complicated was that I was not getting any specific error message.
To change that, I disabled the [program:postgresql] section in supervisord and, instead, started postgres manually from the command-line (thanks to Miguel Marques for setting me on the right track with his comment.)
Then I finally got some useful error messages:
2017-08-02 08:27:09.134 UTC [37] LOG: could not open temporary statistics file "/var/run/postgresql/9.6-main.pg_stat_tmp/global.tmp": No such file or directory
Fixing the configuration
I fixed the error above with this, eventually adding them to my Dockerfile:
mkdir -p /var/run/postgresql/9.6-main.pg_stat_tmp
chown postgres.postgres /var/run/postgresql/9.6-main.pg_stat_tmp -R
(Kudos to this guy for the fix.)
To make the data permanent, I also had to do this, for the volume to be accessible by postgres:
mkdir -p /var/lib/postgresql/9.6/main
chmod 700 /var/lib/postgresql/9.6/main
I also used initdb to initialize the data directory. BEWARE! This will erase any data found in that folder. Like so:
rm -R /var/lib/postgresql/9.6/main/*
ls /var/lib/postgresql/9.6/main/
/usr/lib/postgresql/9.6/bin/initdb -D /var/lib/postgresql/9.6/main
Testing
After the above, I could finally run postgres properly. I used this command to run it and test from the command-line:
su postgres
/usr/lib/postgresql/9.6/bin/postgres -D /var/lib/postgresql/9.6/main -c config_file=/etc/postgresql/9.6/main/postgresql.conf # as per the Docker docs
To test, I kept it running and then, from another prompt, checked everything ran fine with this:
su postgres
psql
CREATE TABLE cities ( name varchar(80), location point );
INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');
select * from cities; # repeat this command after restarting the container to check that the data does persist
…making sure to restart the container and test again to check the data did persist.
And then finally restored the [program:postgresql] section in supervisord, rebuilt the image and restarted the container, making sure everything ran fine (in particular supervisord: tail /var/log/supervisor/supervisord.log), which it did.
(The command I used inside of supervisord.conf is also /usr/lib/postgresql/9.6/bin/postgres -D /var/lib/postgresql/9.6/main -c config_file=/etc/postgresql/9.6/main/postgresql.conf, as per this Docker article and other postgres+supervisord examples. Other options would have been using pg_ctl or an init.d script, but it's not clear to me why/when one would use those.)
I spent a lot of time on this. Hopefully the detailed answer will help someone down the line.
P.S.: I did end up producing a minimal example of my issue. If that can help anyone, here they are: Dockerfile, supervisord.conf and docker-compose.yml.

I do not know if this would be another way to achieve the same result (I'm new on Docker and Postgres too), but have you try the oficial repository image for Postgres (https://hub.docker.com/_/postgres/)?
I'm getting the data out of the container setting the environment variable PGDATA to '/var/lib/postgresql/data/pgdata' and binding this to an external volume on the run command:
docker run --name bd_TEST --network=my_network --restart=always -e POSTGRES_USER="superuser" -e POSTGRES_PASSWORD="myawesomepass" -e PGDATA="/var/lib/postgresql/data/pgdata" -v /var/local/db_data:/var/lib/postgresql/data/pgdata -itd -p 5432:5432 postgres:9.6
When the volume is empty, all the files are created by the image startup script, and if they already exist, the database start to used it.

From past experience I can see what may be a problem. I can't say if this will help but it is worth a try.
I would have added this as a comment, but I can't because my rep isn't hight enough.
I've spied a couple problems with how you have structured your statements in your Dockerfile. You have installed various things multiple times and also updated sporadically through the code. In my own files i've noticed that this can lead to somewhat random behaviour of my services and installation because of the different layers.
This may not seem to solve your problem directly, but cleaning up your file as is outlined in the best practices has solved many Dockerfile problems for me in the past.
One of the first places upon finding such problems is to start here at the best practices for RUN. This has helped me solve tricky problems in the past and I hope it'll solve or at least make it easier.
Pay special attention to this part:
After building the image, all layers are in the Docker cache. Suppose you later modify apt-get install by adding extra package:
FROM ubuntu:14.04
RUN apt-get update
RUN apt-get install -y curl nginx
Docker sees the initial and modified instructions as identical and reuses the cache from previous
steps. As a result the apt-get update is NOT executed because the
build uses the cached version. Because the apt-get update is not run,
your build can potentially get an outdated version of the curl and
nginx packages.
After reading this I would start by consolidating all your dependencies.

In my case, having the same error, I debugged it until I found out:
the disk was full and I increased the diskspace to solve this.
(stupid error, easy fix - maybe reading this here helps someone not wasting time)
also linking this questiong for other options:
Supervisord "exit status 1 not expected" running php script
https://serverfault.com/questions/537773/supervisor-process-exits-with-exit-status-1-not-expected/1076115#1076115

Related

PostgreSQL server fails to start on ArchLinux: FATAL: could not create lockfile »/run/postgresql/.s.PGSQL.5432.lock«

I am quite new in Arch and a total beginner in PostgreSQL, so this may be a very basic question.
I installed postgresql 11.5-4 from extra and pgadmin 4 from AUR, both seem to be running well.
I created a test DB with the following command:
initdb -D /home/lg/test-db
I got the answer:
You can start the db-server using:
pg_ctl -D /home/lg/test-db -l logdatei start
I tried that and got:
pg_ctl -D /home/lg/test-db -l logdatei start
waiting for serer to start.... stopped
pg_ctl: could not start the server
check the log.
The log only says that the lockfile »/run/postgresql/.s.PGSQL.5432.lock« could not be created, because the folder could not be found. Under /run is no folder called "postgresql". I suppose postgresql can not create this folder, because it does not have the permission. Several posts online posts suggest to change the user/owner of the db to sudo, however. Postgresql prevents this, however. When I try any command as sudo, postgresql tells me that this command can't be run as root. There must be some very basic error in my thinking here, but I have not worked it out for 3 hours.

You'll have to remove /run/postgresql from unix_socket_directories in postgresql.conf before starting the server.

Probably You have /var/run symlinked to /run and run is on tmpfs. You should add something like d /run/postgresql 0755 postgres postgres - into /usr/lib/tmpfiles.d/postgresql.conf

List directory /var/lib/apt/lists/partial is missing. - Acquire (20: Not a directory)

When I executed sudo apt update I'm getting
Reading package lists... Done
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (20: Not a directory)
Also, I was getting a status error which I solved using
sudo cp /var/lib/dpkg/status-old /var/lib/dpkg/status
I tried sudo mkdir /var/lib/apt/lists/partial as suggested in few other threads
mkdir: cannot create directory ‘/var/lib/apt/lists/partial’: Not a directory
Even I tried sudo mkdir /var/lib/apt/lists/
Any other solution?

The answer may be inappropriate here. But as I came here others may land here too.
If you're using docker and you face the same issue you can do like the following.
USER root
# RUN commands
USER 1001
Reference: Link

You can try adding -u 0 in the command
sudo docker exec -u 0 -it ContainerID bin/bash
According to Docker, the u flag defines what username or UID in the system for the container to run as, setting -u 0 means you run the container as root, use it with caution! Reference here

The same happened to me. I follow as guide this answer:The package lists or status file could not be parsed or opened
I assumed my lists were corrupted. I went to /var/lib/apt/ I saw a file (lists#) instead of a directory. I deleted it (sudo rm lists) and re-created the path (sudo mkdir -p /var/lib/apt/lists/partial). Double-check the path gets created.

I ran into the same issue while trying to build a new container and experimenting with Dockerfile for a while.
What saved me finally was just to delete all containers I've created during this process using docker rm.

I had this same issue when trying to install an Typora on Ubuntu 20.04.
I was running into the error whenever I run the command below:
# add Typora's repository
sudo add-apt-repository 'deb https://typora.io/linux ./'
Here's how I solved it:
I disconnected and reconnected my network connection, and when I ran the command again, it worked fine.
I think it was an issue with my network connectivity.
That's all.
I hope this helps

I had a similar error when using bitnami spark image and docker exec command with arguments -u didn't work for me. I found my answer in the image documentation here.
In case you are using a docker image, it might be that the image is a non root container image. Read the documents of the docker image provider to find the solution to see how you can use the image as a root container image.

this is how it works access as root in docker bash and install your apps
get id container by name
sudo docker ps -aqf "name=name=es01"
access bash as root
sudo docker exec -u 0 -it 3d42134dfd59 bash
Example install:
apt get update
apt-get install nano

You first need to have super user privilege by typing in sudo -i and then inserting your password.

FATAL ERROR lock file "postmaster.pid" already exists

I have recently installed PostGIS on my Mac (El Capitan 10.11.4, Postgres is version 9.5.1) using Homebrew, and I am following these instructions - http://morphocode.com/how-to-install-postgis-on-mac-os-x/
When I try to start Postgres using
pg_ctl -D /usr/local/var/postgres start
I get the following error:
$ FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 280) running in data directory "/usr/local/var/postgres"?
So I spent a few hours researching how to address this, but to no avail.
Notably, I tried to kill the PID as recommended in an answer on Superuser - https://superuser.com/questions/553045/fatal-lock-file-postmaster-pid-already-exists- (in the case above, I ran kill 208), but as soon as I tried to start Postgres again, I got the same error, albeit with a different PID number. I saw a few people recommended deleting the postmaster.pid file, but I feel like maybe I should save that as a last resort...
Admittedly part of the reason I'm not sure how to fix this is that I'm not really clear on what the postmaster even is - I'm just starting to learn about all of this.
Hopping into a Postgres database via the psql db_name command works just fine, for what it's worth.

Posting this in case it helps someone else:
I was having this same problem as the OP after a hard reboot when my laptop crashed. What helped me was running the following command to see what PID was associated with postmaster.pid:
cat /usr/local/var/postgres/postmaster.pid
The first number that appears will be the PID. Looking in Activity Monitor, I was able to see that Postgres was running, but without a PID number that matched the one shown.
Instead of the steps outlined in the answer referenced on Superuser, I restarted my laptop properly and then opened up Terminal and ran
brew services restart postgresql
This worked without having to remove postmaster.pid, which I saw a few other posts recommend. Sometimes it's the simple solutions that work.

I add here what worked for me, after a long time of searching:
Delete the postmaster.pid file:
rm /usr/local/var/postgres/postmaster.pid
Restart your postgres:
brew services restart postgresql
Hope this helps someone ...
Update 8/2022:
As Mike commented, for M1 Mac you would replace stage 1 with:
rm /opt/homebrew/var/postgresql/postmaster.pid
With M1 and specify Postgres Version # 14
rm -rf /opt/homebrew/var/postgresql#14/postmaster.pid

It often happens to me in OSx, when my system shutdown unexpectedly.
You can just remove the file postmaster.pid.
cd Library/Application Support/Postgres/var-{postgres-version}
and remove the postmaster.pid file
in case you use brew then your path should be something like:
/usr/local/var/postgres/postmaster.pid
restart the Postgres by using this command
pg_ctl -D /usr/local/var/postgres restart

Since you can connect to the database, you don't need to start the server again - it's already running.
pg_ctl is used to control the PostgreSQL server. Since your server is already started, your command:
pg_ctl -D /usr/local/var/postgres start
Returns an error, saying that there is a lock on postmaster.pid - which is true since there is already a server running under that PID.
There are two ways:
The most basic way - skip that step, your server is already running!
Executing a needless operation - stopping the server, and then starting it again.
You could stop your server doing :
pg_ctl -D /usr/local/var/postgres stop
So that you won't have the lock on postmaster anymore and you could use your command to start it again.

Postmaster is the main PostgreSQL process. You're trying to start PostgreSQL that's already running (and you're saying yourself you can connect to it). Just skip that step of your process.

When the system shutdown unexpectedly, my postgres crashs and i'm unable to connect to it.
What worked for me was:
1˚ Check postgres log:
tail -n 10000 /usr/local/var/log/postgres.log
2˚ Find the PID of postgress, should look like this:
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 707) running in data directory "/usr/local/var/postgres"?
3˚ Kill that process:
kill 707
4˚ Restart your postgres
brew services restart postgresql
After those steps i was able to connect to the database within my rails application.

If you got no important data to lose :
sudo killAll postgres
brew services restart postgresql
AGAIN : You could get data corrupted by doing this !
do it at your own risk !

I am using mac and these step work for me:-
step1: cd Library/Application\ Support/Postgres
(most commonly your Postgres installation will be located here)
step2: cd var-13
(if you are using version 12 then use cd var-12. Hope got the point)
step3: ls
(As you can see among the files you find the postmaster.pid, perfect.)
step4: rm postmaster.pid
When you have removed the stale postmaster.pid file you can restart PostgreSQL and everything should work as normal.

My OSX laptop had shutdown unexpectedly, and I was getting a stale postmaster.pid error in the PostgresApp. Shutting down my laptop and turning it back on again solved the problem.

After running the following commands
rm /usr/local/var/postgres/postmaster.pid
brew services restart postgresql
The error lock file "postmaster.pid" already exists comes up again.
When we run launchctl list | grep postgres
28618 0 homebrew.mxcl.postgresql
The existing file "postmaster.pid" was created by this daemon process hosted by launchctl.
We try to stop the homebrew.mxcl.postgresql through
sudo launchctl stop homebrew.mxcl.postgresql
launchctl disable homebrew.mxcl.postgresql
Unfortunately, none of them could stop the homebrew.mxcl.postgresql.
The reason is Disable and enable an agent using (persists between boots)
https://apple.stackexchange.com/questions/105892/disable-services-in-osx-services-msc
launchctl enable <name> or launchctl disable <name>
Two ways to solve it when the error lock file "postmaster.pid" already exists comes up again
In order to stop an agent immediately through
launchctl kill homebrew.mxcl.postgresql
Restart your desktop and run brew services start postgresql#14. Now, PostgreSQL could start successfully.
Hope it could help someone who met the same issue again.

This worked for me. First locate postmaster.pid (for me it was in the var directory as seen below, although it will be different on depending on your operating system). Then get rid of postmaster.pid, then kill the postgres process, then start/restart postgres service.
cd /var/lib/pgsql/data/
rm postmaster.pid
sudo pkill -u postgres
sudo systemctl start postgresql.service

If you have installed postgres with brew then simply run the following command and it will manage everything
brew services restart postgresql

Moving postgresql 9.4 data directory in fedora 22

I would like to know how to move the PostgreSQL database directory in a Fedora 22 installation. In Ubuntu, this is very straightforward by using pg_dropcluster and sudo pg_createcluster -d /the/new/location/ 9.4 main, but Fedora does not appear to have anything of the sort.
I copied the directory /var/lib/pgsql/data to a new location and, did the following:
su - postgres
service postgresql stop
export PGDATA="/the/new/location/"
echo $PGDATA # gives the correct /the/new/location/
service postgresql start
psql
show data_directory;
Which still results in the default /var/lib/pgsql/data...
Could someone please either provide a link to a relevant and up-to-date tutorial or explain how to complete the move?
Note: I am aware that this question has been answered for other distributions and older versions of Fedora and Psql, but it seems a lot of the files have been moved about and none of the approaches seem to work for me.

So it turns out that this method is out of date. The correct way to do it is to add this file with nano /etc/systemd/system/postgresql.service:
.include /lib/systemd/system/postgresql.service
[Service]
Environment=PGDATA=/the/new/location/
I believe this also requires a reboot.
Afterwards, one still has to set up SELinux correctly before PostgreSQL can start up.

Eliminating non-working PostgreSQL installations on Ubuntu 10.04 and starting afresh

I find I have the wreckage of two old PostgreSQL installations on Ubuntu 10.04:
$ pg_lsclustersVersion Cluster Port Status Owner Data directory Log file
Use of uninitialized value in printf at /usr/bin/pg_lsclusters line 38.
8.4 main 5432 down /var/lib/postgresql/8.4/main /var/log/postgresql/postgresql-8.4-main.log
Use of uninitialized value in printf at /usr/bin/pg_lsclusters line 38.
9.1 main 5433 down /var/lib/postgresql/9.1/main /var/log/postgresql/postgresql-9.1-main.log
$
Attempts to perform basic functions return errors, for instance:
createuser: could not connect to database postgres: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
More information comes when I try to start the database server:
$ sudo /etc/init.d/postgresql start
* Starting PostgreSQL 9.1 database server
* Error: The cluster is owned by user id 109 which does not exist any more
...fail!
$
My question: how do I completely remove both clusters and set up a new one? I've tried removing, purging, and reinstalling postgresql, following the advice here: https://stackoverflow.com/a/2748644/621762. Now pg_lsclusters shows no clusters in existence, but the No such file or directory error persists when I try to createuser, createdb or run psql. What have I failed to do?

First, that answer you linked to was pretty unsafe - hand-editing /etc/passwd ?!? dselect where an apt wildcard would do? Crazy stuff. I'm not surprised you're having issues.
As for the no such file or directory messages: You need to make sure you have a running PostgreSQL server ("cluster") before you can use admin commands like createdb, because they make a connection to the server. The No such file or directory message is telling you that the server doesn't exist or isn't running.
Here's what's happening:
Ubuntu uses pg_wrapper to manage multiple concurrent PostgreSQL instances. The issues you're having are really with pg_wrapper.
Ideally you would've just used pg_dropcluster to get rid of the unwanted clusters. Unfortunately, by following bad advice it sounds like you've got your system into a bit of a messed-up state where the PostgreSQL packages are half-installed and kind of mangled. You need to either repair the install, or totally clean it out.
I'd clean it out. I'd recommend:
Verify that pg_lsclusters lists no database clusters
apt-get --purge remove postgresql\* - this is important
Remove /etc/postgresql/
Remove /etc/postgresql-common
Remove /var/lib/postgresql
userdel -r postgres
groupdel postgres
apt-get install postgresql-common postgresql-9.1 postgresql-contrib-9.1 postgresql-doc-9.1
It's possible that the apt-get --purge step will fail because you've removed the user IDs, etc. Re-creating the postgres user ID with useradd -r -u 109 postgres should allow you to re-run the purge successfully then delete the user afterwards.

This answer is not directly about removing a postgres instance, rather, about resoliving the issue,
Error: The cluster is owned by user ...
I got this error while trying to spin up a docker container pointed to a postgres data directory that was produced via a different container (on a different host machine).
The error is directly related to directory ownership. In my case, the system was unable to find the user that certain postgres directories was owned by in the current environment. By re-owning those directories to the right user resolves the issue. Following is an example mapping (that worked for me):
chown -R postgres:postgres /var/lib/postgresql
chown -R postgres:postgres /etc/postgresql
chown -R postgres:postgres /var/log/postgresql
chown -R postgres:postgres /var/run/postgresql

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse