Postgresql postmaster takes very long time to start - postgresql

My situation in Postgres 9.1 is this: I had some running queries that were using a lot of memory, so I restarted Postgres immediately. After that I used postmaster to start it again.
/usr/pgsql-9.1/bin/postmaster -p 5432 -D /var/lib/pgsql/9.1/data
However, it got stuck for a very long time without starting successfully.
According to the Postgres documentation, restarting the server immediately may prevent the postmaster from freeing system resources (e.g., shared memory and semaphores).
So what should I do now? Or should I just wait for the postmaster to start?

Much depends on how you stopped the server.
If you kill -9 the postmaster process, all the spawned processes (one for each session) that are using the resources keep on running, so your system is still overloaded. Furthermore, the postmaster process writes its PID to a file, very aptly called postmaster.pid, in the data directory of the cluster (/var/lib/pgsql/9.1/data in your case). When a file with that name is found, a new postmaster process cannot start, so you should delete that file. All in all, NEVER kill the postmaster.
If you used pg_ctl or the rc.d script to restart the server then you have some other issue on your server. Reboot the server if you can.
In either case, you should not start the server by running the postmaster manually. Use the rc.d script or pg_ctl instead.
Lastly, 9.1 is getting old. Consider upgrading to 9.5 which has lots more features.
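Before deleting postmaster.pid it is worth checking that the PID recorded in it is really gone. The following is a minimal sketch, not an official tool: pidfile_is_stale is a hypothetical helper name, and it relies on the first line of postmaster.pid holding the postmaster's PID. It cannot tell whether a live PID has been reused by an unrelated process, so treat a "still alive" result as a prompt to look at the process list.

```shell
#!/bin/sh
# Hypothetical helper: report whether DATADIR/postmaster.pid is stale, i.e.
# whether the PID on its first line no longer belongs to a live process.
# Exit status: 0 = stale, 1 = a process with that PID is alive,
#              2 = no usable postmaster.pid found.
pidfile_is_stale() {
    pidfile="$1/postmaster.pid"
    [ -f "$pidfile" ] || return 2
    # The first line of postmaster.pid holds the postmaster's PID.
    pid=$(head -n 1 "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) return 2 ;;   # empty or not a number
    esac
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    # It also fails for processes owned by another user, so fall back to ps.
    if kill -0 "$pid" 2>/dev/null || ps -p "$pid" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Example (data directory taken from the question above):
# pidfile_is_stale /var/lib/pgsql/9.1/data && echo "postmaster.pid is stale"
```

Only when the helper reports a stale file, and no postgres backends are left running, is removing postmaster.pid reasonable.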

Related

Postgres data still "in use" after server stop

I am running postgresql in a docker container. Now I wanted to add checksums to the database cluster. So I stopped the docker container and waited some time. But the pg_checksums tool is still complaining:
pg_checksums: error: cluster must be shut down
There is no postgres or similar running any longer, with docker or not.
Renaming the file postmaster.pid did not change anything.
What do I need to do to convince pg_checksums that it can safely work on the cluster data?
I'm using postgresql 12 and Docker version 19.03.13, build 4484c46d9d on a CentOS 8 machine.
You need to shut down the database cleanly. Shutting down the container itself apparently did not do that. A tool which does not attach to PostgreSQL's shared memory has no way to know whether a database has crashed, or is still running. So you need a clean shutdown.
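One way to see what state the cluster was left in is pg_controldata, which prints a "Database cluster state" line. The sketch below only parses that line; cluster_state and is_cleanly_shut_down are hypothetical helper names, and PGDATA in the example is assumed to point at the stopped cluster's data directory. It accepts only the plain "shut down" state and does not handle a standby's "shut down in recovery".

```shell
#!/bin/sh
# Extract the "Database cluster state" value from pg_controldata output read
# on stdin. Run pg_controldata with LC_ALL=C, since its output is localized.
cluster_state() {
    sed -n 's/^Database cluster state:[[:space:]]*//p'
}

# Succeed only if the reported state is exactly "shut down".
is_cleanly_shut_down() {
    [ "$(cluster_state)" = "shut down" ]
}

# Example (assumes PGDATA is set and the PostgreSQL 12 binaries are on PATH):
# LC_ALL=C pg_controldata "$PGDATA" | is_cleanly_shut_down \
#     && pg_checksums --enable -D "$PGDATA"
```

If the state is anything else (for example "in production"), start the server once and stop it cleanly before running pg_checksums.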

FATAL ERROR lock file "postmaster.pid" already exists

I have recently installed PostGIS on my Mac (El Capitan 10.11.4, Postgres is version 9.5.1) using Homebrew, and I am following these instructions - http://morphocode.com/how-to-install-postgis-on-mac-os-x/
When I try to start Postgres using
pg_ctl -D /usr/local/var/postgres start
I get the following error:
$ FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 280) running in data directory "/usr/local/var/postgres"?
So I spent a few hours researching how to address this, but to no avail.
Notably, I tried to kill the PID as recommended in an answer on Superuser - https://superuser.com/questions/553045/fatal-lock-file-postmaster-pid-already-exists- (in the case above, I ran kill 208), but as soon as I tried to start Postgres again, I got the same error, albeit with a different PID number. I saw a few people recommended deleting the postmaster.pid file, but I feel like maybe I should save that as a last resort...
Admittedly part of the reason I'm not sure how to fix this is that I'm not really clear on what the postmaster even is - I'm just starting to learn about all of this.
Hopping into a Postgres database via the psql db_name command works just fine, for what it's worth.
Posting this in case it helps someone else:
I was having this same problem as the OP after a hard reboot when my laptop crashed. What helped me was running the following command to see what PID was associated with postmaster.pid:
cat /usr/local/var/postgres/postmaster.pid
The first number that appears will be the PID. Looking in Activity Monitor, I was able to see that Postgres was running, but without a PID number that matched the one shown.
Instead of the steps outlined in the answer referenced on Superuser, I restarted my laptop properly and then opened up Terminal and ran
brew services restart postgresql
This worked without having to remove postmaster.pid, which I saw a few other posts recommend. Sometimes it's the simple solutions that work.
I add here what worked for me, after a long time of searching:
Delete the postmaster.pid file:
rm /usr/local/var/postgres/postmaster.pid
Restart your postgres:
brew services restart postgresql
Hope this helps someone ...
Update 8/2022:
As Mike commented, on an M1 Mac you would replace step 1 with:
rm /opt/homebrew/var/postgresql/postmaster.pid
On an M1 Mac with a versioned install such as Postgres 14:
rm -rf /opt/homebrew/var/postgresql@14/postmaster.pid
This often happens to me on OS X when my system shuts down unexpectedly.
You can just remove the postmaster.pid file.
cd Library/Application\ Support/Postgres/var-{postgres-version}
and remove the postmaster.pid file.
If you use brew, the path should be something like:
/usr/local/var/postgres/postmaster.pid
Then restart Postgres with this command:
pg_ctl -D /usr/local/var/postgres restart
Since you can connect to the database, you don't need to start the server again - it's already running.
pg_ctl is used to control the PostgreSQL server. Since your server is already started, your command:
pg_ctl -D /usr/local/var/postgres start
Returns an error, saying that there is a lock on postmaster.pid - which is true since there is already a server running under that PID.
There are two ways:
The most basic way - skip that step, your server is already running!
Executing a needless operation - stopping the server, and then starting it again.
You could stop your server doing :
pg_ctl -D /usr/local/var/postgres stop
So that you won't have the lock on postmaster anymore and you could use your command to start it again.
Postmaster is the main PostgreSQL process. You're trying to start PostgreSQL that's already running (and you're saying yourself you can connect to it). Just skip that step of your process.
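The "skip it if it is already running" advice can be sketched with pg_ctl status, which exits with 0 when a server is running in the data directory and 3 when it is not. This is only an illustration: start_if_stopped is a hypothetical helper, and PG_CTL is a made-up override variable for the pg_ctl path, not something PostgreSQL defines.

```shell
#!/bin/sh
# Start the server only if pg_ctl reports that none is running in DATADIR.
# PG_CTL is a hypothetical override for the pg_ctl command; it defaults to
# whatever "pg_ctl" resolves to on PATH.
start_if_stopped() {
    datadir="$1"
    rc=0
    "${PG_CTL:-pg_ctl}" -D "$datadir" status >/dev/null 2>&1 || rc=$?
    case "$rc" in
        0) echo "already running; nothing to do" ;;
        3) "${PG_CTL:-pg_ctl}" -D "$datadir" start ;;
        *) echo "pg_ctl status failed with exit code $rc" >&2
           return "$rc" ;;
    esac
}

# Example (data directory from the question above):
# start_if_stopped /usr/local/var/postgres
```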
When the system shuts down unexpectedly, my Postgres crashes and I'm unable to connect to it.
What worked for me was:
1˚ Check postgres log:
tail -n 10000 /usr/local/var/log/postgres.log
2˚ Find the PID of postgres, should look like this:
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 707) running in data directory "/usr/local/var/postgres"?
3˚ Kill that process:
kill 707
4˚ Restart your postgres
brew services restart postgresql
After those steps I was able to connect to the database from my Rails application.
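Steps 1 and 2 above can be combined into a small filter that pulls the PID out of the HINT line. This is a sketch under the assumption that the log uses the exact English wording shown above; pid_from_hint is a hypothetical helper name.

```shell
#!/bin/sh
# Print the PID named in a 'HINT: Is another postmaster (PID n) running ...'
# line read on stdin. If several lines match, the last one wins.
pid_from_hint() {
    sed -n 's/.*Is another postmaster (PID \([0-9][0-9]*\)) running.*/\1/p' | tail -n 1
}

# Example (log path from the answer above):
# pid=$(pid_from_hint < /usr/local/var/log/postgres.log)
# ps -p "$pid" -o pid= -o comm=   # confirm it is a postgres process first
# kill "$pid"
```

Checking the process name before kill matters because the PID in an old log line may since have been reused by something else.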
If you have no important data to lose:
sudo killall postgres
brew services restart postgresql
AGAIN: you could corrupt your data by doing this!
Do it at your own risk!
I am using a Mac and these steps worked for me:
step1: cd Library/Application\ Support/Postgres
(most commonly your Postgres installation will be located here)
step2: cd var-13
(if you are using version 12, use cd var-12 instead, and so on)
step3: ls
(As you can see among the files you find the postmaster.pid, perfect.)
step4: rm postmaster.pid
When you have removed the stale postmaster.pid file you can restart PostgreSQL and everything should work as normal.
My OS X laptop had shut down unexpectedly, and I was getting a stale postmaster.pid error in the PostgresApp. Shutting down my laptop and turning it back on again solved the problem.
After running the following commands
rm /usr/local/var/postgres/postmaster.pid
brew services restart postgresql
The error lock file "postmaster.pid" already exists comes up again.
When we run launchctl list | grep postgres
28618 0 homebrew.mxcl.postgresql
The existing file "postmaster.pid" was created by this daemon process hosted by launchctl.
We try to stop the homebrew.mxcl.postgresql through
sudo launchctl stop homebrew.mxcl.postgresql
launchctl disable homebrew.mxcl.postgresql
Unfortunately, none of them could stop the homebrew.mxcl.postgresql.
The reason is that launchctl enable <name> and launchctl disable <name> only enable or disable an agent, a setting that persists between boots; they do not stop an agent that is already running. See
https://apple.stackexchange.com/questions/105892/disable-services-in-osx-services-msc
There are two ways to solve it when the error lock file "postmaster.pid" already exists comes up again:
Stop the agent immediately with
launchctl kill homebrew.mxcl.postgresql
Restart your desktop and run brew services start postgresql@14. PostgreSQL should then start successfully.
I hope this helps someone who runs into the same issue.
This worked for me. First locate postmaster.pid (for me it was in the var directory as seen below, although it will differ depending on your operating system). Then get rid of postmaster.pid, then kill the postgres process, then start/restart the postgres service.
cd /var/lib/pgsql/data/
rm postmaster.pid
sudo pkill -u postgres
sudo systemctl start postgresql.service
If you have installed postgres with brew then simply run the following command and it will manage everything
brew services restart postgresql

Clearing lock and restarting Mongo service

I'm trying to connect to my db remotely, and having some trouble. A popular answer seems to be running
sudo rm /var/lib/mongodb/mongod.lock
sudo service mongodb restart
What are the consequences of running each of these commands? I am especially wondering:
Am I introducing potential problems by deleting the lock? Surely the lock must be there for a useful reason?
Will my data stay the same after the restart?
If mongodb is not running yet the lock file still exists, then your mongod service did not shut down gracefully (crashed). In this case, you can safely delete the lock file and your data will remain.

postgres sometimes will not start after reboot

Something weird is happening with my postgres installation after I upgraded to version 9.3.2 homebrew.
Sometimes and not every time, if I enter psql I get this error message:
could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
It also appears that postgres is running because if I try and stop postgres with this command:
pg_ctl -D /usr/local/var/postgres stop -s -m fast
I get this response:
pg_ctl: server does not shut down
If I look in the server.log I can see the following entries:
FATAL: lock file "postmaster.pid" already exists HINT: Is another
postmaster (PID 208) running in data directory
"/usr/local/var/postgres"?
After some frantic googling, I am able to cure this by entering these commands:
launchctl unload -w ~/Library/LaunchAgents/homebrew.mxcl.postgresql.plist
rm ~/Library/LaunchAgents/homebrew.mxcl.postgresql.plist
I would like to first of all understand what is happening and second of all I would like to fix it once and for all.
Can anyone explain what might be happening and a cure?
First, try to pull up a log on PostgreSQL, on OSX (and with PG 9.3) this will most likely be here:
/Library/PostgreSQL/9.3/data/pg_log
Check this log and see if there are any entries in it that may explain more of what is happening. Post it here or in a pastebin somewhere. This can help the community to debug the issue.
Next, the reason pg_ctl says the server does not shut down is that you have a stale socket file and/or a stale .plist file that was not removed at the last shutdown. This may indicate that the PostgreSQL daemon crashed or was shut down by force and did not finish cleanly.
The commands you used removed the stale files and thus made way for a new socket file to be created.
Also it is important to know if this was a minor or a major upgrade of PostgreSQL. Major upgrades always require a migration of the data directory. Running an old PostgreSQL data directory with a new engine might have unexpected results.

How do I fix Postgres so it will start after an abrupt shutdown?

Due to a sudden power outage, the Postgres server running on my local machine shut down abruptly. After rebooting, I tried to restart Postgres and I get this error:
$ pg_ctl -D /usr/local/pgsql/data restart
pg_ctl: PID file "/usr/local/pgsql/data/postmaster.pid" does not exist
Is server running?
starting server anyway
server starting
$:/usr/local/pgsql/data$ LOG: database system shutdown was interrupted at 2009-02-28 21:06:16
LOG: checkpoint record is at 2/8FD6F8D0
LOG: redo record is at 2/8FD6F8D0; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 0/1888104; next OID: 1711752
LOG: next MultiXactId: 2; next MultiXactOffset: 3
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 2/8FD6F918
LOG: record with zero length at 2/8FFD94A8
LOG: redo done at 2/8FFD9480
LOG: could not fsync segment 0 of relation 1663/1707047/1707304: No such file or directory
FATAL: storage sync failed on magnetic disk: No such file or directory
LOG: startup process (PID 5465) exited with exit code 1
LOG: aborting startup due to startup process failure
There is no postmaster.pid file in the data directory. What possibly could be the reason for this sort of behavior and of course what is the way out?
You'd need to pg_resetxlog. Your database can be in an inconsistent state after this though, so dump it with pg_dumpall, recreate and import back.
A cause for this could be:
You have not turned off the hardware write cache on the disk, which often prevents the OS from making sure data is written before it reports a successful write to the application. Check with
hdparm -I /dev/sda
If it shows "*" before "Write cache" then this could be the case. The PostgreSQL source has a program, src/tools/fsync/test_fsync.c, which tests the speed of syncing data to disk. Run it: if it reports all times shorter than, say, 3 seconds, then your disk is lying to the OS. On a 7500 rpm disk, a test of 1000 writes to the same place would need at least 8 seconds to complete (1000/(7500rpm/60s)), as it can only write once per rotation. You'd need to edit test_fsync.c if your database is on a different disk than the /var/tmp partition: change
#define FSYNC_FILENAME "/var/tmp/test_fsync.out"
to
#define FSYNC_FILENAME "/usr/local/pgsql/data/test_fsync.out"
Your disk is failing and has a bad block; check with badblocks.
You have bad RAM; check with memtest86+ for at least 8 hours.
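The 8-second figure in the write-cache item above follows from allowing at most one synced write to the same place per platter rotation. A small sketch of that arithmetic; min_sync_seconds is a hypothetical helper, and the one-write-per-rotation model is the answer's own simplification.

```shell
#!/bin/sh
# Lower bound, in seconds, for WRITES synced writes to the same place on a
# disk spinning at RPM, assuming at most one such write per rotation:
#   seconds = writes / (rpm / 60)
min_sync_seconds() {
    LC_ALL=C awk -v writes="$1" -v rpm="$2" \
        'BEGIN { printf "%.1f\n", writes / (rpm / 60) }'
}

min_sync_seconds 1000 7500   # prints 8.0
```

A test_fsync result far below this bound suggests the drive is acknowledging writes from its cache rather than from the platter.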
Reading a few similar messages in the archives of the PostgreSQL
mailing list ("storage sync failed on magnetic disk: No such file or
directory") seems to indicate that there is a very serious hardware
trouble, much worse than a simple power failure. You may have to prepare yourself to restore from backups.
I had DB corruption too. My actions:
docker run -it --rm -v /path/to/db:/var/lib/postgresql/data postgres:10.3 bash
su - postgres
/usr/lib/postgresql/10/bin/pg_resetwal -D /var/lib/postgresql/data -f
I had this same problem and was about to dump, reinstall and import from a DB dump (a really painful process). However, I tried this as a last resort and it worked!
brew services start postgresql
Then I restarted and that was it.
Run start instead of restart.
Execute the below command:
$ pg_ctl -D /usr/local/pgsql/data start
I had this problem a couple of times when my laptop turned off unexpectedly on very low battery while PSQL was running in the background.
My solution, after searching all over, was a hard delete and reinstall, then importing the data from a DB dump.
Steps for Mac with brew to uninstall and reinstall psql 9.6
brew uninstall postgresql@9.6
rm -rf /usr/local/var/postgresql@9.6
rm -rf .psql.local .psql_history .psqlrc.local l.psqlrc .pgpass
brew install postgresql@9.6
echo 'export PATH="/usr/local/opt/postgresql@9.6/bin:$PATH"' >> ~/.bash_profile
source ~/.bash_profile
brew services start postgresql@9.6
createuser -s postgres
createuser {ENTER_YOUR_USER_HERE} --interactive
As others stated, a stop + start instead of a restart worked for me. In a Docker environment this would be:
docker stop <container_name>
docker start <container_name>
or when using Docker Compose:
docker-compose stop
docker-compose start
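After a stop and start, the server may need a moment for crash recovery before it accepts connections. A minimal sketch of waiting for that: retry is a hypothetical helper, and the example assumes the image ships pg_isready, which is an assumption about your container rather than something stated above.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds, at most TRIES times, sleeping
# DELAY seconds between attempts. Returns non-zero if it never succeeded.
retry() {
    tries="$1"
    delay="$2"
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example; <container_name> is the placeholder used above:
# docker stop <container_name>
# docker start <container_name>
# retry 30 1 docker exec <container_name> pg_isready
```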