Can't run pg_resetxlog with corrupted Postgres - postgresql

I have a potentially corrupted Postgres database running in a Docker container. The container mounts a data volume at /var/lib/postgresql/data.
I exec'd into the container and ran gosu postgres pg_resetxlog /var/lib/postgresql/data. I received the following error:
pg_resetxlog: lock file "postmaster.pid" exists
Is a server running? If not, delete the lock file and try again.
I tried 3 things:
1) Matched the PID listed in postmaster.pid against my Postgres process and killed it manually with kill <PID>. This did not shut Postgres down.
2) Deleted the postmaster.pid file under /var/lib/postgresql/data. This forced the container to restart, but the same issue persists.
3) Ran docker restart <POSTGRES> to restart Postgres.
None of the above helped. What I want is a way for this container to recover without destroying it completely and starting anew. I'm using the postgres:9.5 Docker image.
Any ideas?
EDIT: added container logs
Sep 11 18:23:34 VM postgres-container[1045]: LOG: incomplete startup packet
Sep 11 18:24:36 VM postgres-container[1045]: LOG: incomplete startup packet
Sep 11 18:24:46 VM postgres-container[1045]: LOG: received smart shutdown request
Sep 11 18:24:46 VM postgres-container[1045]: LOG: autovacuum launcher shutting down
Sep 11 18:24:58 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:25:01 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:25:39 VM postgres-container[1045]: LOG: incomplete startup packet
Sep 11 18:25:39 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:25:59 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:26:02 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:26:40 VM postgres-container[1045]: LOG: incomplete startup packet
Sep 11 18:26:40 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:27:00 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:27:03 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:27:41 VM postgres-container[1045]: LOG: incomplete startup packet
Sep 11 18:27:41 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:28:01 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:28:04 VM postgres-container[1045]: FATAL: the database system is shutting down
Sep 11 18:28:41 VM postgres-container[1045]: LOG: could not open file "postmaster.pid": No such file or directory
Sep 11 18:28:41 VM postgres-container[1045]: LOG: performing immediate shutdown because data directory lock file is invalid
Sep 11 18:28:41 VM postgres-container[1045]: LOG: received immediate shutdown request
Sep 11 18:28:41 VM postgres-container[1045]: WARNING: terminating connection because of crash of another server process
Sep 11 18:28:41 VM postgres-container[1045]: DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
Sep 11 18:28:41 VM postgres-container[1045]: HINT: In a moment you should be able to reconnect to the database and repeat your command.
Sep 11 18:28:41 VM postgres-container[1045]: LOG: incomplete startup packet
Sep 11 18:28:43 VM postgres-container[1045]: LOG: database system was interrupted; last known up at 2019-09-12 01:24:31 UTC
Sep 11 18:28:56 VM postgres-container[1045]: LOG: database system was not properly shut down; automatic recovery in progress
Sep 11 18:28:56 VM postgres-container[1045]: LOG: redo starts at 0/1C384970
Sep 11 18:28:56 VM postgres-container[1045]: LOG: invalid record length at 0/1C44DAC8
Sep 11 18:28:56 VM postgres-container[1045]: LOG: redo done at 0/1C44DAA0
Sep 11 18:28:56 VM postgres-container[1045]: LOG: last completed transaction was at log time 2019-09-12 01:24:42.2848+00
Sep 11 18:28:57 VM postgres-container[1045]: LOG: MultiXact member wraparound protections are now enabled
Sep 11 18:28:57 VM postgres-container[1045]: LOG: database system is ready to accept connections
Sep 11 18:28:57 VM postgres-container[1045]: LOG: autovacuum launcher started
Sep 11 18:29:42 VM postgres-container[1045]: LOG: incomplete startup packet
Sep 11 18:30:45 VM postgres-container[1045]: LOG: incomplete startup packet
Sep 11 18:31:47 VM postgres-container[1045]: LOG: incomplete startup packet

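Good news, thestateofmay! The server is up and running :) After the immediate shutdown at 18:28:41, your log shows automatic recovery completing and "database system is ready to accept connections" at 18:28:57.
You can connect to the container with
docker exec -ti postgres /bin/bash
You should be root; switch to the postgres user
su postgres
then start
psql
and check whether the data is present in the server.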
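For the lock-file error in the question, it can help to confirm whether the PID recorded in postmaster.pid still belongs to a live process before touching the file. Below is a minimal Python sketch, assuming the first line of postmaster.pid holds the postmaster's PID and that the script runs inside the same container (same PID namespace). The function names are illustrative and not part of any Postgres tool.

```python
import os
from pathlib import Path


def read_postmaster_pid(data_dir):
    """Return the PID on the first line of postmaster.pid, or None.

    None is returned when the file is missing, empty, or does not
    start with an integer.
    """
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        first_line = pid_file.read_text().splitlines()[0]
        return int(first_line.strip())
    except (FileNotFoundError, IndexError, ValueError):
        return None


def process_is_alive(pid):
    """Check for a process with this PID without disturbing it."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    # Path taken from the question; adjust for your own volume mount.
    pid = read_postmaster_pid("/var/lib/postgresql/data")
    if pid is None:
        print("no usable postmaster.pid found")
    else:
        print(pid, "alive" if process_is_alive(pid) else "stale")
```

If the PID is alive, the lock file is legitimate and deleting it only makes the running server shut itself down, as the log above shows. Since the official image runs Postgres as the container's main process, stopping the server also stops the container, so pg_resetxlog would likely have to be run from a separate one-off container that mounts the same volume.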

Related

Postgres-15.1 is restarting continuously on using shared_preload_libraries extension

Postgres is restarting continuously on using shared_preload_libraries extension.
https://postgresqlco.nf/doc/en/param/shared_preload_libraries/
I am running postgres-15.1 under a Python-based daemon on 32-bit CentOS 7. It works fine without the shared_preload_libraries setting, but after enabling it with the "ALTER SYSTEM SET shared_preload_libraries" command, Postgres restarts every few seconds.
Initially it was working fine with postgres-9.6.4.
Postgres logs:
waiting for server to start....2023-02-15 07:13:45.676 GMT [28605] LOG: skipping missing configuration file "/home/runtime/pgsql/data/postgresql.auto.conf"
2023-02-15 07:13:45.825 GMT [28605] LOG: starting PostgreSQL 15.1 on i686-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 32-bit
2023-02-15 07:13:45.825 GMT [28605] LOG: listening on IPv4 address "127.0.0.1", port 5432
2023-02-15 07:13:45.933 GMT [28605] LOG: listening on Unix socket "/home/runtime/pgsql/.s.PGSQL.5432"
2023-02-15 07:13:45.969 GMT [28608] LOG: database system was shut down at 2023-02-15 07:13:35 GMT
2023-02-15 07:13:45.989 GMT [28605] LOG: database system is ready to accept connections
done
server started
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
2023-02-15 07:13:51.480 GMT [28605] LOG: received fast shutdown request
waiting for server to shut down....2023-02-15 07:13:51.512 GMT [28605] LOG: aborting any active transactions
2023-02-15 07:13:51.513 GMT [28605] LOG: background worker "logical replication launcher" (PID 28611) exited with exit code 1
2023-02-15 07:13:51.513 GMT [28606] LOG: shutting down
2023-02-15 07:13:51.536 GMT [28606] LOG: checkpoint starting: shutdown immediate
2023-02-15 07:13:51.908 GMT [28606] LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.090 s, sync=0.028 s, total=0.395 s; sync files=2, longest=0.021 s, average=0.014 s; distance=0 kB, estimate=0 kB
2023-02-15 07:13:51.909 GMT [28605] LOG: database system is shut down
done
server stopped
I tried postgres-15.0 and postgres-14.4 and got the same behavior with both. I could not find any open issues about shared_preload_libraries in recent versions of Postgres.
PS: I built this Postgres from source with openssl-1.1.1i.
I am using the "citus" library with this:
ALTER SYSTEM SET shared_preload_libraries="citus";
I generated a new citus.so file from its source code using postgres-15.1: github.com/citusdata/citus

firebird3.0.service failed because the control process exited with error code. How to start firebird?

I'm working with a Firebird 3.0 database. Suddenly the database stopped working, and when I checked the server status with
$ /etc/init.d/firebird3.0 status
I saw that the server is stopped:
● firebird3.0.service - Firebird Database Server ( SuperServer )
Loaded: loaded (/lib/systemd/system/firebird3.0.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2019-05-16 19:01:13 IST; 29s ago
Process: 9628 ExecStart=/usr/sbin/fbguard -pidfile /run/firebird3.0/default.pid -daemon -forever (code=exited, status=252)
May 16 19:00:58 ADMIN-I-61 systemd[1]: Starting Firebird Database Server ( SuperServer )...
May 16 19:01:13 ADMIN-I-61 systemd[1]: firebird3.0.service: Control process exited, code=exited status=252
May 16 19:01:13 ADMIN-I-61 systemd[1]: Failed to start Firebird Database Server ( SuperServer ).
May 16 19:01:13 ADMIN-I-61 systemd[1]: firebird3.0.service: Unit entered failed state.
May 16 19:01:13 ADMIN-I-61 systemd[1]: firebird3.0.service: Failed with result 'exit-code'.
When I try the following commands to start the server
/etc/init.d/firebird3.0 start
/etc/init.d/firebird3.0 restart
I get
[....] Starting firebird3.0 (via systemctl): firebird3.0.serviceJob for firebird3.0.service failed because the control process exited with error code. See "systemctl status firebird3.0.service" and "journalctl -xe" for details.
failed!
Today's firebird.log looks like this:
ADMIN-I-61 Thu May 16 11:06:37 2019
/opt/firebird/bin/fbguard: guardian starting /opt/firebird/bin/firebird
ADMIN-I-61 Thu May 16 11:07:26 2019
INET/inet_error: bind errno = 98
ADMIN-I-61 Thu May 16 11:07:27 2019
startup:INET_connect:
Unable to complete network request to host "ADMIN-I-61".
Error while listening for an incoming connection.
Address already in use
ADMIN-I-61 Thu May 16 11:07:27 2019
/opt/firebird/bin/fbguard: /opt/firebird/bin/firebird terminated due to startup error (2)
ADMIN-I-61 Thu May 16 11:07:27 2019
/opt/firebird/bin/fbguard: /opt/firebird/bin/firebird terminated due to startup error (2)
ADMIN-I-61 Thu May 16 12:22:35 2019
/opt/firebird/bin/fbguard: guardian starting /opt/firebird/bin/firebird
I have checked the ports.
Please help!
When Firebird is installed from the deb package, the following line in /etc/firebird/3.0/firebird.conf is uncommented:
RemoteBindAddress = localhost
Comment out this line:
#RemoteBindAddress = localhost
The default is:
RemoteBindAddress =
After the change, you must restart the Firebird service.
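The "bind errno = 98" and "Address already in use" lines in the firebird.log above mean the listening address could not be bound. A quick way to check whether something already holds the port is to attempt the same bind yourself. This is a minimal Python sketch under assumptions: the helper name can_bind is hypothetical, and 3050 is Firebird's default port, which may differ if RemotePort or RemoteServiceName was changed in your firebird.conf.

```python
import errno
import socket


def can_bind(host, port):
    """Try to bind a TCP socket to (host, port).

    Returns (True, "free") on success, otherwise (False, reason).
    The socket is closed again before returning.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False, "address already in use"
            return False, exc.strerror or str(exc)
    return True, "free"


if __name__ == "__main__":
    # Assumed defaults: the host from the config line above, port 3050.
    print(can_bind("localhost", 3050))
```

Run it while the Firebird service is stopped. If it reports "address already in use", another process (possibly a leftover Firebird instance) is holding the port.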

Restart PostgreSQL without postgresql-server

I'm on CentOS 7 and I'm trying to get past the 'PG::ConnectionBad: FATAL: Peer authentication failed for user' error.
I've already figured out that I should change pg_hba.conf (peer to md5), and I've done that. It seems I have to restart Postgres, but that is not as easy as I thought.
I tried 'service postgresql restart', which resulted in 'Failed to restart postgresql.service: Unit not found.'
Then I tried to install postgresql-server and got:
oct 23 01:16:15 serverct1 pg_ctl[3280]: HINT: Is another postmaster already running on port 5432? If ...try.
oct 23 01:16:15 serverct1 pg_ctl[3280]: WARNING: could not create listen socket for "localhost"
oct 23 01:16:15 serverct1 pg_ctl[3280]: FATAL: could not create any TCP/IP sockets
oct 23 01:16:16 serverct1 pg_ctl[3280]: pg_ctl: could not start server
oct 23 01:16:16 serverct1 systemd[1]: postgresql.service: control process exited, code=exited status=1
oct 23 01:16:16 serverct1 systemd[1]: Failed to start PostgreSQL database server.
About 5432 port usage:
postgres 5432/tcp postgresql # POSTGRES
postgres 5432/udp postgresql # POSTGRES
So I'm curious:
1) Do postgresql and postgresql-server work separately?
2) Is it possible to restart postgresql without postgresql-server?
3) If not, how do I free port 5432 in order to run postgresql-server?
You can avoid the trouble with serverct1 by using the standard Postgres pg_ctl, e.g.:
pg_ctl reload
Or, if needed, pg_ctl reload -D $PGDATA
You don't need to restart Postgres for pg_hba.conf changes to apply: https://www.postgresql.org/docs/current/static/auth-pg-hba-conf.html
The pg_hba.conf file is read on start-up and when the main server
process receives a SIGHUP signal. If you edit the file on an active
system, you will need to signal the postmaster (using pg_ctl reload or
kill -HUP) to make it re-read the file.
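The pg_hba.conf edit mentioned in the question (peer to md5) can be sketched as a small text transformation. This is an illustration only: it assumes the standard layout of "local" records (TYPE, DATABASE, USER, METHOD as the first four whitespace-separated fields), and the function name switch_local_auth is hypothetical. Comment lines and "host" records are left untouched.

```python
import re


def switch_local_auth(hba_text, old="peer", new="md5"):
    """Replace the auth method on 'local' records of pg_hba.conf text.

    Only lines whose first field is exactly 'local' and whose fourth
    field equals `old` are changed; original spacing is preserved.
    """
    out = []
    for line in hba_text.splitlines():
        # The capturing group keeps the whitespace separators in the list.
        parts = re.split(r"(\s+)", line)
        fields = [i for i, p in enumerate(parts) if p and not p.isspace()]
        if (
            len(fields) >= 4
            and parts[fields[0]] == "local"
            and parts[fields[3]] == old
        ):
            parts[fields[3]] = new
        out.append("".join(parts))
    result = "\n".join(out)
    if hba_text.endswith("\n"):
        result += "\n"
    return result


if __name__ == "__main__":
    sample = "# TYPE  DATABASE  USER  METHOD\nlocal   all       all   peer\n"
    print(switch_local_auth(sample), end="")
```

After writing the modified text back to pg_hba.conf, the pg_ctl reload shown above is what makes the running server pick it up.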

Restore postgres db from folder

I have an old copy of my PostgreSQL data folder (/var/lib/postgresql/9.5/main/) from my server, and I want to get the data out of the files. I copied the main folder to my local machine and changed the PostgreSQL config (/etc/postgresql/9.5/main/postgresql.conf) to point to that directory. I also changed the permissions of the main directory so that it belongs to the user postgres. After restarting the postgresql service (sudo service postgresql restart), it doesn't work.
What am I doing wrong? (Yes, I know pg_dump is the preferred way, but in this case...)
So my question: does this even work?
Or is there another way to get the data out?
Everything is done on Ubuntu 16.04.
Edit:
The log file after changing postgresql.conf to point to the new directory:
2017-10-13 06:15:43 CEST [968-1] LOG: database system was shut down at 2017-10-13 00:21:04 CEST
2017-10-13 06:15:43 CEST [968-2] LOG: MultiXact member wraparound protections are now enabled
2017-10-13 06:15:43 CEST [959-1] LOG: database system is ready to accept connections
2017-10-13 06:15:43 CEST [975-1] LOG: autovacuum launcher started
2017-10-13 06:15:43 CEST [983-1] [unknown]#[unknown] LOG: incomplete startup packet
2017-10-13 06:47:55 CEST [975-2] LOG: autovacuum launcher shutting down
2017-10-13 06:47:55 CEST [959-2] LOG: received smart shutdown request
2017-10-13 06:47:55 CEST [972-1] LOG: shutting down
2017-10-13 06:47:55 CEST [972-2] LOG: database system is shut down
2017-10-13 06:47:55 CEST [4667-1] FATAL: database files are incompatible with server
2017-10-13 06:47:55 CEST [4667-2] DETAIL: The database cluster was initialized without USE_FLOAT8_BYVAL but the server was compiled with USE_FLOAT8_BYVAL.
2017-10-13 06:47:55 CEST [4667-3] HINT: It looks like you need to recompile or initdb.
OK, that pointed me to this: the server is an armv7l, whereas the local machine is x86_64 (uname -m). So is there no chance to get the data out of it?
thx, Luc
If your data directory really is from an armv7l system and your local system is x86_64, you're going to have some difficulties.
The immediate error about USE_FLOAT8_BYVAL arises because armv7l is 32-bit and cannot pass 64-bit (8-byte) floating point values by value. Your 64-bit host can. But if you recompiled a custom Postgres with USE_FLOAT8_BYVAL disabled, you'd likely just run into other issues.
I suggest installing PostgreSQL on a matching ARM system to recover the data. PostgreSQL data directories are not portable across architectures (for performance reasons).
If you no longer have access to the ARM system, an emulator like qemu should be able to help you.
Otherwise, maybe you can compile a modified PostgreSQL (probably starting with 32-bit x86) that can read the data directory, with appropriate configure options etc. I've never needed to try this.
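To see which layout settings differ between two clusters, one option is to compare the output of pg_controldata for each. Here is a rough Python sketch that parses that output from text. It assumes the usual "label: value" line format, and the field labels in LAYOUT_KEYS are the ones I would expect from a 9.5 pg_controldata, so check them against your own output. Both helper names are hypothetical.

```python
def parse_controldata(text):
    """Turn 'label: value' lines of pg_controldata output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons too.
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


# Assumed labels for settings that affect the on-disk layout.
LAYOUT_KEYS = (
    "Float8 argument passing",
    "Maximum data alignment",
    "Database block size",
)


def layout_mismatches(old, new, keys=LAYOUT_KEYS):
    """Return {label: (old_value, new_value)} for every differing key."""
    return {
        key: (old.get(key), new.get(key))
        for key in keys
        if old.get(key) != new.get(key)
    }
```

With the two outputs saved to files, layout_mismatches(parse_controldata(old_text), parse_controldata(new_text)) lists the settings a recovery host would need to match. An empty result from these three keys does not by itself guarantee that the directory is readable.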

Starting mongodb cluster

I'm starting 3 mongod processes on 3 different machines and trying to run the mongos process on another machine, which also runs the application server.
I'm getting this message:
~$ mongos --configdb mongo1:27017,mongo2:27017,mongo3:27017
Mon Sep 24 10:34:05 mongos db version v2.0.4, pdfile version 4.5 starting (--help for usage)
Mon Sep 24 10:34:05 git version: nogitversion
Mon Sep 24 10:34:05 build info: Linux yellow 2.6.24-29-server #1 SMP Tue Oct 11 15:57:27 UTC 2011 x86_64 BOOST_LIB_VERSION=1_46_1
Mon Sep 24 10:34:09 ERROR: config servers mongo1:27017 and mongo2:27017 differconfig servers mongo1:27017 and mongo2:27017 differconfig servers mongo1:27017 and mongo2:27017 differconfig servers mongo1:27017 and mongo2:27017 differconfig servers not in sync! config servers mongo1:27017 and mongo2:27017 differ
chunks: "d41d8cd98f00b204e9800998ecf8427e" EOO
EOO EOO
configServer startup check failed
I've deleted mongod.lock from all config servers and restarted mongod.
That solved the problem.
There are 2 ways it can be done:
1) Clear the data folder and restart the server; it will start as usual without any error (note that this discards the existing data).
2) Delete the mongod lock file and use the --repair option.
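The lock-file step in option 2 could be scripted roughly as follows. This is a sketch under assumptions: the helper name remove_stale_lock is hypothetical, the lock file is assumed to be mongod.lock directly inside the dbpath, and it should only be run while no mongod process is using that directory.

```python
from pathlib import Path


def remove_stale_lock(dbpath, lock_name="mongod.lock"):
    """Delete the lock file in dbpath if present.

    Returns True when a file was removed, False when none existed.
    """
    lock = Path(dbpath) / lock_name
    if lock.is_file():
        lock.unlink()
        return True
    return False
```

After the lock file is gone, run mongod once with --dbpath pointing at the same directory plus --repair, then start it normally.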