Google Cloud SQL Postgres database kills process and enters recovery mode - google-cloud-sql

I'm seeing my client's Cloud SQL database go into recovery mode every day or two. While it recovers, all connections are rejected with the usual "the database system is in recovery mode" error.
Every time, this is preceded by log messages stating:
server process (PID 2883588) was terminated by signal 9: Killed
The failing SQL statement is logged as well, and it is always the same query. That is followed by:
terminating any other active server processes
Then many: terminating connection because of crash of another server process
At that point the database restarts, spends a few seconds in recovery mode, and then continues life as normal.
Why would Cloud SQL be terminating a query for me? I've run the query in question (on the same server) and it completes happily in under a millisecond. Could this be caused by transient server load?

Related

PostgreSQL - how to unlock table record AFTER shutdown

Sorry, I've looked for this everywhere but cannot find a working solution :/
I badly need this for abnormal testing.
What I'm trying to do here is:
1. Insert row in TABLE A.
2. Lock this record.
3. (At separate terminal) service postgresql-9.6 stop
4. Wait a few moments.
5. (At separate terminal) service postgresql-9.6 start
6. "Try" to unlock the record by executing "COMMIT;" in the same terminal as step 2.
How I did step 2 is like this:
BEGIN;
SELECT * FROM TABLE A WHERE X=Y FOR UPDATE;
The problem is that once I do step 6, this error shows up:
DB=# commit;
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
So when I execute "COMMIT;" again, it only shows:
DB=# commit;
WARNING: there is no transaction in progress
COMMIT
Now the record cannot be unlocked.
I've tried getting the PID of the locking backend and then executing pg_terminate_backend (or pg_cancel_backend), but it just doesn't work:
DB=# select pg_class.relname,pg_locks.* from pg_class,pg_locks where pg_class.relfilenode=pg_locks.relation;
DB=# select pg_terminate_backend(2450);
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
DB=# select pg_cancel_backend(3417);
ERROR: canceling statement due to user request
Please help. Does anyone have any ideas? :/
..Or is this even possible?
My specs:
PostgreSQL 9.6
Red Hat Linux
There's a fundamental misunderstanding or three here. Lock state is not persistent.
When you lock a record (or table), the lock is associated with the transaction that took the lock. The transaction is part of the running PostgreSQL session, your connection to the server.
Locks are released at the end of transactions.
Transactions end:
On explicit COMMIT or ROLLBACK;
When a session disconnects without an explicit COMMIT of the open transaction, triggering an implied ROLLBACK;
When the server shuts down, terminating all active sessions, again triggering an implied ROLLBACK of all in-progress transactions.
Thus, you have released the lock you took at step 2 when you shut the server down at step 3. The transaction that acquired that lock no longer exists because its session was terminated by server shutdown.
If you examine pg_locks, you'll see that the lock entry is present before the restart and gone after it.
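A minimal way to watch that happen, assuming the question's table is literally named table_a with a key column x (both names and the value 42 are placeholders):

-- Session 1: take the row lock (steps 1-2 of the test).
BEGIN;
SELECT * FROM table_a WHERE x = 42 FOR UPDATE;

-- Session 2: the lock shows up in pg_locks while session 1's transaction is open.
SELECT locktype, relation::regclass AS rel, pid, mode, granted
FROM pg_locks
WHERE relation = 'table_a'::regclass;

-- Stop and start the server, reconnect, and rerun the pg_locks query:
-- it returns no rows, because the lock died with its transaction.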

Terminating connection because of crash of another server process - Postgres

Every time I run the same query, I get this error:
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
I'm using phpPgAdmin 5.1.
A concurrent query in a different database session has crashed its server backend process. As a consequence, the whole database stops and performs crash recovery from the latest checkpoint.
You should look into the database server log to see what the problem is.
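If you can reconnect between crashes, one quick way to locate that log from psql (run as a superuser; settings vary by installation):

SHOW logging_collector;  -- 'on' means PostgreSQL writes its own log files
SHOW data_directory;     -- base directory of the cluster
SHOW log_directory;      -- log location, relative to data_directory unless absolute
SHOW log_filename;       -- file name pattern of the current log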

Postgres SIGKILL crash

FYI only; this does not need an answer.
I was working on a Postgres server under heavy load, and issued a GRANT command that hung. It was not blocked by any other commands.
I had a few open connections and was able to kill several of the processes with a normal pg_cancel_backend (SIGINT) call, but my GRANT command didn't respond to either that or pg_terminate_backend (SIGTERM). I finally tried "kill -9 (pid)" (SIGKILL) and the server crashed.
Issuing SIGKILL to the database server process or the postmaster can cause crashes; that's well documented. Running SIGKILL against a child process can also crash the database.
Regarding "Running SIGKILL against a child process can also crash the database":
Any fatal signal that terminates any backend without a chance to clean up, such as SIGSEGV, SIGABRT, SIGKILL, etc., will cause the postmaster to assume that shared memory may be corrupt. It will roll back all transactions, terminate all running backends, and restart.
PostgreSQL does that to protect your data. If something went wrong before a backend crashed that caused it to scribble on shared memory, then shared_buffers could contain invalid data that'd get flushed to disk and replace good pages.
I was pretty sure that was in the docs, but all I can find is the note I think you were referring to, in the section on shutting down the server.
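For reference, the safe counterparts to kill -9, run from SQL as a superuser (the pid 12345 is a placeholder, and on servers older than 9.2 the pg_stat_activity column is procpid rather than pid):

-- Find candidate backends first:
SELECT pid, state, query FROM pg_stat_activity;

-- Cancel just the current query (sends SIGINT):
SELECT pg_cancel_backend(12345);

-- End the whole session cleanly (sends SIGTERM):
SELECT pg_terminate_backend(12345);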
Anyway, if you SIGKILL a backend you'll see something like:
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
This also happens if the OOM killer kills a backend, which is why you should turn off memory overcommit on Linux.
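The usual way to do that, per the PostgreSQL docs on Linux memory overcommit (the ratio below is illustrative; tune it to your RAM and swap):

# /etc/sysctl.conf
vm.overcommit_memory = 2    # no overcommit: allocations fail instead of the OOM killer firing
vm.overcommit_ratio = 80    # illustrative; percent of RAM counted toward the commit limit

# Apply without a reboot:
sysctl -p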
I wrote some guidance on things to do and not to do with PostgreSQL on my blog. Worth a look.

Connection lost after query runs for few minutes in PostgreSQL

I am using PostgreSQL 8.4 and PostGIS 1.5. What I'm trying to do is INSERT data from one table into another (but not strictly the same data). For each column, a few queries are run, and there are a total of 50143 rows stored in the table. But the query is quite resource-heavy: after it has run for a few minutes, the connection is lost. It happens about 21-22k ms into the execution of the query, after which I have to start the DBMS manually again. How should I go about solving this issue?
The error message is as follows:
[Err] server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Additionally, here is the psql error log:
2013-07-03 05:33:06 AZOST WARNING: terminating connection because of crash of another server process
2013-07-03 05:33:06 AZOST DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2013-07-03 05:33:06 AZOST HINT: In a moment you should be able to reconnect to the database and repeat your command.
My guess, reading your problem, is that you are hitting out-of-memory issues. Craig's suggestion to turn off overcommit is a good one. You may also need to reduce work_mem if this is a big query. That may slow the query down, but it will free up memory. work_mem is a per-operation limit, so a single query can use many times that setting.
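One low-risk way to test that, assuming the big statement is the INSERT described above (the value is illustrative):

SET work_mem = '16MB';  -- lower the per-sort/per-hash memory budget for this session only
-- run the big INSERT ... SELECT here
RESET work_mem;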
Another possibility is that you are hitting a bug in a C-language module in PostgreSQL. If that is the case, try updating to the latest version of PostGIS and any other C extensions you use.
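Before upgrading, it's worth recording exactly what you're on (postgis_full_version() assumes PostGIS is installed in the current database):

SELECT version();               -- PostgreSQL version and build
SELECT postgis_full_version();  -- PostGIS version plus its library dependencies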

psql seems to timeout with long queries

I am performing a bulk copy into postgres with about 80GB of data.
\copy my_table FROM '/path/csv_file.csv' csv DELIMITER ','
Before the transaction is committed, I get the following error:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
In the PostgreSQL logs:
LOG: server process (PID 21122) was terminated by signal 9: Killed
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
Your backend process received a signal 9 (SIGKILL). This can happen if:
Somebody sends a kill -9 manually;
A cron job is set up to send kill -9 under some circumstances (very unsafe, do not do this); or
The Linux out-of-memory (OOM) killer triggers and terminates the process.
In the latter case you will see reports of OOM killer activity in the kernel's dmesg output. I expect this is what you'll see in your case.
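Something like this will show any OOM killer activity (exact message wording varies by kernel version):

dmesg | grep -i -E 'out of memory|oom|killed process'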
PostgreSQL servers should be configured without virtual memory overcommit so that the OOM killer does not run and PostgreSQL can handle out-of-memory conditions itself. See the PostgreSQL documentation on Linux memory overcommit.
The separate question "why is this using so much memory" remains. Answering that requires more knowledge of your setup: how much RAM the server has, how much of it is free, what your work_mem and maintenance_work_mem settings are, etc. It isn't a very interesting problem to look into until you upgrade to the current PostgreSQL 8.4 patch release to make sure the problem isn't one that's already fixed.
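A starting point for gathering those numbers from the server itself:

SELECT version();            -- confirms the exact 8.4.x patch release
SHOW shared_buffers;
SHOW work_mem;
SHOW maintenance_work_mem;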