I am using the YourKit profiler. I took a thread dump with it, but the threads don't show the locks they are holding or the locks they are waiting for.
Can anyone please explain how to analyse the thread dump?
My slave database crashed with an out-of-memory error and is now in the recovery stage.
I want to know which query caused this issue.
I checked the logs and found one query logged just before the system went into recovery mode, but I want to confirm it.
I am using Postgres 9.4.
Does anyone have any idea?
If you set log_min_error_statement to error (the default) or lower, you will get the statement that caused the out-of-memory error in the database log.
But I assume that you got hit by the Linux OOM killer. That would cause a PostgreSQL process to be killed with signal 9, whereupon the database goes into recovery mode.
The correct solution here is to disable memory overcommit by setting vm.overcommit_memory to 2 in /etc/sysctl.conf and activating the setting with sysctl -p (you should then also tune vm.overcommit_ratio correctly).
Then you will get an error rather than a killed process, which is easier to debug.
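With vm.overcommit_memory set to 2, the kernel caps total committed memory at swap plus a percentage of RAM given by vm.overcommit_ratio. Here is a minimal Python sketch of that arithmetic, following the CommitLimit formula in the kernel's overcommit accounting documentation. The function name and the example sizes are made up for illustration, and the sketch ignores vm.overcommit_kbytes.

```python
def commit_limit_kb(ram_kb: int, swap_kb: int, overcommit_ratio: int,
                    hugetlb_kb: int = 0) -> int:
    """Approximate CommitLimit (in kB) when vm.overcommit_memory = 2.

    CommitLimit = swap + (RAM - huge pages) * overcommit_ratio / 100
    """
    if not 0 <= overcommit_ratio:
        raise ValueError("overcommit_ratio must be non-negative")
    return swap_kb + (ram_kb - hugetlb_kb) * overcommit_ratio // 100


# Hypothetical machine: 16 GB RAM, 4 GB swap, default ratio of 50.
limit = commit_limit_kb(16 * 1024 * 1024, 4 * 1024 * 1024, 50)
print(f"CommitLimit is roughly {limit // 1024} MB")
```

If the resulting limit is much smaller than physical RAM, raising vm.overcommit_ratio is probably worth considering. The value the kernel actually uses can be compared against the CommitLimit line in /proc/meminfo.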
I have an issue that is driving me nuts. Despite all my efforts, I am not able to force my postgres server to shut down. I have followed these instructions: http://www.question-defense.com/2008/10/17/pg_ctl-server-does-not-shut-down-force-postgres-to-shutdown
but still nothing happens, and all I get in the shell is
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down
Any help is much appreciated.
Update: Checking the logs, I see this recurring error:
LOG: checkpoints are occurring too frequently (25 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
After giving it a lot of thought, especially about the way I installed it in the first place, I realized that I had set up the install so that a daemon would launch postgres when my machine starts. Thus, any manual killing would simply result in the same daemon recreating those processes.
To resolve this problem you need to stop the daemon using launchctl and remove a .plist file in your postgres directory.
Good luck if you face the same problem.
You are probably running with the default setting of "checkpoint_segments = 3", which produces the warnings. Your database does many writes, right? It takes some time to write all of this to disk, and your database is quite busy rotating the logfiles instead of doing real work.
If you increase checkpoint_segments, you will see performance improvements and less I/O.
For further reading: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
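To judge how often the warning fires before and after changing the setting, the intervals can be pulled out of the server log. This is a rough Python sketch, assuming the message wording shown in the log excerpt above; the function name is hypothetical.

```python
import re

# Matches the warning text quoted in the log excerpt above.
CHECKPOINT_WARNING = re.compile(
    r"checkpoints are occurring too frequently \((\d+) seconds? apart\)"
)


def checkpoint_intervals(log_lines):
    """Return the 'seconds apart' value from every checkpoint warning."""
    intervals = []
    for line in log_lines:
        match = CHECKPOINT_WARNING.search(line)
        if match:
            intervals.append(int(match.group(1)))
    return intervals


sample = [
    "LOG: checkpoints are occurring too frequently (25 seconds apart)",
    'HINT: Consider increasing the configuration parameter "checkpoint_segments".',
]
print(checkpoint_intervals(sample))
```

Note that checkpoint_segments only exists up to PostgreSQL 9.4; from 9.5 on it was replaced by min_wal_size and max_wal_size.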
FYI only; this does not need an answer.
I was working on a Postgres server under heavy load, and issued a GRANT command that hung. It was not blocked by any other commands.
I had a few open connections and was able to kill several of the processes with a normal pg_cancel_backend (SIGINT) command, but my GRANT command didn't respond to either that or pg_terminate_backend (SIGTERM). I finally tried "kill -9 (pid)" (SIGKILL) and the server crashed.
Issuing SIGKILL to the database server process or the postmaster can cause crashes--that's well documented. Running SIGKILL against a child process can also crash the database.
"Running SIGKILL against a child process can also crash the database"
Any fatal signal that terminates any backend without a chance to clean up, such as SIGSEGV, SIGABRT, SIGKILL, etc, will cause the postmaster to assume that shared memory may be corrupt. It will roll back all transactions, terminate all running backends, and restart.
PostgreSQL does that to protect your data. If something went wrong before a backend crashed that caused it to scribble on shared memory, then shared_buffers could contain invalid data that'd get flushed to disk and replace good pages.
I was pretty sure that was in the docs, but all I can find is what I think you were referring to in shutting down the server.
Anyway, if you SIGKILL a backend you'll see something like:
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
This also happens if the OOM killer kills a backend, which is why you should turn off memory overcommit on Linux.
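The safe way to deal with a stuck backend is to escalate through the two SQL-level functions and stop there. pg_cancel_backend sends SIGINT to the backend and pg_terminate_backend sends SIGTERM; neither triggers the crash recovery described above. This is a small Python sketch that only builds the statements to run. The helper name is hypothetical, and executing the statements is left to whatever client you use.

```python
import signal

# Signal that each SQL-level function delivers to the target backend.
BACKEND_SIGNALS = {
    "pg_cancel_backend": signal.SIGINT,      # cancel the current query
    "pg_terminate_backend": signal.SIGTERM,  # end the whole session
}


def escalation_statements(pid: int):
    """Return the SQL statements to try, in order, for a stuck backend.

    SIGKILL is deliberately absent: killing a backend that way makes the
    postmaster restart every session.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive backend process ID")
    return [f"SELECT {name}({pid});" for name in BACKEND_SIGNALS]


for statement in escalation_statements(21122):
    print(statement)
```

If neither statement has an effect, a controlled restart of the server is a gentler last resort than kill -9 on a single backend.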
I wrote some guidance on things to do and not to do with PostgreSQL on my blog. Worth a look.
I am performing a bulk copy into postgres with about 80GB of data.
\copy my_table FROM '/path/csv_file.csv' csv DELIMITER ','
Before the transaction is committed I get the following error.
Server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
In the PostgreSQL logs:
LOG:server process (PID 21122) was terminated by signal 9: Killed
LOG:terminating any other active server processes
WARNING:terminating connection because of crash of another server process
DETAIL:The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
Your backend process is receiving signal 9 (SIGKILL). This can happen if:
Somebody sends a kill -9 manually;
A cron job is set up to send kill -9 under some circumstances (very unsafe, do not do this); or
The Linux out-of-memory (OOM) killer triggers and terminates the process.
In the last case you will see reports of OOM killer activity in the kernel's dmesg output. I expect this is what you'll see in your case.
PostgreSQL servers should be configured without virtual memory overcommit so that the OOM killer does not run and PostgreSQL can handle out-of-memory conditions itself. See the PostgreSQL documentation on Linux memory overcommit.
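To find out which backend was killed, and when to look in dmesg, it helps to extract the PID and signal number from the PostgreSQL log. This is a minimal Python sketch, assuming the "was terminated by signal" wording shown in the log excerpt above; the function name is hypothetical.

```python
import re

# Matches the postmaster's report of a backend killed by a signal.
TERMINATED = re.compile(
    r"server process \(PID (\d+)\) was terminated by signal (\d+)"
)


def find_signal_kills(log_lines):
    """Return (pid, signal_number) for each backend killed by a signal."""
    kills = []
    for line in log_lines:
        match = TERMINATED.search(line)
        if match:
            kills.append((int(match.group(1)), int(match.group(2))))
    return kills


log_sample = [
    "LOG:server process (PID 21122) was terminated by signal 9: Killed",
    "LOG:terminating any other active server processes",
]
print(find_signal_kills(log_sample))
```

A PID reported with signal 9 can then be matched against the OOM killer entries in dmesg to confirm that the kernel, rather than a person or a cron job, sent the signal.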
The separate question "why is this using so much memory" remains. Answering that requires more knowledge of your setup: how much RAM the server has, how much of it is free, what your work_mem and maintenance_work_mem settings are, etc. It isn't a very interesting problem to look into until you upgrade to the current PostgreSQL 8.4 patch release to make sure the problem isn't one that's already fixed.
What does it mean when a PostgreSQL process is "idle in transaction"?
On a server that I'm looking at, in the output of "ps ax | grep postgres" I see 9 PostgreSQL processes that look like the following:
postgres: user db 127.0.0.1(55658) idle in transaction
Does this mean that some of the processes are hung, waiting for a transaction to be committed? Any pointers to relevant documentation are appreciated.
The PostgreSQL manual indicates that this means the transaction is open (inside BEGIN) and idle. It's most likely a user connected using the monitor who is thinking or typing. I have plenty of those on my system, too.
If you're using Slony for replication, however, the Slony-I FAQ suggests idle in transaction may mean that the network connection was terminated abruptly. Check out the discussion in that FAQ for more details.
As mentioned in "Re: BUG #4243: Idle in transaction", it is probably best to check pg_locks to see what is being locked; that might give you a better clue as to where the problem lies.
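Before digging into pg_locks, it can be useful to list which users, databases and clients own the idle-in-transaction sessions. This is a rough Python sketch that parses text in the "ps ax" format quoted in the question. The function name is hypothetical, and it assumes the process title layout "postgres: user db client idle in transaction".

```python
def idle_in_transaction(ps_output: str):
    """Return (user, database, client) for each idle-in-transaction backend."""
    sessions = []
    for line in ps_output.splitlines():
        line = line.strip()
        if not line.endswith("idle in transaction"):
            continue
        # Keep only the process title that follows "postgres: ".
        _, _, title = line.partition("postgres: ")
        fields = title.split()
        # Expected fields: user, db, client, "idle", "in", "transaction".
        if len(fields) < 6:
            continue
        sessions.append((fields[0], fields[1], fields[2]))
    return sessions


ps_sample = """\
 4312 ?  Ss  0:00 postgres: user db 127.0.0.1(55658) idle in transaction
 4313 ?  Ss  0:00 postgres: user db 127.0.0.1(55660) idle
"""
print(idle_in_transaction(ps_sample))
```

The PIDs of the sessions found this way are what you would then look up in pg_locks.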