PostgreSQL: How to check if my function is still running

Just a quick question. I have a heavy function in PostgreSQL 9.3. How can I check whether the function is still running after several hours, and how can I run a function in the background in psql (my connection is unstable from time to time)?
Thanks

For long-running functions, it can be useful to have them RAISE LOG or RAISE NOTICE from time to time, indicating progress. If they're looping over millions of records, you might emit a log message every few thousand records, as sketched below.
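As a rough sketch of that pattern (the function and table names here are made up):

CREATE OR REPLACE FUNCTION process_big_table() RETURNS void AS $$
DECLARE
    rec record;
    n   bigint := 0;
BEGIN
    FOR rec IN SELECT * FROM big_table LOOP
        -- ... the actual heavy work on rec goes here ...
        n := n + 1;
        IF n % 10000 = 0 THEN
            RAISE NOTICE 'processed % rows', n;
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

RAISE NOTICE goes to the connected client; RAISE LOG goes to the server log instead, which is handy when the client connection itself is the unreliable part.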
Some people also (ab)use a SEQUENCE, where they get the nextval of the sequence in their function, and then directly read the sequence value to check progress. This is crude but effective. I prefer logging whenever possible.
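A minimal sketch of the sequence trick (progress_seq is a made-up name). It works because sequences are non-transactional, so the counter is visible from other sessions even while the function's transaction is still open:

-- once, before starting:
CREATE SEQUENCE progress_seq;

-- inside the function, once per processed row (PL/pgSQL):
PERFORM nextval('progress_seq');

-- from another session, peek at the counter without advancing it:
SELECT last_value FROM progress_seq;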
To deal with disconnects, run psql on the remote side over ssh rather than connecting to the server directly over the PostgreSQL protocol. As Christian suggests, use screen so the remote psql doesn't get killed when the ssh session dies.
Alternatively, you can use the traditional Unix command nohup, which is available everywhere:
nohup psql -f the_script.sql </dev/null &
which will run psql in the background, writing all output and errors to a file named nohup.out.
You may also find that if you enable TCP keepalives, you don't lose remote connections anyway.
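libpq (and therefore psql) accepts keepalive settings as connection parameters; a sketch, with the host and database names made up:

psql "host=db.example.com dbname=mydb keepalives=1 keepalives_idle=60 keepalives_interval=10 keepalives_count=5"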

pg_stat_activity is a good place to check whether your function is still running. Also, use screen or tmux on the server so the session survives a disconnect.
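A minimal screen workflow looks like this (the session name is arbitrary):

screen -S longquery      # start a named session, run psql inside it
# ... the connection drops ...
screen -r longquery      # reattach after logging back in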

1. Log in to the psql console:
$ psql -U user -d database
2. Issue \x to format the results.
3. Run: SELECT * FROM pg_stat_activity;
4. Scroll until you see your function's query in the list. It should have an active state.
5. Check whether any locks are blocking the tables your function relies on, using:
SELECT k_locks.pid        AS pid_blocking,
       k_activity.usename AS user_blocking,
       k_activity.query   AS query_blocking,
       locks.pid          AS pid_blocked,
       activity.usename   AS user_blocked,
       activity.query     AS query_blocked,
       to_char(age(now(), activity.query_start), 'HH24h:MIm:SSs') AS blocking_age
FROM pg_catalog.pg_locks locks
JOIN pg_catalog.pg_stat_activity activity ON locks.pid = activity.pid
JOIN pg_catalog.pg_locks k_locks
  ON locks.locktype = k_locks.locktype
 AND locks.database      IS NOT DISTINCT FROM k_locks.database
 AND locks.relation      IS NOT DISTINCT FROM k_locks.relation
 AND locks.page          IS NOT DISTINCT FROM k_locks.page
 AND locks.tuple         IS NOT DISTINCT FROM k_locks.tuple
 AND locks.virtualxid    IS NOT DISTINCT FROM k_locks.virtualxid
 AND locks.transactionid IS NOT DISTINCT FROM k_locks.transactionid
 AND locks.classid       IS NOT DISTINCT FROM k_locks.classid
 AND locks.objid         IS NOT DISTINCT FROM k_locks.objid
 AND locks.objsubid      IS NOT DISTINCT FROM k_locks.objsubid
 AND locks.pid <> k_locks.pid
JOIN pg_catalog.pg_stat_activity k_activity ON k_locks.pid = k_activity.pid
WHERE k_locks.granted
  AND NOT locks.granted
ORDER BY activity.query_start;

Related

Postgres schema access changes before and after pg_dump

I have a Postgres database defined with the public schema and it is accessed from within a Python application. Before doing a pg_dump, the tables can be accessed without using a qualified table name, for example select * from user works. After doing the pg_dump, select * from user fails with a relation "user" does not exist error, but select * from public.user works. No restore has been performed.
Since this is within an application, I cannot change the access to include the schema. The application uses sqlalchemy and pgbouncer for interacting with the database.
In trying to figure out what's happening, I've discovered that running pg_dump causes the session to change. Before running the command, by querying pg_stat_activity, I can see there are 10 sessions in the pool, one active and nine idle. After running the command, a different session is active and the other nine are idle. Also, the settings in pg_db_role_setting and pg_user look correct for the session that I can see. But, even when those look correct, the query select * from user fails.
Also, just for reference, the code currently does not contain pg_dump and runs fine. As soon as I add the pg_dump, I see the issues mentioned.
Is there anything in pg_dump that could be causing what I'm seeing or is it just a result of going to another session? Since these are some of the first commands being run after running migrations, is there any reason the sessions should have different settings? What else am I missing?
Thanks!
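One thing worth checking here (not a confirmed cause, just a common culprit when unqualified names suddenly stop resolving) is the failing session's effective search_path:

SHOW search_path;
SELECT current_schemas(true);  -- includes implicit schemas such as pg_catalog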

Canceling running queries from the CLI

I am looking for a way to cancel queries that are currently running from cli.
I have found these links:
https://www.postgresql.org/docs/11/libpq-cancel.html
https://www.postgresql.org/docs/11/contrib-dblink-cancel-query.html
but it seems that they are not what I am looking for.
Given the pid of the session running the query (the process ID of the corresponding backend process, which you can find in pg_stat_activity, or from ps, top, etc.), you can use:
psql -c "SELECT pg_cancel_backend(<your_pid>)"
If you're trying to kill all queries meeting some criteria (e.g. those which have been running/blocking/idle for some period of time, or those running against a particular database), something like this is often useful:
psql -c "SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE <your_conditions>"
You can also disconnect them using pg_terminate_backend(pid).
To cancel the most recently started query:
SELECT pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
ORDER BY query_start DESC
LIMIT 1;
pg_backend_pid() is the PID of the connection you're using to run the command; without this filter, the "latest query" would be the one you're currently executing.

Get all database names through JDBC

Is there any way to get all database names out of a Postgres server using JDBC? I can get the current one, but that's not what I am looking for...
I have a JUnit rule which creates a database for each test and drops it afterwards, but in some special cases, when the JVM dies, the drop never happens. So in the rule I'd also like to check for existing databases and clean up any that are no longer used. What I am looking for is something like the \l meta-command (but I can't easily ssh to the machine from unit tests...).
A database TTL, like some AMQP queues have, would also work for me, but I suppose that's not in Postgres either...
Thanks
Just run:
select datname
from pg_database
through JDBC. It returns all databases on the server you are connected to.
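If you want to skip the template databases, a common refinement is:

select datname
from pg_database
where not datistemplate;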
If you know how to get the information you want through a psql meta-command (e.g. \l), just run psql with the -E switch - all internal SQL queries for the meta-commands are then printed to the console.
\l actually uses a somewhat more complicated query, but to get only the names, the query above is sufficient.
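For example, assuming a database named mydb, this should print the SQL behind \l before the normal output:

psql -E -c '\l' mydb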

How to handle large result sets with psql?

I have a query which returns about 14M rows (I was not aware of this). When I used psql to run the query, my Fedora machine froze. Even after the query was done, I could not use Fedora anymore and had to restart my machine. Redirecting standard output to a file did not help: Fedora froze all the same.
So how should I handle large result sets with psql?
By default, psql accumulates the complete result set in client memory. That is the usual behavior of all libpq-based Postgres applications and drivers. The solution is a cursor, so that you fetch only N rows from the server at a time. psql can use cursors too: set the FETCH_COUNT variable and it will retrieve results in batches of FETCH_COUNT rows.
postgres=# \set FETCH_COUNT 1000
postgres=# select * from generate_series(1,100000); -- big query
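The same works non-interactively; a sketch, with the file names made up:

psql -v FETCH_COUNT=1000 -f big_query.sql -o results.txt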

Force client disconnect using PostgreSQL

Is there a way to force clients to disconnect from PostgreSQL? I'm looking for the equivalent of DB2's force application all.
I'd like to do this on my development box because when I've got database consoles open, I can't load a database dump. I have to quit them first.
This kills idle connections in PostgreSQL 8.4:
SELECT procpid, pg_terminate_backend(procpid) AS killed
FROM pg_stat_activity
WHERE current_query = '<IDLE>';
Combine pg_terminate_backend function and the pg_stat_activity system view.
This SO answer explains it beautifully (what follows is a full quote from araqnid, then me again):
To mark database 'applogs' as not accepting new connections:
update pg_database set datallowconn = false where datname = 'applogs';
Another possibility would be to revoke 'connect' access on the database for the client role(s).
Disconnect users from database = kill backend. So to disconnect all other users from "applogs" database, for example:
select pg_terminate_backend(procpid)
from pg_stat_activity
where datname = 'applogs' and procpid <> pg_backend_pid();
Once you've done both of those, you are the only user connected to 'applogs'. Although there might actually be a delay before the backends actually finish disconnecting?
Update from MarkJL: There is indeed a delay before the backends finish disconnecting.
Now me again: that said, bear in mind that the procpid column was renamed to pid in PostgreSQL 9.2.
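For the revoke-connect alternative mentioned in the quote, a sketch (the role name is made up):

REVOKE CONNECT ON DATABASE applogs FROM PUBLIC;
REVOKE CONNECT ON DATABASE applogs FROM app_client;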
I think that this is much more helpful than the answer by Milen A. Radev which, while technically the same, does not come with usage examples and real-life suggestions.
I'm posting my answer because I couldn't use any of the others in my script (server 9.3):
psql -U postgres -c "SELECT pid, pg_terminate_backend(pid) AS killed FROM pg_stat_activity WHERE datname = 'my_database_to_alter';"
On the next line, you can do whatever you want with 'my_database_to_alter'. As you can see, you run the query from the "postgres" database, which exists in almost every PostgreSQL installation.
Running it as a superuser and from outside the problem database itself worked perfectly for me.
Probably a more heavy-handed approach than should be used, but:
for x in $(ps -eF | grep -E "postgres.*idle" | awk '{print $2}'); do kill $x; done
I found this thread on the mailing list. It suggests using SIGTERM to cause the clients to disconnect.
Not as clean as DB2's force application all.