PostgreSQL - REINDEX still working even after two hours

I have started REINDEX on my PostgreSQL database. I can see in the GUI that it processed a number of tables and then stopped responding. It looks like it is still working, even after two hours. The GUI is not responsive and its last row says: NOTICE: table "public.res_request_history" was reindexed.
Can I safely stop REINDEX? What can I do to actually make REINDEX work?
Thanks.

Yes, you can use pg_cancel_backend(pid). You can find the PID by querying pg_stat_activity.
For example:
--Will display running queries and corresponding pid
SELECT query, pid FROM pg_stat_activity;
--You can then cancel one of them by calling this method with its pid
SELECT pg_cancel_backend(<pid>);
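If the backend ignores the cancel request, pg_terminate_backend(pid) is the stronger option. A minimal sketch, assuming the REINDEX session is the only one whose query text starts with "reindex":
--Cancel just the session running the REINDEX
SELECT pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE query ILIKE 'reindex%';
--If it does not react, terminate the backend instead (this drops the connection)
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE query ILIKE 'reindex%';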

Related

close client before finishing command postgresql

Let's say that I would like to create an index. It will take some time. I use pgAdmin. Let's say that while the query is being executed pgAdmin crashes for some reason (for example, the computer is restarted).
What would be the status of that index? Would it keep on being created and finally, at some point, be created successfully, or would it fail immediately, or would it fail after some time?
Is there any way to check the status of an index being created? (I'm using Postgres version 10.x)
It is hard to answer the question without knowing the reason for the crash of the application. If it was not caused by a server failure, it is likely that the index was created correctly. You can check this by querying the system catalog pg_index. You have to know the index name, e.g.:
select indexrelid::regclass, indisvalid
from pg_index
where indexrelid::regclass::text = 'my_table_unique_col_key'
Per the documentation:
indisvalid - If true, the index is currently valid for queries. False means the index is possibly incomplete: it must still be modified by INSERT/UPDATE operations, but it cannot safely be used for queries. If it is unique, the uniqueness property is not guaranteed true either.
If the index is not created yet (or at all due to a query failure) the above query returns no rows. You can check whether the query that creates the index is still running (see Dynamic Statistics Views):
select *
from pg_stat_activity
where query ilike 'create index%'
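On PostgreSQL 12 and later (not the 10.x mentioned in the question) there is also a dedicated progress view; a minimal sketch:
--Shows the current phase and how much of the work is done
select pid, phase, blocks_done, blocks_total
from pg_stat_progress_create_index;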

Drop index locks

My Postgres version is 9.6
I tried today to drop an index from my DB with this command:
drop index index_name;
And it caused a lot of locks - the whole application was stuck until I killed all the sessions of the drop (why was it divided into several sessions?).
When I checked the locks I saw that almost all the blocked sessions were executing this query:
SELECT a.attname, format_type(a.atttypid, a.atttypmod),
pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod
FROM
pg_attribute a LEFT JOIN pg_attrdef d
ON a.attrelid = d.adrelid AND a.attnum = d.adnum
WHERE a.attrelid = <index_able_name>::regclass
AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;
Does it make sense that this would block system actions?
So I decided to drop the index with concurrently option to prevent locks.
drop index concurrently index_name;
I am executing it now from pgAdmin (because you can't run it from a normal transaction).
It has been running for over 20 minutes and hasn't finished yet. The index size is about 20 MB.
And when I check the DB for locks I see that there is a select query on that table, and that blocks the drop command.
But when I took this select and executed it in another session - it was very fast (2-3 seconds).
So why is it blocking my drop? Is there another option to do this? Maybe disabling the index instead?
drop index and drop index concurrently are usually very fast commands, but they both, like all DDL commands, require exclusive access to the table.
They differ only in how they try to achieve this exclusive access. Plain drop index simply requests an exclusive lock on the table. This blocks all queries (even selects) that try to use the table after the start of the drop query. It keeps blocking until it gets the exclusive lock - that is, until all transactions touching the table in any way that started before the drop command have finished - and the lock is held until the transaction with the drop is committed. This explains why your application stopped working.
The concurrently version also needs brief exclusive access, but it works differently - it will not block the table, but waits until there is no other query touching it and then does its (usually brief) work. But if the table is constantly busy it will never find such a moment and will wait indefinitely. Also, I suppose it just tries to lock the table repeatedly every X milliseconds until it succeeds, so a later parallel execution can be luckier and finish faster.
If you see multiple simultaneous sessions trying to drop an index, and you do not expect that, then you have a bug in your application. The database would never do this on its own.
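If you want to see who is holding the drop up while it waits, a minimal sketch using pg_blocking_pids (available since 9.6, the version in the question):
--Sessions waiting on a lock, together with the PIDs that block them
SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
Another option is to cap how long the plain drop may wait for its lock, so it fails fast instead of queueing in front of everything else (the 5-second limit is just an example):
SET lock_timeout = '5s';
drop index index_name;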

SELECT vs CREATE TABLE AS SELECT execution time

My function should return a TABLE which is created by lots of joins and is relatively "big".
If inside my function I put return query select <complex query goes here>; then it takes ages (more like 10-15 minutes) to run.
However, if instead of returning a TABLE, I return VOID and simply create a table within the function body - it finishes in under 1 minute.
The same goes for running this "complex query" as select <complex query goes here> vs. create table <table name> as select <complex query goes here> followed by select * from <table_name>.
Why is there such a difference in execution time?
P.S. The select clause of the query has around 35 columns with some logic inside.
P.P.S. The query returns only about 90K rows, so I doubt it is the time it takes to send the data over the network.
Answer
select differs from create table as select in where you consume the data: the former sends the data to the client, while the latter saves it to disk on the server side.
Why
Possible reasons are a slow link and a "feature" of the client. Given that a local psql running \copy (select * from ...) to 'local_file' took 3 seconds while pgAdmin took ages to display the same data, I assume your version of pgAdmin (or any version, for that matter) is not meant to display your amount of data (the 36 MB you mention). So it was not the link, but the client.
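One way to separate server time from client rendering, along the lines of the \copy test above, is to time the bare query in psql and send the rows to a local file instead of the screen; a sketch, with 'local_file' as a placeholder path:
\timing on
--Runs the query server-side and streams the rows straight to a file, bypassing any GUI rendering
\copy (select <complex query goes here>) to 'local_file'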

Transaction time out workaround for PostgreSQL

AFAIK, PostgreSQL 8.3 does not support transaction timeouts. I've read about supporting this feature in the future and there's some discussion about it. However, for specific reasons, I need a solution for this problem. So what I did is a script that runs periodically:
1) Based on locks and activity, query to retrieve the process ID of the transaction that is taking too long, keeping the oldest (trxTimeOut.sql):
SELECT procpid
FROM
(
SELECT DISTINCT age(now(), query_start) AS age, procpid
FROM pg_stat_activity, pg_locks
WHERE pg_locks.pid = pg_stat_activity.procpid
) AS foo
WHERE age > '30 seconds'
ORDER BY age DESC
LIMIT 1
2) Based on this query, kill the corresponding process (trxTimeOut.sh):
psql -h localhost -U postgres -t -d test_database -f trxTimeOut.sql | xargs kill
Although I've tested it and seems to work, I'd like to know if it's an acceptable approach or should I consider a different one?
Since version 9.6, PostgreSQL provides idle_in_transaction_session_timeout to automatically terminate transactions that stay idle for too long.
It's also possible to set a limit on how long a command can take, through statement_timeout, independently of the duration of the transaction it's in, or why it's stuck (busy query or waiting for a lock).
To auto-abort transactions that are stuck specifically waiting for a lock, see lock_timeout.
These settings can be set at the SQL level with commands like SET shown below, or as defaults for a database with ALTER DATABASE, for a user with ALTER USER, or for the entire instance through postgresql.conf.
SET statement_timeout=10000; -- time out after 10 seconds
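To make such limits the default for a whole database rather than one session, a sketch assuming the database is called test_database (the name used in the question) and arbitrary limit values:
ALTER DATABASE test_database SET idle_in_transaction_session_timeout = '30s';
ALTER DATABASE test_database SET statement_timeout = '60s';
ALTER DATABASE test_database SET lock_timeout = '10s';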

DB2 deadlock timeout Sqlstate: 40001, reason code 68 due to update statements called from servlet using SQL

I am calling update statements one after the other from a servlet to DB2. I am getting error SQLSTATE 40001, reason code 68, which I found is due to a deadlock timeout.
How can I resolve this issue?
Can it be resolved by setting a query timeout?
If yes, then how do I use it with update statements in the servlet, and where do I use it?
Reason code 68 already tells you this is due to a lock timeout (deadlock is reason code 2). It could be due to other users running queries at the same time that use the same data you are accessing, or your own multiple updates.
Begin by running db2pd -db locktest -locks show detail from a db2 command line to see where the locks are. You'll then need to run something like:
select tabschema, tabname, tableid, tbspaceid
from syscat.tables where tbspaceid = # and tableid = #
filling in the # symbols with the ID numbers you get from the db2pd command output.
Once you see where the locks are, here are some tips:
- Deadlock frequency can sometimes be reduced by ensuring that all applications access their common data in the same order, meaning, for example, that they access (and therefore lock) rows in Table A, followed by Table B, followed by Table C, and so on.
taken from: http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.trb.doc/doc/t0055074.html
recommended reading: http://www.ibm.com/developerworks/data/library/techarticle/dm-0511bond/index.html
Addendum: if your servlet or another guilty application is using select statements found to be involved in the deadlock, you can try appending WITH UR to those select statements, provided accuracy of the newly updated (or inserted) data isn't important.
For me, the solution was adding FOR READ ONLY WITH UR at the end of all my SELECT statements. (Apparently my select statements were returning so much data that they locked the tables long enough to interfere with other SQL statements.)
See https://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/sqlref/src/tpc/db2z_sql_isolationclause.html
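For illustration, a sketch of such a statement; the table and column names here are made up:
-- Uncommitted read: the SELECT takes no row locks, but it may see uncommitted data
SELECT order_id, status
FROM app.orders
WHERE status = 'PENDING'
FOR READ ONLY WITH UR;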