close client before finishing command postgresql

Let's say that I would like to create an index. It will take some time. I use pgAdmin. Let's say that while the query is being executed, pgAdmin crashes for some reason (for example, the computer is restarted).
What would be the status of that index? Would it keep on being created and finally, at some point, be created successfully, or would it fail immediately, or fail after some time?
Is there any way to check the status of an index being created? (I'm using Postgres version 10.x)

It is hard to answer the question without knowing the reason for the application crash. If it was not caused by a server failure, it is likely that the index was created correctly. You can check this by querying the system catalog pg_index. You have to know the index name, e.g.:
select indexrelid::regclass, indisvalid
from pg_index
where indexrelid::regclass::text = 'my_table_unique_col_key'
Per the documentation:
indisvalid - If true, the index is currently valid for queries. False means the index is possibly incomplete: it must still be modified by INSERT/UPDATE operations, but it cannot safely be used for queries. If it is unique, the uniqueness property is not guaranteed true either.
If the index has not been created yet (or at all, due to a query failure), the above query returns no rows. You can check whether the query that creates the index is still running (see Dynamic Statistics Views):
select *
from pg_stat_activity
where query ilike 'create index%'
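If you do not remember the index name, a sketch like the following (not part of the original answer) lists any indexes that were left invalid, which you can then rebuild; the index name in the REINDEX line is just a placeholder:
-- list indexes marked invalid, e.g. left behind by a failed CREATE INDEX CONCURRENTLY
select indexrelid::regclass as index_name, indrelid::regclass as table_name
from pg_index
where not indisvalid;
-- rebuild (or drop and recreate) an invalid index; adjust the name accordingly
reindex index my_table_unique_col_key;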

Related

Postgresql row level security does not throw errors

I have a PostgreSQL DB and I want to enable row-level security on one of its tables.
Everything is working fine except one thing: I want an error to be thrown when a user tries to perform an update on a record that they don't have privileges on.
According to the docs:
check_expression: Any SQL conditional expression (returning boolean).
The conditional expression cannot contain any aggregate or window
functions. This expression will be used in INSERT and UPDATE queries
against the table if row level security is enabled. Only rows for
which the expression evaluates to true will be allowed. An error will
be thrown if the expression evaluates to false or null for any of the
records inserted or any of the records that result from the update.
Note that the check_expression is evaluated against the proposed new
contents of the row, not the original contents.
So I tried the following:
CREATE POLICY update_policy ON my_table FOR UPDATE TO editors
USING (has_edit_privilege(user_name))
WITH CHECK (has_edit_privilege(user_name));
I have also another policy for SELECT
CREATE POLICY select_policy ON my_table FOR SELECT TO editors
USING (has_select_privilege(user_name));
According to the docs, this should create a policy that prevents anyone in the editors role from performing an update on any record of my_table and throws an error when an update is attempted. The updates are indeed prevented, but no error is thrown.
What's my problem?
Please help.
First, let me explain how row level security works when reading from the table:
You don't even have to define a policy – if there is no policy for a user on a table with row level security enabled, the default is that the user can see nothing.
No error will be thrown when reading from a table.
If you want an error to be thrown, you could write a function
CREATE FUNCTION test_row(my_table) RETURNS boolean
   LANGUAGE plpgsql COST 10000 AS
$$BEGIN
   IF /* the user is not allowed */
   THEN
      RAISE EXCEPTION ...;
   END IF;
   RETURN TRUE;
END;$$;
Then use that function in your policy:
CREATE POLICY update_policy ON my_table FOR UPDATE TO editors
USING (test_row(my_table));
I used COST 10000 for the function to tell PostgreSQL to test that condition after all other conditions, if possible.
This is not a fool-proof technique, but it will work for simple queries. What could happen in the general case is that some conditions get checked after the condition from the policy, and this could lead to errors with rows that wouldn't even be returned from the query.
But I think it is the best you can get when abusing the concept of row level security.
Now let me explain about writing to the table:
Any attempt to write a row to the table that does not satisfy the CHECK clause will cause an error as documented.
Now let's put it together:
Assuming that you define the policy like in your question:
Any INSERT will cause an error.
Any SELECT will return an empty result.
Any UPDATE or DELETE will be successful, but affect no row. This is because these operations have to read (scan) the table before they modify data, and that scan will return no rows like in the SELECT case. Since no rows are affected by the data modification, no error is thrown.
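Putting both parts together, a minimal sketch (assuming, as in the question, that my_table has a user_name column and a has_edit_privilege function returning boolean; the function name and error text below are only illustrative) could look like this:
-- illustrative helper: raise an error instead of silently filtering the row out
CREATE FUNCTION check_edit_or_raise(row_data my_table) RETURNS boolean
   LANGUAGE plpgsql COST 10000 AS
$$BEGIN
   IF NOT has_edit_privilege(row_data.user_name) THEN
      RAISE EXCEPTION 'no edit privilege on this row of my_table';
   END IF;
   RETURN TRUE;
END;$$;

CREATE POLICY update_policy ON my_table FOR UPDATE TO editors
   USING (check_edit_or_raise(my_table))
   WITH CHECK (check_edit_or_raise(my_table));
With that in place, an UPDATE touching a row the user may not edit raises the exception from the function instead of silently affecting zero rows (subject to the caveat about evaluation order above).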

Postgres: SELECT FOR UPDATE does not see new rows after lock release

Trying to support PostgreSQL DB in my application, found this strange behaviour.
Preparation:
CREATE TABLE test(id INTEGER, flag BOOLEAN);
INSERT INTO test(id, flag) VALUES (1, true);
Assume two concurrent transactions (Autocommit=false, READ_COMMITTED) TX1 and TX2:
TX1:
UPDATE test SET flag = FALSE WHERE id = 1;
INSERT INTO test(id, flag) VALUES (2, TRUE);
-- (wait, no COMMIT yet)
TX2:
SELECT id FROM test WHERE flag=true FOR UPDATE;
-- waits for TX1 to release lock
Now, if I COMMIT in TX1, the SELECT in TX2 returns empty cursor.
It is strange to me, because same experiment in Oracle and MariaDB results in selecting newly created row (id=2).
I could not find anything about this behaviour in PG documentation.
Am I missing something?
Is there any way to force PG server to "refresh" statement visibility after acquiring lock?
PS: PostgreSQL version 11.1
TX2 scans the table and tries to lock the results.
The scan sees the snapshot of the database from the start of the query, so it cannot see any rows that were inserted (or made eligible in some other way) by concurrent modifications that started after that snapshot was taken.
That is why you cannot see the row with the id 2.
For id 1, that is also true, so the scan finds that row. But the query has to wait until the lock is released. When that finally happens, it fetches the latest committed version of the row and performs the check again, so that row is excluded as well.
This “EvalPlanQual” recheck (to use PostgreSQL jargon) is only performed for rows that were found during the scan, but were locked. The second row isn't even found during the scan, so no such processing happens there.
This is a bit odd, admittedly. But it is not a bug, it is just the way PostgreSQL works.
If you want to avoid such anomalies, use the REPEATABLE READ isolation level. Then you will get a serialization error in such a case and can retry the transaction, thus avoiding inconsistencies like that.
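A sketch of what that looks like for TX2, using the tables from the question:
-- TX2 with a stricter isolation level
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT id FROM test WHERE flag = true FOR UPDATE;
-- if TX1 commits a conflicting change in the meantime, this SELECT fails with
-- a serialization error (SQLSTATE 40001) and the whole transaction can be retried
COMMIT;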

Acquiring advisory locks in postgres

I think there must be something basic I'm not understanding about advisory locking in postgres. If I enter the following commands on the psql command line client, the function returns true both times:
SELECT pg_try_advisory_lock(20); --> true
SELECT pg_try_advisory_lock(20); --> true
I was expecting that the second command should return false, since the lock should already have been acquired. Oddly, I do get the following, suggesting that the lock has been acquired twice:
SELECT pg_advisory_unlock(20); --> true
SELECT pg_advisory_unlock(20); --> true
SELECT pg_advisory_unlock(20); --> false
So I guess my question is, how does one acquire an advisory lock in a way that stops it being acquired again?
What if you try doing this from two different PostgreSQL sessions?
Check out more in the docs.
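In other words, advisory locks stack within a single session (the same session can take the same lock repeatedly, which is why you had to unlock it twice), but a second session is kept out. A quick sketch:
-- session 1
SELECT pg_try_advisory_lock(20);   -- true, lock acquired
-- session 2 (a separate connection)
SELECT pg_try_advisory_lock(20);   -- false, session 1 still holds the lock
-- session 1
SELECT pg_advisory_unlock(20);     -- true, lock released
-- session 2
SELECT pg_try_advisory_lock(20);   -- true, now it can be acquired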
My first impression of advisory locks was similar. I expected the second query (SELECT pg_try_advisory_lock(20)) to return false too (because the first one got the lock). But this query only confirmed that a lock is held on the bigint value 20. The interpretation is up to the user.
Imagine advisory locks as a table where you can store a value and take a lock on that value (normally a bigint). It does not lock any actual database object, and no transaction will be delayed. It is up to you how to interpret and use the result – and pg_try_advisory_lock itself never blocks.
I use it in my projects in the two-integer form: SELECT pg_try_advisory_lock(classid, objid), where both parameters are integers.
To make it work across more than one table, just use the OID of the table as classid and the primary key value (here 17) as objid:
SELECT pg_try_advisory_lock((SELECT 'first_table'::regclass::oid)::integer, 17);
In this example "first_table" is the name of the table and the second integer is the primary key id (here: 17).
Using a single bigint as the parameter allows a wider range of ids, but if you use the same id 17 with "second_table", it is locked as well (because you locked the number 17, not a specific row of a specific table).
It took me some time to figure that out, so hopefully it helps to understand the inner-workings of advisory locks.
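To illustrate the difference (table names are just placeholders):
-- single bigint key: the number 17 is locked globally, regardless of the table you had in mind
SELECT pg_try_advisory_lock(17);
-- two-integer key: the table's OID disambiguates the id, so id 17 of first_table
-- and id 17 of second_table are independent locks
SELECT pg_try_advisory_lock((SELECT 'first_table'::regclass::oid)::integer, 17);
SELECT pg_try_advisory_lock((SELECT 'second_table'::regclass::oid)::integer, 17);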

DB2 deadlock timeout Sqlstate: 40001, reason code 68 due to update statements called from servlet using SQL

I am calling update statements one after the other from a servlet to DB2. I am getting error SQLSTATE 40001, reason code 68, which I found is due to a deadlock timeout.
How can I resolve this issue?
Can it be resolved by setting query timeout?
If yes then how to use it with update statements in servlet or where to use it?
Reason code 68 already tells you this is due to a lock timeout (a deadlock would be reason code 2). It could be due to other users running queries at the same time that use the same data you are accessing, or to your own multiple updates.
Begin by running db2pd -db locktest -locks show detail from a db2 command line to see where the locks are. You'll then need to run something like:
select tabschema, tabname, tableid, tbspaceid
from syscat.tables where tbspaceid = # and tableid = #
filling in the # symbols with the ID number you get from the db2pd command output.
Once you see where the locks are, here are some tips:
Deadlock frequency can sometimes be reduced by ensuring that all applications access their common data in the same order – meaning, for example, that they access (and therefore lock) rows in Table A, followed by Table B, followed by Table C, and so on.
taken from: http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.trb.doc/doc/t0055074.html
recommended reading: http://www.ibm.com/developerworks/data/library/techarticle/dm-0511bond/index.html
Addendum: if your servlet or another application involved is using SELECT statements found to be part of the deadlock, you can try appending WITH UR to those SELECT statements, provided the accuracy of newly updated (or inserted) data isn't important.
For me, the solution was adding FOR READ ONLY WITH UR at the end of all my SELECT statements. (Apparently my SELECT statements were returning so much data that they locked the tables long enough to interfere with other SQL statements.)
See https://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/sqlref/src/tpc/db2z_sql_isolationclause.html
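A hedged example of what that clause looks like (schema, table, and columns are placeholders):
-- read with uncommitted-read isolation so the statement does not wait on row locks;
-- only appropriate when slightly stale or uncommitted data is acceptable
SELECT col1, col2
FROM myschema.mytable
WHERE col1 = ?
FOR READ ONLY WITH UR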

Does Firebird need manual reindexing?

I use both Firebird embedded and Firebird Server, and from time to time I need to reindex the tables using a procedure like the following:
CREATE PROCEDURE MAINTENANCE_SELECTIVITY
AS
DECLARE VARIABLE S VARCHAR(200);
BEGIN
FOR select RDB$INDEX_NAME FROM RDB$INDICES INTO :S DO
BEGIN
S = 'SET statistics INDEX ' || s || ';';
EXECUTE STATEMENT :s;
END
SUSPEND;
END
I guess this is normal using embedded, but is it really needed using a server? Is there a way to configure the server to do it automatically when required or periodically?
First, let me point out that I'm no Firebird expert, so I'm answering on the basis of how SQL Server works.
In that case, the answer is both yes, and no.
The indexes are of course updated on SQL Server, in the sense that if you insert a new row, all indexes for that table will contain that row, so it will be found. So basically, you don't need to keep reindexing the tables for that part to work. That's the "no" part.
The problem, however, is not with the index, but with the statistics. You're saying that you need to reindex the tables, but then you show code that manipulates statistics, and that's why I'm answering.
The short answer is that statistics slowly go out of whack as time goes by. They might not deteriorate to the point where they're unusable, but they will drift away from the accurate state they're in when you recreate/recalculate them. That's the "yes" part.
The main problem with stale statistics is that if the distribution of the keys in the indexes changes drastically, the statistics might not pick that up right away, and thus the query optimizer will pick the wrong indexes, based on the old, stale, statistics data it has on hand.
For instance, let's say one of your indexes has statistics that says that the keys are clumped together in one end of the value space (for instance, int-column with lots of 0's and 1's). Then you insert lots and lots of rows with values that make this index contain values spread out over the entire spectrum.
If you now do a query that uses a join from another table, on a column with low selectivity (also lots of 0's and 1's) against the table with this index of yours, the query optimizer might deduce that this index is good, since it will fetch many rows that will be used at the same time (they're on the same data page).
However, since the data has changed, it'll jump all over the index to find the relevant pieces, and thus not be so good after all.
After recalculating the statistics, the query optimizer might see that this index is sub-optimal for this query, and pick another index instead, which is more suited.
Basically, you need to recalculate the statistics periodically if your data is in flux. If your data rarely changes, you probably don't need to do it very often, but I would still add a maintenance job with some regularity that does this.
As for whether or not it is possible to ask Firebird to do it on its own, again, I'm on thin ice, but I suspect there is a way. In SQL Server you can set up maintenance jobs that do this on a schedule, and at the very least you should be able to kick off a batch file from the Windows scheduler to do something similar.
That does not reindex; it recomputes the selectivity statistics for the indexes, which the optimizer uses to pick the most suitable index. You don't need to do that unless the index size changes a lot. If you create the index before you add data, you do need to do the recalculation.
Embedded and Server should have exactly the same functionality apart from the process model.
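For a single index, the underlying statement can also be run directly; the index name here is just a placeholder:
SET STATISTICS INDEX IDX_CUSTOMER_NAME;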
I wanted to update this answer for newer Firebird. Here is the updated DSQL:
SET TERM ^ ;
CREATE OR ALTER PROCEDURE NEW_PROCEDURE
AS
DECLARE VARIABLE S VARCHAR(300);
begin
FOR select 'SET statistics INDEX ' || RDB$INDEX_NAME || ';'
FROM RDB$INDICES
WHERE RDB$INDEX_NAME <> 'PRIMARY' INTO :S
DO BEGIN
EXECUTE STATEMENT :s;
END
end^
SET TERM ; ^
GRANT EXECUTE ON PROCEDURE NEW_PROCEDURE TO SYSDBA;
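Once created, the procedure can presumably be invoked manually or from a scheduled job, e.g.:
EXECUTE PROCEDURE NEW_PROCEDURE;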