Table lock upon SET TABLESPACE in PostgreSQL 9.1? - postgresql

What type of table lock is acquired when you move a table (or rather a partition) from one tablespace to another in PostgreSQL 9.1?
Should I execute NO INHERIT first to detach it from the master table?

That will take an ACCESS EXCLUSIVE lock on the table (and its toast table and toast index, if they exist).
It does not matter if the table inherits from another table or not.
If the table has any indexes and you want to move those too, you'll have to explicitly move them with ALTER INDEX ... SET TABLESPACE ....
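For illustration, a hedged sketch of the statements involved (the table, index, and tablespace names here are hypothetical):

-- Takes ACCESS EXCLUSIVE on the partition while its data files are copied
ALTER TABLE measurements_2011 SET TABLESPACE fast_ssd;
-- Indexes stay behind unless moved explicitly
ALTER INDEX measurements_2011_pkey SET TABLESPACE fast_ssd;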

How do we map the logical table to the tablespace?
Does Postgres automatically create tablespaces when we create a table or a schema?
Tablespaces and schemas are independent from each other: the first concerns the physical placement of the table, the second the namespace.
PostgreSQL automatically creates a tablespace: it is called pg_default and is located in the base subdirectory of the data directory.
All tables you create are placed there unless:
You use the TABLESPACE clause of CREATE TABLE to place it elsewhere.
The table is in a database whose default tablespace is different.
You set the default_tablespace parameter to a different tablespace.
Tablespaces are rarely needed, and usually it is best to keep all tables in the default tablespace.
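To illustrate the first and third points, a minimal sketch (the tablespace name and location are hypothetical):

-- Create a tablespace on another volume (the directory must already exist and be owned by the postgres user)
CREATE TABLESPACE archive_space LOCATION '/mnt/bigdisk/pgdata';
-- Place one table there explicitly
CREATE TABLE old_orders (id bigint) TABLESPACE archive_space;
-- Or make it the default for tables created later in this session
SET default_tablespace = 'archive_space';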

Will making column nullable lock the table for reads?

I want to make one field in my database nullable.
ALTER TABLE answers ALTER COLUMN author_id DROP NOT NULL;
The problem is that I have a lot of data in the database and I'm not sure if the alter command will block the table for write operations.
I know that changing the size or type of the column will cause the row exclusive lock https://www.postgresql.org/docs/current/static/explicit-locking.html#LOCKING-TABLES.
Will this ALTER lock the table for writes too, or is it safe to use?
Yes, the ALTER will require an exclusive lock on the table (which includes blocking write access to the table).
However, the lock is held only extremely briefly, and the duration does not depend on the size of the table: the change is essentially just an update to the internal system catalogs.
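Since even a brief exclusive lock has to queue behind long-running queries (and blocks new ones while it waits), it can be worth capping the wait; a minimal sketch, assuming PostgreSQL 9.3 or later for lock_timeout:

-- Abort the ALTER if the lock cannot be acquired within 2 seconds
SET lock_timeout = '2s';
ALTER TABLE answers ALTER COLUMN author_id DROP NOT NULL;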

Best practices for performing a table swap in Redshift

We're in the process of running a handful of hourly scripts on our Redshift cluster which build summary tables for data consumers. After assembling a staging table, the script then runs a transaction which deletes the existing table and replaces it with the staging table, as such:
BEGIN;
DROP TABLE IF EXISTS public.data_facts;
ALTER TABLE public.data_facts_stage RENAME TO data_facts;
COMMIT;
The problem with this operation is that long-running analysis queries will place an AccessShareLock on public.data_facts, preventing it from being dropped and thrashing our ETL cycle. I'm thinking a better solution would be one which renames the existing table, as such:
ALTER TABLE public.data_facts RENAME TO data_facts_old;
ALTER TABLE public.data_facts_stage RENAME TO data_facts;
DROP TABLE public.data_facts_old;
However, this approach presupposes that 1) public.data_facts exists, and 2) public.data_facts_old does not exist.
Do you know if there's a way to conduct this operation safely in SQL, without relying on application logic? (eg. something like ALTER TABLE IF EXISTS).
I haven't tried it, but looking at the documentation of CREATE VIEW it seems that this can be done with late-binding views.
The main idea would be a view public.data_facts that users interact with. Behind the scenes, you can load new data and then swap the view to “point” to the new table.
Bootstrap
-- load data into public.data_facts_v0
CREATE VIEW public.data_facts AS
SELECT * from public.data_facts_v0 WITH NO SCHEMA BINDING;
Update
-- load data into public.data_facts_v1
CREATE OR REPLACE VIEW public.data_facts AS
SELECT * from public.data_facts_v1 WITH NO SCHEMA BINDING;
DROP TABLE public.data_facts_v0;
The WITH NO SCHEMA BINDING means the view will be late-binding. “A late-binding view doesn't check the underlying database objects, such as tables and other views, until the view is queried.” This means the update can even introduce a table with renamed columns or a completely new structure.
Notes:
It might be a good idea to wrap the swap operations into a transaction to make sure we don't drop the previous table if the VIEW swap failed.
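A minimal sketch of that transactional swap, assuming the new table public.data_facts_v1 is already loaded:

BEGIN;
CREATE OR REPLACE VIEW public.data_facts AS
SELECT * FROM public.data_facts_v1 WITH NO SCHEMA BINDING;
DROP TABLE public.data_facts_v0;
COMMIT;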
You can add a new column defined as load_time timestamp encode runlength default getdate() to your target table, and make your ETL do this:
INSERT INTO public.data_facts
SELECT * FROM public.data_facts_staging;
DELETE FROM public.data_facts
WHERE load_time<(select max(load_time) from public.data_facts);
DROP TABLE public.data_facts_staging;
Note: public.data_facts_staging should have exactly the same structure as public.data_facts, except that the last column of public.data_facts is load_time, so that on insert it will be populated with the current timestamp.
The only implication is that it requires extra disk space for the moment between inserting the new rows and deleting the old ones, and load_time always has to be the last column. Also, you have to VACUUM the table every time you do this.
Another good thing about this is that if your ETL fails and the staging table is empty or missing, you won't lose your data. In the pure SQL scenario of swapping tables with DDL, you're not protected from dropping the target table when the staging table is missing. In the suggested scenario, if no new rows are inserted, the DELETE statement deletes nothing (no rows have a load_time less than the maximum), so the worst case is just keeping the old version of the data.
P.S. There is a command that, instead of INSERT ... SELECT ..., just changes the pointer from the staging table to the target table (ALTER TABLE ... APPEND FROM ...), but I guess it requires the same type of lock as ALTER TABLE, so I don't suggest it.
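For reference, a sketch of that command with the table names from this answer (in Redshift it cannot be run inside a transaction block, and it takes exclusive locks on both tables):

-- Moves the staging table's storage into the target instead of copying rows
ALTER TABLE public.data_facts APPEND FROM public.data_facts_staging;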

How a Table space for a query execution is chosen? - DB2

For one of my issues, I am trying to understand the functionality of the DB2 entities below:
System Temporary table space.
Page Size.
Table Space.
Buffer Pool.
And below are my observations:
There is a tablespace linked to each table in the DB2 catalog table syscat.tables.
The tablespaces are linked to buffer pools through the relation defined in syscat.tablespaces.
The system temporary tablespace is the tablespace the DB might use while executing a query.
Page size is a unit that defines the limits of a tablespace and determines how much data a tablespace can hold.
Is there something wrong in my understanding above? And when I execute a query, how does the DB choose which tablespace to use?
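The mapping in the first two observations can be read straight out of the catalog; a hedged sketch, assuming the standard SYSCAT views (schema and table name are placeholders):

SELECT t.tabschema, t.tabname,
       t.tbspace,        -- tablespace holding the table
       ts.pagesize,      -- page size of that tablespace
       bp.bpname         -- buffer pool the tablespace is bound to
FROM syscat.tables t
JOIN syscat.tablespaces ts ON ts.tbspace = t.tbspace
JOIN syscat.bufferpools bp ON bp.bufferpoolid = ts.bufferpoolid
WHERE t.tabschema = 'MYSCHEMA' AND t.tabname = 'MYTABLE';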

PostgreSQL - disabling constraints

I have a table with approx 5 million rows which has a fk constraint referencing the primary key of another table (also approx 5 million rows).
I need to delete about 75000 rows from both tables. I know that if I try doing this with the fk constraint enabled it's going to take an unacceptable amount of time.
Coming from an Oracle background, my first thought was to disable the constraint, do the delete, and then re-enable the constraint. Postgres appears to let me disable constraint triggers if I am a superuser (I'm not, but I am logging in as the user that owns/created the objects), but that doesn't seem to be quite what I want.
The other option is to drop the constraint and then reinstate it. I'm worried that rebuilding the constraint is going to take ages given the size of my tables.
Any thoughts?
edit: after Billy's encouragement I've tried doing the delete without changing any constraints, and it takes in excess of 10 minutes. However, I have discovered that the table from which I'm trying to delete has a self-referential foreign key ... duplicated (and non-indexed).
Final update: I dropped the self-referential foreign key, did my delete, and added it back in. Billy's right all round, but unfortunately I can't accept his comment as the answer!
Per previous comments, it shouldn't be a problem. That said, there is a command that may be what you're looking for - it'll set the constraints to deferred so they're checked on COMMIT, not on every delete. If you're doing just one big DELETE of all the rows, it won't make a difference, but if you're doing it in pieces, it will.
SET CONSTRAINTS ALL DEFERRED
is what you are looking for in that case. Note that constraints must be marked as DEFERRABLE before they can be deferred. For example:
ALTER TABLE table_name
ADD CONSTRAINT constraint_uk UNIQUE(column_1, column_2)
DEFERRABLE INITIALLY IMMEDIATE;
The constraint can then be deferred in a transaction or function as follows:
CREATE OR REPLACE FUNCTION f() RETURNS void AS
$BODY$
BEGIN
SET CONSTRAINTS ALL DEFERRED;
-- Code that temporarily violates the constraint...
-- UPDATE table_name ...
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
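Outside a function, the same deferral works in a plain transaction; a minimal sketch with hypothetical parent/child tables (the foreign key must have been declared DEFERRABLE as above):

BEGIN;
SET CONSTRAINTS ALL DEFERRED;
-- Parents can now be deleted before their children; the FK check waits until COMMIT
DELETE FROM parent_table WHERE obsolete;
DELETE FROM child_table WHERE parent_id NOT IN (SELECT id FROM parent_table);
COMMIT;  -- deferred constraints are verified here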
What worked for me was to disable, one by one, the triggers of the tables involved in the DELETE operation.
ALTER TABLE reference DISABLE TRIGGER ALL;
DELETE FROM reference WHERE refered_id > 1;
ALTER TABLE reference ENABLE TRIGGER ALL;
This solution works in version 9.3.16. In my case, the DELETE operations went from 45 minutes to 14 seconds.
As stated in the comments by @amphetamachine, you will need admin privileges on the tables to perform this task.
If you try DISABLE TRIGGER ALL and get an error like permission denied: "RI_ConstraintTrigger_a_16428" is a system trigger (I got this on Amazon RDS), try this:
set session_replication_role to replica;
If this succeeds, all triggers that underlie table constraints will be disabled. Now it's up to you to make sure your changes leave the DB in a consistent state!
Then when you are done, reenable triggers & constraints for your session with:
set session_replication_role to default;
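Put together, a minimal sketch of that workflow (table names and predicate are hypothetical; SET LOCAL scopes the setting to the transaction so it reverts automatically):

BEGIN;
SET LOCAL session_replication_role = replica;  -- FK enforcement triggers stop firing
DELETE FROM parent_table WHERE created < now() - interval '1 year';
DELETE FROM child_table WHERE created < now() - interval '1 year';
COMMIT;  -- it's up to you that the remaining data is consistent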
(This answer assumes your intent is to delete all of the rows of these tables, not just a selection.)
I also had to do this, but as part of a test suite. I found the answer, suggested elsewhere on SO. Use TRUNCATE TABLE as follows:
TRUNCATE TABLE <list-of-table-names> [RESTART IDENTITY] [CASCADE];
The following quickly deletes all rows from tables table1, table2, and table3, provided that there are no references to rows of these tables from tables not listed:
TRUNCATE TABLE table1, table2, table3;
As long as references are between the tables listed, PostgreSQL will delete all the rows without concern for referential integrity. If a table other than those listed references a row of one of these tables, the query will fail.
However, you can qualify the query so that it also truncates all tables with references to the listed tables (although I have not tried this):
TRUNCATE TABLE table1, table2, table3 CASCADE;
By default, the sequences of these tables do not restart numbering. New rows will continue with the next number of the sequence. To restart sequence numbering:
TRUNCATE TABLE table1, table2, table3 RESTART IDENTITY;
My PostgreSQL is 9.6.8.
set session_replication_role to replica;
works for me, but I need permission.
I log in to psql as a superuser:
sudo -u postgres psql
Then connect to my database
\c myDB
And run:
set session_replication_role to replica;
Now I can delete from tables with constraints.
-- Disable a table constraint (SQL Server syntax, not valid in PostgreSQL)
ALTER TABLE TableName NOCHECK CONSTRAINT ConstraintName
-- Re-enable the constraint
ALTER TABLE TableName CHECK CONSTRAINT ConstraintName