Change column type VARCHAR to TEXT in PostgreSQL without locking the table

I have a parent table partitioned by year, with a lot of columns, and now I need to change a column from VARCHAR(32) to TEXT because we need more length flexibility.
So I will alter the parent table, which will also change all partitions.
But the table has 2 unique indexes that include this column, and also 1 regular index.
This query locks the table:
ALTER TABLE my_schema.my_table
ALTER COLUMN column_need_change TYPE VARCHAR(64) USING
column_need_change :: VARCHAR(64);
So does this one:
ALTER TABLE my_schema.my_table
ALTER COLUMN column_need_change TYPE TEXT USING column_need_change :: TEXT;
I have seen this solution:
UPDATE pg_attribute SET atttypmod = 64+4
WHERE attrelid = 'my_schema.my_table'::regclass
AND attname = 'column_need_change';
But I dislike this solution.
How can I change a VARCHAR(32) column to TEXT without locking the table? I need to keep pushing data into the table while the change runs.
My PostgreSQL version: 9.6
EDIT:
This is the solution I ended up taking:
ALTER TABLE my_schema.my_table
ALTER COLUMN column_need_change TYPE TEXT USING column_need_change :: TEXT;
The query locked my table for 1m 52s 548ms with 2.6 million rows, but that's fine.

The supported and safe variant is to use ALTER TABLE. This will not rewrite the table, since varchar and text have the same on-disk representation, so it will be done in a split second once the ACCESS EXCLUSIVE table lock is obtained.
Provided that your transactions are short, you will only experience a short stall while ALTER TABLE waits for all prior transactions to finish.
Messing with the system catalogs is dangerous, and you do so at your own risk.
You might get away with
UPDATE pg_attribute
SET atttypmod = -1,
atttypid = 25
WHERE attrelid = 'my_schema.my_table'::regclass
AND attname = 'column_need_change';
But if it breaks something, you get to keep the pieces…
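If the concern with the supported ALTER TABLE route is an unbounded stall while it queues for the ACCESS EXCLUSIVE lock, one option is to cap the wait with lock_timeout (available since PostgreSQL 9.3). A minimal sketch, assuming a 5 second limit is acceptable; the value is only an illustration:

```sql
BEGIN;
-- Give up instead of queueing indefinitely behind long-running transactions.
-- SET LOCAL limits the setting to this transaction.
SET LOCAL lock_timeout = '5s';
ALTER TABLE my_schema.my_table
    ALTER COLUMN column_need_change TYPE text;
COMMIT;
```

If the timeout fires, the ALTER TABLE fails with an error, nothing is changed, and the statement can simply be retried later.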

Related

How to convert PostgreSQL 12 generated column to a normal column?

I have a generated column in PostgreSQL 12 defined as
create table people (
id bigserial primary key,
a varchar,
b boolean generated always as (a is not null) stored
);
But now I want column b to be settable, and I don't want to lose the data already in the column. I could drop the column and recreate it, but that would lose the current data.
Thanks in advance.
You can run several ALTER TABLE statements in a transaction:
BEGIN;
ALTER TABLE people ADD b_new boolean;
UPDATE people SET b_new = b;
ALTER TABLE people DROP b;
ALTER TABLE people RENAME b_new TO b;
COMMIT;
alter table people add column temp_data boolean;
update people set temp_data = b; -- copy data from column b to temp_data
Do whatever you want with column "b" (for example, drop it and re-create it as a normal column).
update people set b = temp_data; -- move data back
alter table people drop column temp_data; -- optional

Does ALTER COLUMN TYPE varchar(N) rewrite the table in Postgres 9.6?

In the past
The way we handled this with Postgres 8.4 was to manually update the pg_attribute table:
LOCK TABLE pg_attribute IN EXCLUSIVE MODE;
UPDATE pg_attribute SET atttypmod = 104
WHERE attrelid = 'table_name'::regclass AND
attname = 'column_name';
column_name was a varchar(50) and we wanted a varchar(100), but the table was too enormous (tens of millions of rows) and too heavily used to rewrite.
Nowadays
The content and answers around this topic were sparse and outdated for such a (at least anecdotally) common problem.
But, after seeing hints that this might be the case on at least 3 discussions, I've come to think that with newer versions of Postgres (we're on 9.6), you are now able to run the following:
ALTER TABLE table_name ALTER COLUMN column_name TYPE varchar(100);
...without rewriting the table.
Is this correct?
If so, do you know where some definitive info on the topic exists in the Postgres docs?
That ALTER TABLE will not require a rewrite.
The documentation says:
Adding a column with a DEFAULT clause or changing the type of an existing column will require the entire table and its indexes to be rewritten.
It is very simple to test:
Try with an empty table and see if the relfilenode column in the pg_class row for the table changes:
SELECT relfilenode FROM pg_class
WHERE relname = 'table_name';
Reading on in the documentation, you see:
As an exception, when changing the type of an existing column, if the USING clause does not change the column contents and the old type is either binary coercible to the new type or an unconstrained domain over the new type, a table rewrite is not needed; but any indexes on the affected columns must still be rebuilt.
Since varchar(50) is clearly binary coercible to varchar(100), your case will not require a table rewrite, as the above test should confirm.
According to What's new in PostgreSQL 9.2, the above answer seemed strange to me (the accepted answer has since been edited to align with the following):
Reduce ALTER TABLE rewrites
A table won't get rewritten anymore during
an ALTER TABLE when changing the type of a column in the following
cases:
varchar(x) to varchar(y) when y>=x. It works too if going from
varchar(x) to varchar or text (no size limitation)
I tested with Postgres 10.4, and the relfilenode remained the same after running alter table ... alter column ... type varchar(50):
create table aaa (field varchar(10));
insert into aaa select f from generate_series(1,1e6) f;
commit;
SELECT oid, relfilenode FROM pg_class WHERE relname = 'aaa';
alter table aaa alter column field type varchar(50);
commit;
SELECT oid, relfilenode FROM pg_class WHERE relname = 'aaa';
I'm not sure why you got a different relfilenode in 9.6 (or I'm missing something...).

PostgreSQL ADD COLUMN DEFAULT NULL locks and performance

I have a table in my PostgreSQL 9.6 database with 3 million rows. This table already has a null bitmap (it has 2 other DEFAULT NULL fields). I want to add a new nullable boolean column to this table. I am stuck on the difference between these two statements:
ALTER TABLE my_table ADD COLUMN my_column BOOLEAN;
ALTER TABLE my_table ADD COLUMN my_column BOOLEAN DEFAULT NULL;
I think that there is no difference between these statements, but:
I can't find any proof of it in the documentation. The documentation says that providing a DEFAULT value for the new column makes PostgreSQL rewrite all the tuples, but I don't think that is true in this case, because the default value is NULL.
I ran some tests on a copy of this table, and the first statement (without DEFAULT NULL) took a little more time than the second. I can't understand why.
My questions are:
Will PostgreSQL use the same lock type (ACCESS EXCLUSIVE) for those two statements?
Will PostgreSQL rewrite all tuples to add a NULL value to each of them if I use DEFAULT NULL?
Is there any difference between those two statements?
There's an issue with point 2 of Vao Tsun's answer.
If you use ALTER TABLE my_table ADD COLUMN my_column BOOLEAN; it won't rewrite all the tuples; it will just be a change in the metadata.
But if you use ALTER TABLE my_table ADD COLUMN my_column BOOLEAN DEFAULT NULL, it will rewrite all the tuples, and it will take a very long time on large tables.
The documentation itself says this:
When a column is added with ADD COLUMN, all existing rows in the table are initialized with the column's default value (NULL if no DEFAULT clause is specified). If there is no DEFAULT clause, this is merely a metadata change and does not require any immediate update of the table's data; the added NULL values are supplied on readout, instead.
This tells us that if there is a DEFAULT clause, even if it is NULL, it will rewrite all the tuples.
This is due to a performance concern with updates: if you need to update a tuple that was not rewritten, the tuple has to be moved to another location on disk, which takes more time.
I tested this myself on PostgreSQL 9.6 when I had to add a column to a table that had 300+ million tuples. Without DEFAULT NULL it took 11 ms, and with DEFAULT NULL it took more than 30 minutes.
https://www.postgresql.org/docs/current/static/sql-altertable.html
Yes - the same ACCESS EXCLUSIVE lock; no exception is mentioned for DEFAULT NULL or for no DEFAULT (statistics, "options", constraints and cluster would require a less strict lock, I think, but not add column)
Note that the lock level required may differ for each subform. An
ACCESS EXCLUSIVE lock is held unless explicitly noted. When multiple
subcommands are listed, the lock held will be the strictest one
required from any subcommand.
No - it will rather append NULL to result on select
When a column is added with ADD COLUMN, all existing rows in the table
are initialized with the column's default value (NULL if no DEFAULT
clause is specified). If there is no DEFAULT clause, this is merely a
metadata change and does not require any immediate update of the
table's data; the added NULL values are supplied on readout, instead.
No - no difference AFAIK. Just metadata change in both cases (as I believe it is one case expressed with different semantics)
Edit - Demo:
db=# create table so(i int);
CREATE TABLE
Time: 9.498 ms
db=# insert into so select generate_series(1,10*1000*1000);
INSERT 0 10000000
Time: 13899.190 ms
db=# alter table so add column nd BOOLEAN;
ALTER TABLE
Time: 1025.178 ms
db=# alter table so add column dn BOOLEAN default null;
ALTER TABLE
Time: 13.849 ms
db=# alter table so add column dnn BOOLEAN default true;
ALTER TABLE
Time: 14988.450 ms
db=# select version();
version
----------------------------------------------------------------------------------------------------------------
PostgreSQL 9.6.1 on x86_64-apple-darwin15.6.0, compiled by Apple LLVM version 8.0.0 (clang-800.0.42.1), 64-bit
(1 row)
Lastly, to avoid speculation that this is data type specific:
db=# alter table so add column t text;
ALTER TABLE
Time: 25.831 ms
db=# alter table so add column tn text default null;
ALTER TABLE
Time: 13.798 ms
db=# alter table so add column tnn text default 'null';
ALTER TABLE
Time: 15440.318 ms
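Timings alone can be noisy, so as a cross-check a rewrite can also be detected by comparing the table's relfilenode before and after the statement. A small sketch against the same so table; the column name chk is made up for illustration:

```sql
-- Note the current on-disk file identifier of the table.
SELECT relfilenode FROM pg_class WHERE oid = 'so'::regclass;

ALTER TABLE so ADD COLUMN chk boolean DEFAULT NULL;

-- Same value as before: metadata-only change.
-- Different value: the table was rewritten into a new file.
SELECT relfilenode FROM pg_class WHERE oid = 'so'::regclass;
```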

Postgres alter field type from float4 to float8 on huge table

I want to change a column's data type from float4 to float8 on a table with a huge row count. If I do it the usual way, it takes a long time and my table is blocked for the duration.
Is there any hack to do it without rewriting the table content?
ALTER TABLE ... ALTER COLUMN ... TYPE ... USING ... (or related things like ALTER TABLE ... ADD COLUMN ... DEFAULT ... NOT NULL) requires a full table rewrite with an exclusive lock.
You can, with a bit of effort, work around this in steps:
ALTER TABLE thetable ADD COLUMN thecol_tmp newtype, without NOT NULL.
Create a trigger on the table that, for every write to thecol, updates thecol_tmp as well, so new rows that are created, and rows that are updated, get a value for thecol_tmp as well as thecol.
In batches by ID range, UPDATE thetable SET thecol_tmp = CAST(thecol AS newtype) WHERE id BETWEEN .. AND ..
Once all values are populated in thecol_tmp, ALTER TABLE thetable ALTER COLUMN thecol_tmp SET NOT NULL;.
Now swap the columns and drop the trigger in a single tx:
BEGIN;
ALTER TABLE thetable DROP COLUMN thecol;
ALTER TABLE thetable RENAME COLUMN thecol_tmp TO thecol;
DROP TRIGGER whatever_trigger_name ON thetable;
COMMIT;
Ideally we'd have an ALTER TABLE ... ALTER COLUMN ... CONCURRENTLY that did this within PostgreSQL, but nobody's implemented that. Yet.
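The add-column, trigger and backfill steps above could look roughly like the following sketch. It assumes newtype is float8, that the table has an id column to batch on, and it uses a made-up function name (thetable_sync_thecol_tmp); the batch bounds are placeholders to be repeated over the real id range.

```sql
-- New nullable column of the target type (metadata-only change).
ALTER TABLE thetable ADD COLUMN thecol_tmp float8;

-- Hypothetical trigger function: keep thecol_tmp in sync on every write.
CREATE FUNCTION thetable_sync_thecol_tmp() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    NEW.thecol_tmp := NEW.thecol::float8;
    RETURN NEW;
END;
$$;

CREATE TRIGGER whatever_trigger_name
    BEFORE INSERT OR UPDATE ON thetable
    FOR EACH ROW EXECUTE PROCEDURE thetable_sync_thecol_tmp();

-- Backfill one batch; repeat with successive id ranges,
-- each in its own short transaction.
UPDATE thetable
SET thecol_tmp = thecol::float8
WHERE id BETWEEN 1 AND 100000;
```

Dropping the trigger in the final swap does not drop the function, so a DROP FUNCTION thetable_sync_thecol_tmp(); afterwards cleans that up.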

Alter column set not null fails

Consider the following table with approximately 10M rows
CREATE TABLE user
(
id bigint NOT NULL,
...
CONSTRAINT user_pk PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
)
Then I applied the following statements:
ALTER TABLE USER ADD COLUMN BUSINESS_ID VARCHAR2(50);
--OK
UPDATE USER SET BUSINESS_ID = ID; -- ~1500 sec
--OK
ALTER TABLE USER ALTER COLUMN BUSINESS_ID SET NOT NULL;
ERROR: column "business_id" contains null values
SQL state: 23502
This is very strange, since the id column (which has been copied to the business_id column) can't contain null values because it is the primary key, but to be sure I checked:
select count(*) from USER where BUSINESS_ID is null;
--0 records
I suspect that this is a bug, but I wonder whether I am missing something trivial.
The only logical explanation would be a concurrent INSERT.
(Using tbl instead of the reserved word user as table name.)
ALTER TABLE tbl ADD COLUMN BUSINESS_ID VARCHAR2(50);
--OK
UPDATE tbl SET BUSINESS_ID = ID; -- ~1500 sec
--OK
-- concurrent INSERT HERE !!!
ALTER TABLE tbl ALTER COLUMN BUSINESS_ID SET NOT NULL;
To prevent this, use instead:
ALTER TABLE tbl
ADD COLUMN BUSINESS_ID VARCHAR(50) DEFAULT ''; -- or whatever is appropriate
...
You may end up with a default value in some rows. You might want to check.
Or run everything as transaction block:
BEGIN;
-- LOCK tbl; -- not needed
ALTER ...
UPDATE ...
ALTER ...
COMMIT;
You might take an exclusive lock to be sure, but ALTER TABLE .. ADD COLUMN takes an ACCESS EXCLUSIVE lock anyway. (Which is only released at the end of the transaction, like all locks.)
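Spelled out with the statements from the question, the transaction block might look like this sketch. It substitutes varchar(50) for VARCHAR2(50), and the explicit cast to text is my addition:

```sql
BEGIN;
-- ACCESS EXCLUSIVE lock is taken here and held until COMMIT,
-- so no concurrent INSERT can slip in between the steps.
ALTER TABLE tbl ADD COLUMN business_id varchar(50);
UPDATE tbl SET business_id = id::text;
ALTER TABLE tbl ALTER COLUMN business_id SET NOT NULL;
COMMIT;
```

Keep in mind that the table stays locked against all other access for the whole duration of the UPDATE.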
Maybe it wants a default value? Postgresql docs on ALTER:
To add a column, use a command like this:
ALTER TABLE products ADD COLUMN description text;
The new column is initially filled with whatever default value is given (null if you don't specify a DEFAULT clause).
So,
ALTER TABLE USER ALTER COLUMN BUSINESS_ID SET DEFAULT '',
ALTER COLUMN BUSINESS_ID SET NOT NULL;
You cannot do that in the same transaction. Add your column and update it. Then, in a separate transaction, set the not null constraint.