I've got a remedial question regarding sequences. I've used them a bit and consulted the docs, and am hoping that this is an easy question for the group. We're on Postgres 11.4 now, and will move to PG 12 whenever it's available on RDS.
The goal is to have a set of numbers that increase every time a row is inserted or updated. We use the field name "con_id" (concurrency ID) for this kind of counter. So the first time a row is inserted in an empty table, the value is 1, the second row gets 2, etc. Sounds like a SEQUENCE. I had a standard sequence in this role, then switched to AS IDENTITY...but realize now that was probably a mistake.
On update, the counter should keep working. So, if the first of two rows is updated, its con_id changes from 1 to 3, the current max()+1. For the record, all of our updates go through INSERT ... ON CONFLICT (id) DO UPDATE SET, not a straight UPDATE.
The point of the number series is to define start-stop bounds for various operations:
operation     last_number_processed
sync_to_domo  124556
rollup_day    123516
rollup_week   103456
Then, when it's time to perform one of these operations, finding the right chunk of records is just selecting rows with con_id from last_number_processed+1 through max(con_id). Once the operation completes, you update the operation tracker with that max(con_id).
select max(con_id) from source_data; -- Get the current highest number.
The range is then something like 124557-128923 for "sync_to_domo".
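The bookkeeping described above is simple enough to sketch. Below is a minimal Python stand-in for the workflow; the table and tracker are plain in-memory structures standing in for `source_data` and the operation-tracker table, not real SQL:

```python
# con_ids currently present in source_data
source_data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# one entry per operation, storing the highest con_id already processed
tracker = {"sync_to_domo": 4, "rollup_day": 7}

def next_chunk(operation):
    """Return the (low, high) con_id range still unprocessed for an operation."""
    low = tracker[operation] + 1   # last_number_processed + 1
    high = max(source_data)        # SELECT max(con_id) FROM source_data
    return low, high

def mark_done(operation, high):
    """After the operation completes, advance its marker to the chunk's max."""
    tracker[operation] = high

low, high = next_chunk("sync_to_domo")
rows = [c for c in source_data if low <= c <= high]  # the chunk to process
mark_done("sync_to_domo", high)
```

The same two steps (range query, then tracker update) are all a real worker process would run against the database.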
Uniqueness isn't required here, although it is desirable. Gaps don't matter at all. Keeping the numbers in sequence is essential.
This kind of on-update operation is the sort of thing that could easily become a horrific bottleneck if I botch it. Can anyone suggest a reliable, low-contention strategy for maintaining a counter that takes the table's current max value + 1 on each insert or update?
And, yes, a timestamptz can be used for this purpose, it's just another kind of number line. The reason for an integer is to match how we code other systems. It's just easier to explain and reason about this stuff when the data type remains the same across platforms. Or so it seems, in this case.
Test Code
I'm adding some test code and results to my original question. This works, but I doubt it's very efficient. The test below is a minimal version and not useful in itself; I'm just trying to determine whether I can get an ever-increasing number line across insertions and revisions. It looks okay on a quick check, but there's a lot I haven't internalized about Postgres' approach to contention, so I'm including it mostly so that people can tell me why it's terrible. Please do ;-)
The setup is a sequence that's assigned automatically on INSERT, and via a per-ROW trigger on UPDATE. That sounds less than ideal; is there a better way? It works in a single-connection test, but double-increments the counters on UPDATE. That's not a problem for me, but I don't understand why it happens.
Here's the stand-alone test code:
DROP TABLE IF EXISTS data.test_con_ids;
DROP SEQUENCE IF EXISTS data.test_con_ids_sequence;
DROP FUNCTION IF EXISTS data.test_con_ids_update;
BEGIN;
CREATE SEQUENCE data.test_con_ids_sequence
AS bigint;
CREATE TABLE data.test_con_ids (
id integer NOT NULL PRIMARY KEY,
con_id bigint NOT NULL DEFAULT nextval('data.test_con_ids_sequence'),
dts timestamptz default NOW()
);
CREATE OR REPLACE FUNCTION data.test_con_ids_update()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
NEW.con_id := nextval('data.test_con_ids_sequence');
RETURN NEW;
END
$function$;
-- It's late here, not sure if I could use a FOR EACH STATEMENT trigger here, which should be faster. If it would work.
CREATE TRIGGER test_con_ids_update_trigger BEFORE UPDATE ON data.test_con_ids
FOR EACH ROW EXECUTE PROCEDURE data.test_con_ids_update();
-- Add ten records, IDs 1-10, con_ids 1-10.
INSERT INTO data.test_con_ids (id)
VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
-- Update rows 1-5 to their current values. The trigger should increment the con_id.
INSERT INTO data.test_con_ids (id)
VALUES (1),(2),(3),(4),(5)
ON CONFLICT(id) DO UPDATE set id = EXCLUDED.id; -- Completely pointless, obviously...just a test. We always UPSERT with ON CONFLICT DO UPDATE.
COMMIT;
And here are the results:
id  con_id  dts
1   12      2019-11-02 21:52:34.333926+11
2   14      2019-11-02 21:52:34.333926+11
3   16      2019-11-02 21:52:34.333926+11
4   18      2019-11-02 21:52:34.333926+11
5   20      2019-11-02 21:52:34.333926+11
6   6       2019-11-02 21:52:34.333926+11
7   7       2019-11-02 21:52:34.333926+11
8   8       2019-11-02 21:52:34.333926+11
9   9       2019-11-02 21:52:34.333926+11
10  10      2019-11-02 21:52:34.333926+11
That works: rows 1-10 are created, rows 1-5 are updated and get their con_id values incremented. By 2 for some reason (?), but at least they're in a useful order, which is what we need.
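For what it's worth, the double increment is consistent with how INSERT ... ON CONFLICT behaves: column defaults are evaluated for the candidate row before the conflict is detected, so nextval() fires once for the proposed row and then a second time in the BEFORE UPDATE trigger. A small Python stand-in (an itertools counter playing the sequence, a dict playing the table) reproduces the exact con_ids in the results above:

```python
import itertools

seq = itertools.count(1)  # stand-in for test_con_ids_sequence
table = {}                # id -> con_id

def upsert(row_id):
    # Postgres evaluates column defaults for the candidate row first,
    # even when the row later takes the ON CONFLICT DO UPDATE path.
    candidate_con_id = next(seq)          # nextval() from the DEFAULT
    if row_id not in table:
        table[row_id] = candidate_con_id  # plain INSERT path
    else:
        # conflict: the BEFORE UPDATE trigger draws a second value
        table[row_id] = next(seq)

for i in range(1, 11):
    upsert(i)   # ids 1-10 get con_ids 1-10
for i in range(1, 6):
    upsert(i)   # re-upsert ids 1-5: con_ids become 12, 14, 16, 18, 20
```

This is only a model, but it matches the +2 steps in the observed output, which supports the defaults-then-trigger reading.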
Can anyone offer suggestions on how to get this behavior more efficiently? The goal is an ever-increasing number line for records reflecting last INSERT and UPDATE activities. And, because we use integers for this everywhere else, we're trying to stick with integers instead of timestamps. But, honestly, that's cosmetic in a lot of ways. Another reason I'm looking at SEQUENCE is that, unless I've misunderstood, it's not bound up in the transaction. That's perfect for this...we don't need a gapless number series, just a sequential one.
Postgres 12 Test
Following on Belayer's suggestion, I created a PG 12 database as an experiment. I went with the defaults, so everything is in public. (In the real world, I strip out public.) Yes, a generated column seems to work, so long as you have an immutable function. I've read about IMMUTABLE in Postgres several times....and I don't get it. So, I can't say that this function is safe. Seems like it should be. I followed the patterns used in this worthwhile piece:
https://www.2ndquadrant.com/en/blog/generated-columns-in-postgresql-12/
CREATE OR REPLACE FUNCTION public.generate_concurrency_id() RETURNS bigint
AS $$
SELECT EXTRACT(EPOCH FROM clock_timestamp())::bigint;
$$
LANGUAGE sql IMMUTABLE;
COMMENT ON FUNCTION public.generate_concurrency_id() IS 'Generate a bigint to act as a progressive change counter for a table.';
DROP TABLE IF EXISTS public.test_con_ids;
CREATE TABLE test_con_ids (
id integer NOT NULL DEFAULT NULL PRIMARY KEY,
con_id bigint GENERATED ALWAYS AS (generate_concurrency_id()) STORED,
dts timestamptz default NOW()
);
-- Add ten records, IDs 1-10.
INSERT INTO public.test_con_ids (id)
VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
-- Unless you wait here, the con_ids all look the same...this operation is way too short to tick over to another second, unless you get lucky.
-- Update rows 1-5 to their current values. The generated column should produce new con_ids.
INSERT INTO public.test_con_ids (id)
VALUES (1),(2),(3),(4),(5)
ON CONFLICT(id) DO UPDATE set id = EXCLUDED.id; -- Completely pointless, obviously...just a test. We always UPSERT with ON CONFLICT DO UPDATE.
The example above does work with the generated column, but I don't know about the performance characteristics of PG 12 calculated columns vs. triggers.
On a sadder note, I'm thinking that this might not work out for me at all. I really need a commit time stamp, which is only available through logical decoding. Here's why, through an example.
I'm summarizing data and want to grab unprocessed updated or inserted rows.
The last number I got is from a record added at 01:00:00 AM. Now I want to get all higher numbers.
Okay...I do that.
Oh wait: a previously open transaction commits, introducing timestamps/derived numbers from earlier than my marker.
A SEQUENCE doesn't really work here either because, again, the row/tuple isn't visible to my collector process until after the transaction commits.
So I think it's logical decoding, an update summary table, or bust.
This is a clear use case for sequences.
CREATE FUNCTION con_id_update() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
new.con_id := nextval(pg_get_serial_sequence(TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME, 'con_id'));
RETURN new;
END;
$$;
CREATE TABLE example (
id bigserial PRIMARY KEY,
con_id serial NOT NULL
);
CREATE TRIGGER con_id BEFORE UPDATE ON example
FOR EACH ROW EXECUTE FUNCTION con_id_update();
Or something like that.
The Sad and Terrible Truth
I've realized that I'm being clueless about transactions, and that none of these schemes can work. Not only can't this work for UPDATE, it can't work for INSERT either, which is a bit of a surprise. If I've got something wrong here, please tell me where.
As a thought experiment, imagine three concurrent, uncommitted transactions on a table where the concurrency_id number is incremented on INSERT or UPDATE: T1 inserts rows with con_ids 1-5, T2 gets 6-10, T3 gets 11-15, and the sequence's nextval stands at 16.
Now, what is the max(concurrency_id) in the table? It's meaningless as none of the transactions have committed yet. If you're using a sequence, you could grab the nextval and then know that anything with a lower number is "earlier". Okay, imagine I use max in production, what do I get? It depends both on when I ask, and what order/state the transactions are in. Here's a little truth table with outcomes for max() and nextval() under the various combinations:
Scenario  T1         T2         T3         Nextval()  Max()
1         Committed  Committed  Committed  16         15
2         Committed  Committed  Open       16         10
3         Committed  Open       Committed  16         15
4         Committed  Open       Open       16         5
5         Open       Committed  Committed  16         15
6         Open       Committed  Open       16         10
7         Open       Open       Committed  16         15
8         Open       Open       Open       16         0
The whole goal here is to lay down markers on the number line (or timeline) to bracket out already processed rows. The processed status isn't a property of the rows, as the same rows can be used in many operations. Such as multiple rollups, sync to various sources, archiving, etc.
If I want to know what to process, I need the last processed value (0/NULL at the start) and the maximum value to process. If I run the check in scenarios 1, 3, 5, or 7, I get the same max value, 15. However, the committed records at that point vary:
Scenario  T1         T2         T3         Nextval()  Max()  Values
1         Committed  Committed  Committed  16         15     1-15
3         Committed  Open       Committed  16         15     1-5 and 11-15
5         Open       Committed  Committed  16         15     6-15
7         Open       Open       Committed  16         15     11-15
In every case, the processing window tracker stores the max value, 15. No harm so far. But what happens when I next run my check and want to find all of the unprocessed rows? The next pass looks for con_ids greater than 15, i.e. starting at 16. The net result is that only scenario 1 works; in all of the others, records with earlier numbers/timestamps commit after I ran my check, so those rows are never captured. As far as I can tell, a sequence can't help here: while a sequence is not transaction-bound, the row it stamps is. And timestamps don't help, as there is no timestamp-in-actual-commit-order function available.
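The eight scenarios can be checked mechanically. Here is a small Python simulation of the thought experiment, with T1 holding con_ids 1-5, T2 holding 6-10, T3 holding 11-15, and nextval at 16, computing what max() would see under each commit state:

```python
from itertools import product

# which con_ids each (still hypothetical) transaction holds
tx_rows = {"T1": range(1, 6), "T2": range(6, 11), "T3": range(11, 16)}

def visible_max(committed):
    """max(con_id) over rows whose transaction has committed (0 if none)."""
    rows = [c for tx in committed for c in tx_rows[tx]]
    return max(rows, default=0)

# enumerate all eight commit/open combinations, as in the tables above
results = {}
for states in product(["Committed", "Open"], repeat=3):
    committed = [tx for tx, s in zip(["T1", "T2", "T3"], states)
                 if s == "Committed"]
    results[states] = visible_max(committed)
```

nextval() is 16 in every scenario, while the visible max() swings between 0 and 15 depending on which transactions happen to have committed, which is exactly the gap the tables illustrate.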
I think that the only solution (as others have told me elsewhere already) is logical decoding or replication. The difference there is that the replication stream is guaranteed to play back in commit order.
The notes above are a thought experiment(ish). If St. Dijkstra gave us anything, it's that we can, and should, use thought experiments to reason about concurrency issues. If I've mis-reasoned or overlooked something, please answer or comment.
I have an app with a maintenance menu (which is, of course, used only in very rare cases). In this menu I display the next number that a certain sequence will generate, and give the user an option to reset the sequence.
I use the following query to show the next number:
select case when last_value is not null then last_value+increment_by else start_value end
from pg_sequences
where sequencename = 'my_sequence'
And if the user changes the sequence I run:
alter sequence my_sequence restart with $NEW_NUMBER
This usually works, EXCEPT right after resetting the sequence with ALTER SEQUENCE and before any new number has been pulled from it. Then my query to find out what the next number would be shows "1", which is not necessarily correct.
What can I do to reliably determine the next number that nextval would produce if called without actually calling nextval (to not actually alter the sequence)?
Use setval() instead of alter sequence..., e.g.
select setval('my_sequence', 110)
The last sequence value written to disk (last_value of pg_sequences) can be set only by the nontransactional functions nextval() and setval(). After the sequence is restarted with alter sequence ..., the last value is not saved until it is actually used by nextval() or set by setval(). This is because alter sequence ... may be rolled back.
In PostgreSQL (9.3) I have a table defined as:
CREATE TABLE charts
( recid serial NOT NULL,
groupid text NOT NULL,
chart_number integer NOT NULL,
"timestamp" timestamp without time zone NOT NULL DEFAULT now(),
modified timestamp without time zone NOT NULL DEFAULT now(),
donotsee boolean,
CONSTRAINT pk_charts PRIMARY KEY (recid),
CONSTRAINT chart_groupid UNIQUE (groupid),
CONSTRAINT charts_ichart_key UNIQUE (chart_number)
);
CREATE TRIGGER update_modified
BEFORE UPDATE ON charts
FOR EACH ROW EXECUTE PROCEDURE update_modified();
I would like to replace the chart_number with a sequence like:
CREATE SEQUENCE charts_chartnumber_seq START 16047;
So that, by trigger or function, adding a new chart record automatically generates a new chart number in ascending order. However, no existing chart record can have its chart number changed, and over the years there have been skips in the assigned chart numbers. Hence, before assigning a new chart number to a new chart record, I need to be sure that the "new" chart number has not been used yet, and that no chart record that already has a chart number is assigned a different one.
How can this be done?
Consider not doing it. Read these related answers first:
Gap-less sequence where multiple transactions with multiple tables are involved
Compacting a sequence in PostgreSQL
If you still insist on filling in gaps, here is a rather efficient solution:
1. To avoid searching large parts of the table for the next missing chart_number, create a helper table with all current gaps once:
CREATE TABLE chart_gap AS
SELECT chart_number
FROM generate_series(1, (SELECT max(chart_number) - 1 -- max is no gap
FROM charts)) chart_number
LEFT JOIN charts c USING (chart_number)
WHERE c.chart_number IS NULL;
2. Set charts_chartnumber_seq to the current maximum and convert chart_number to an actual serial column:
SELECT setval('charts_chartnumber_seq', max(chart_number)) FROM charts;
ALTER TABLE charts
ALTER COLUMN chart_number SET NOT NULL
, ALTER COLUMN chart_number SET DEFAULT nextval('charts_chartnumber_seq');
ALTER SEQUENCE charts_chartnumber_seq OWNED BY charts.chart_number;
Details:
How to reset postgres' primary key sequence when it falls out of sync?
Safely and cleanly rename tables that use serial primary key columns in Postgres?
3. While chart_gap is not empty fetch the next chart_number from there.
To resolve possible race conditions with concurrent transactions, without making transactions wait, use advisory locks:
WITH sel AS (
SELECT chart_number, ... -- other input values
FROM chart_gap
WHERE pg_try_advisory_xact_lock(chart_number)
LIMIT 1
FOR UPDATE
)
, ins AS (
INSERT INTO charts (chart_number, ...) -- other target columns
TABLE sel
RETURNING chart_number
)
DELETE FROM chart_gap c
USING ins i
WHERE i.chart_number = c.chart_number;
Alternatively, Postgres 9.5 or later has the handy FOR UPDATE SKIP LOCKED to make this simpler and faster:
...
SELECT chart_number, ... -- other input values
FROM chart_gap
LIMIT 1
FOR UPDATE SKIP LOCKED
...
Detailed explanation:
Postgres UPDATE ... LIMIT 1
Check the result. Once all rows are filled in, this returns 0 rows affected. (you could check in plpgsql with IF NOT FOUND THEN ...). Then switch to a simple INSERT:
INSERT INTO charts (...) -- don't list chart_number
VALUES (...); -- don't provide chart_number
In PostgreSQL, a SEQUENCE ensures the two requirements you mention, that is:
No repeats
No changes once assigned
But because of how a SEQUENCE works (see manual), it can not ensure no-skips. Among others, the first two reasons that come to mind are:
How a SEQUENCE handles concurrent transactions doing INSERTs (the sequence's CACHE setting alone makes gaplessness impossible)
Also, user-triggered DELETEs are an uncontrollable aspect that a SEQUENCE cannot handle by itself.
In both cases, if you still do not want skips (and if you really know what you're doing), you should have a separate structure that assigns IDs instead of using a SEQUENCE: basically a system that keeps a list of 'assignable' IDs in a TABLE, with a function that pops out IDs in FIFO order. That should allow you to control DELETEs etc.
But again, attempt this only if you really know what you're doing! There's a reason people don't roll their own SEQUENCEs. There are hard corner cases (e.g. concurrent INSERTs), and most probably you're over-engineering a problem that can be solved in a much better, cleaner way.
Sequence numbers usually have no meaning, so why worry? But if you really want this, follow the cumbersome procedure below. Note that it is not efficient; the only efficient option is to forget about the holes and use the sequence.
In order to avoid having to scan the charts table on every insert, you should scan the table once and store the unused chart_number values in a separate table:
CREATE TABLE charts_unused_chart_number AS
SELECT seq.unused
FROM (SELECT max(chart_number) FROM charts) mx,
generate_series(1, mx.max) seq(unused)
LEFT JOIN charts ON charts.chart_number = seq.unused
WHERE charts.recid IS NULL;
The above query generates a contiguous series of numbers from 1 to the current maximum chart_number value, then LEFT JOINs the charts table to it and finds the values with no corresponding charts row, meaning that value of the series is unused as a chart_number.
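The set logic of that gap-finding query can be illustrated in a few lines of Python (illustrative numbers only; the set stands in for the chart_number column):

```python
# chart_numbers already taken in the charts table
used_chart_numbers = {1, 2, 3, 5, 8, 9, 12}

def find_gaps(used):
    """Series 1..max-1, keeping numbers with no matching chart row.

    The max itself is never a gap, matching the SQL's range of 1..max-1.
    """
    top = max(used)
    return [n for n in range(1, top) if n not in used]

gaps = find_gaps(used_chart_numbers)  # the contents of chart_gap
```

The LEFT JOIN ... WHERE ... IS NULL pattern in the SQL is exactly this `not in used` filter, applied by the database instead of in application code.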
Next you create a trigger that fires on an INSERT on the charts table. In the trigger function, pick a value from the table created in the step above:
CREATE FUNCTION pick_unused_chart_number() RETURNS trigger AS $$
BEGIN
-- Get an unused chart number
SELECT unused INTO NEW.chart_number FROM charts_unused_chart_number LIMIT 1;
-- If the table is empty, get one from the sequence
IF NOT FOUND THEN
NEW.chart_number := nextval('charts_chartnumber_seq');
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER tr_charts_cn
BEFORE INSERT ON charts
FOR EACH ROW EXECUTE PROCEDURE pick_unused_chart_number();
Easy. But the INSERT may fail because of some other trigger aborting the procedure or any other reason. So you need a check to ascertain that the chart_number was indeed inserted:
CREATE FUNCTION verify_chart_number() RETURNS trigger AS $$
BEGIN
-- If you get here, the INSERT was successful, so delete the chart_number
-- from the temporary table.
DELETE FROM charts_unused_chart_number WHERE unused = NEW.chart_number;
RETURN NULL; -- the return value of an AFTER ROW trigger is ignored
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER tr_charts_verify
AFTER INSERT ON charts
FOR EACH ROW EXECUTE PROCEDURE verify_chart_number();
At a certain point the table with unused chart numbers will be empty, whereupon you can, all in a single transaction: (1) ALTER TABLE charts to use the sequence instead of a plain integer for chart_number; (2) drop the two triggers; and (3) drop the table with unused chart numbers.
While what you want is possible, it can't be done using only a SEQUENCE and it requires an exclusive lock on the table, or a retry loop, to work.
You'll need to:
LOCK thetable IN EXCLUSIVE MODE
Find the first free ID by querying for the max id, then doing a left join over generate_series to find the first free entry, if there is one.
If there is a free entry, insert it.
If there is no free entry, call nextval and return the result.
Performance will be absolutely horrible, and transactions will be serialized. There'll be no concurrency. Also, unless the LOCK is the first thing you run that affects that table, you'll face deadlocks that cause transaction aborts.
You can make this less bad with an AFTER DELETE ... FOR EACH ROW trigger that keeps track of deleted entries by INSERTing them into a one-column table of spare IDs. Your ID assignment function (used as the column default) can then SELECT the lowest ID from that table, avoiding the explicit table lock, the left join over generate_series, and the max() call. Transactions will still serialize on a lock on the free-IDs table. In PostgreSQL you can even solve that using SELECT ... FOR UPDATE SKIP LOCKED, so if you're on 9.5 you can actually make this non-awful, though it'll still be slow.
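The spare-ID scheme above can be sketched in Python rather than SQL; a heap plays the free-IDs table, a counter plays the sequence, and all names are made up for illustration:

```python
import heapq
import itertools

free_ids = []               # min-heap: the one-column spare-IDs table
seq = itertools.count(100)  # stand-in sequence for fresh IDs

def allocate_id():
    """Reuse the lowest freed ID if any, else draw a fresh sequence value."""
    if free_ids:
        return heapq.heappop(free_ids)
    return next(seq)

def release_id(i):
    """Stand-in for the AFTER DELETE trigger: remember the ID for reuse."""
    heapq.heappush(free_ids, i)

a = allocate_id()   # no spares yet, so a fresh sequence value
release_id(42)
release_id(7)
b = allocate_id()   # lowest spare is handed out first
c = allocate_id()   # then the next spare
d = allocate_id()   # spares exhausted, back to the sequence
```

In the real database version, popping from the heap corresponds to the SELECT ... FOR UPDATE (optionally SKIP LOCKED) plus DELETE on the spare-IDs table.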
I strongly advise you to just use a SEQUENCE directly, and not bother with re-using values.
I have a simple question, suppose we have a table:
id A B
1 Jon Doe
2 Foo Bar
Is there a way to know which id will be generated next, in this case 3?
The database is PostgreSQL.
Thanks a lot!
If you want to claim an ID and return it, you can use nextval(), which advances the sequence without inserting any data.
Note that if this is a SERIAL column, you need to find the sequence's name based on the table and column name, as follows:
SELECT nextval(pg_get_serial_sequence('my_table', 'id')) AS new_id;
There is no cast-iron guarantee that you'll see these IDs come back in order (the sequence generates them in order, but multiple sessions can claim an ID and not use it yet, or roll back an INSERT and the ID will not be reused) but there is a guarantee that they will be unique, which is normally the important thing.
If you do this often without actually using the ID, you will eventually use up all the possible values of a 32-bit integer column (i.e. reach the maximum representable integer), but if you use it only when there's a high chance you will actually be inserting a row with that ID it should be OK.
To get the current value of a sequence without affecting it or needing a previous insert in the same session, you can use;
SELECT last_value FROM tablename_fieldname_seq;
Of course, getting the current value will not guarantee that the next value you'll get is actually last_value + 1 if there are other simultaneous sessions doing inserts, since another session may have taken the serial value before you.
SELECT currval('names_id_seq') + 1;
See the docs
However, of course, there's no guarantee that it's going to be your next value. What if another client grabs it before you? You can, though, reserve one of the next values for yourself by selecting a nextval from the sequence.
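To illustrate that guarantee: even with several "sessions" claiming values concurrently, every value handed out is unique, though the interleaving is unpredictable. A quick Python simulation, with a locked counter standing in for the sequence:

```python
import itertools
import threading

counter = itertools.count(1)   # stand-in for the sequence
lock = threading.Lock()
claimed = []                   # every ID any "session" reserved

def claim(n):
    # each session reserves n IDs, like repeated nextval() calls
    for _ in range(n):
        with lock:
            claimed.append(next(counter))

threads = [threading.Thread(target=claim, args=(100,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# 400 claims in total, all unique, regardless of thread scheduling
```

This mirrors why nextval() is safe to call from many connections at once: uniqueness is guaranteed even though ordering across sessions is not.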
I'm new so here's the process I use having little to no prior knowledge of how Postgres/SQL work:
Find the sequence for your table using pg_get_serial_sequence()
SELECT pg_get_serial_sequence('person','id');
This should output something like public.person_id_seq. person_id_seq is the sequence for your table.
Plug the sequence from (1) into nextval()
SELECT nextval('person_id_seq');
This will output an integer value which will be the next id added to the table.
You can turn this into a single command as mentioned in the accepted answer above
SELECT nextval(pg_get_serial_sequence('person','id'));
If you notice that the sequence is returning unexpected values, you can set the current value of the sequence using setval()
SELECT setval(pg_get_serial_sequence('person','id'),1000);
In this example, the next call to nextval() will return 1001.
I want to do a large update on a table in PostgreSQL, but I don't need the transactional integrity to be maintained across the entire operation, because I know that the column I'm changing is not going to be written to or read during the update. I want to know if there is an easy way in the psql console to make these types of operations faster.
For example, let's say I have a table called "orders" with 35 million rows, and I want to do this:
UPDATE orders SET status = null;
To avoid being diverted to an offtopic discussion, let's assume that all the values of status for the 35 million rows are currently set to the same (non-null) value, thus rendering an index useless.
The problem with this statement is that it takes a very long time to go into effect (solely because of the locking), and all changed rows are locked until the entire update is complete. This update might take 5 hours, whereas something like
UPDATE orders SET status = null WHERE (order_id > 0 and order_id < 1000000);
might take 1 minute. Over 35 million rows, doing the above in 35 chunks would take only 35 minutes, saving me 4 hours and 25 minutes.
I could break it down even further with a script (using pseudocode here):
for (i = 0 to 3500) {
    db_operation("UPDATE orders SET status = null" +
                 " WHERE order_id > " + (i * 10000) +
                 " AND order_id <= " + ((i + 1) * 10000));
}
This operation might complete in only a few minutes, rather than 35.
So that comes down to what I'm really asking. I don't want to write a freaking script to break down operations every single time I want to do a big one-time update like this. Is there a way to accomplish what I want entirely within SQL?
Column / Row
... I don't need the transactional integrity to be maintained across
the entire operation, because I know that the column I'm changing is
not going to be written to or read during the update.
Any UPDATE in PostgreSQL's MVCC model writes a new version of the whole row. If concurrent transactions change any column of the same row, time-consuming concurrency issues arise. Details in the manual. Knowing the same column won't be touched by concurrent transactions avoids some possible complications, but not others.
Index
To avoid being diverted to an offtopic discussion, let's assume that
all the values of status for the 35 million columns are currently set
to the same (non-null) value, thus rendering an index useless.
When updating the whole table (or major parts of it) Postgres never uses an index. A sequential scan is faster when all or most rows have to be read. On the contrary: Index maintenance means additional cost for the UPDATE.
Performance
For example, let's say I have a table called "orders" with 35 million
rows, and I want to do this:
UPDATE orders SET status = null;
I understand you are aiming for a more general solution (see below). But to address the actual question asked: this can be dealt with in a matter of milliseconds, regardless of table size:
ALTER TABLE orders DROP column status
, ADD column status text;
The manual (up to Postgres 10):
When a column is added with ADD COLUMN, all existing rows in the table
are initialized with the column's default value (NULL if no DEFAULT
clause is specified). If there is no DEFAULT clause, this is merely a metadata change [...]
The manual (since Postgres 11):
When a column is added with ADD COLUMN and a non-volatile DEFAULT
is specified, the default is evaluated at the time of the statement
and the result stored in the table's metadata. That value will be used
for the column for all existing rows. If no DEFAULT is specified,
NULL is used. In neither case is a rewrite of the table required.
Adding a column with a volatile DEFAULT or changing the type of an
existing column will require the entire table and its indexes to be
rewritten. [...]
And:
The DROP COLUMN form does not physically remove the column, but
simply makes it invisible to SQL operations. Subsequent insert and
update operations in the table will store a null value for the column.
Thus, dropping a column is quick but it will not immediately reduce
the on-disk size of your table, as the space occupied by the dropped
column is not reclaimed. The space will be reclaimed over time as
existing rows are updated.
Make sure you don't have objects depending on the column (foreign key constraints, indexes, views, ...); you would need to drop / recreate those. Barring that, tiny operations on the system catalog table pg_attribute do the job. It requires an exclusive lock on the table, which may be a problem under heavy concurrent load (like Buurman emphasizes in his comment). Other than that, the operation is a matter of milliseconds.
If you have a column default you want to keep, add it back in a separate command. Doing it in the same command applies it to all rows immediately. See:
Add new column without table lock?
To actually apply the default, consider doing it in batches:
Does PostgreSQL optimize adding columns with non-NULL DEFAULTs?
General solution
dblink has been mentioned in another answer. It allows access to "remote" Postgres databases in implicit separate connections. The "remote" database can be the current one, thereby achieving "autonomous transactions": what the function writes in the "remote" db is committed and can't be rolled back.
This allows a single function to update a big table in smaller parts, with each part committed separately. It avoids building up transaction overhead for very big numbers of rows and, more importantly, releases locks after each part. This lets concurrent operations proceed without much delay and makes deadlocks less likely.
If you don't have concurrent access, this is hardly useful - except to avoid ROLLBACK after an exception. Also consider SAVEPOINT for that case.
Disclaimer
First of all, lots of small transactions are actually more expensive. This only makes sense for big tables. The sweet spot depends on many factors.
If you are not sure what you are doing: a single transaction is the safe method. For this to work properly, concurrent operations on the table have to play along. For instance: concurrent writes can move a row to a partition that's supposedly already processed. Or concurrent reads can see inconsistent intermediary states. You have been warned.
Step-by-step instructions
The additional module dblink needs to be installed first:
How to use (install) dblink in PostgreSQL?
Setting up the connection with dblink very much depends on the setup of your DB cluster and security policies in place. It can be tricky. Related later answer with more how to connect with dblink:
Persistent inserts in a UDF even if the function aborts
Create a FOREIGN SERVER and a USER MAPPING as instructed there to simplify and streamline the connection (unless you have one already).
Assuming a serial PRIMARY KEY with or without some gaps.
CREATE OR REPLACE FUNCTION f_update_in_steps()
RETURNS void AS
$func$
DECLARE
_step int; -- size of step
_cur int; -- current ID (starting with minimum)
_max int; -- maximum ID
BEGIN
SELECT INTO _cur, _max min(order_id), max(order_id) FROM orders;
-- 100 slices (steps) hard coded
_step := ((_max - _cur) / 100) + 1; -- rounded, possibly a bit too small
-- +1 to avoid endless loop for 0
PERFORM dblink_connect('myserver'); -- your foreign server as instructed above
FOR i IN 0..200 LOOP -- 200 >> 100 to make sure we exceed _max
PERFORM dblink_exec(
$$UPDATE public.orders
SET status = 'foo'
WHERE order_id >= $$ || _cur || $$
AND order_id < $$ || _cur + _step || $$
AND status IS DISTINCT FROM 'foo'$$); -- avoid empty update
_cur := _cur + _step;
EXIT WHEN _cur > _max; -- stop when done (never loop till 200)
END LOOP;
PERFORM dblink_disconnect();
END
$func$ LANGUAGE plpgsql;
Call:
SELECT f_update_in_steps();
You can parameterize any part according to your needs: the table name, column name, value, ... just be sure to sanitize identifiers to avoid SQL injection:
Table name as a PostgreSQL function parameter
Avoid empty UPDATEs:
How do I (or can I) SELECT DISTINCT on multiple columns?
Postgres uses MVCC (multi-version concurrency control), thus avoiding any locking if you are the only writer; any number of concurrent readers can work on the table, and there won't be any locking.
So if it really takes 5h, it must be for a different reason (e.g. that you do have concurrent writes, contrary to your claim that you don't).
You should delegate this column to another table like this:
create table order_status (
order_id int not null references orders(order_id) primary key,
status int not null
);
Then your operation of setting status=NULL will be instant:
truncate order_status;
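With the status split out like that, reading it back is an outer join; a sketch (rows with no entry in order_status simply read as NULL):

```sql
SELECT o.order_id, s.status
FROM   orders o
LEFT   JOIN order_status s USING (order_id);
```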
First of all - are you sure that you need to update all rows?
Perhaps some of the rows already have status NULL?
If so, then:
UPDATE orders SET status = null WHERE status is not null;
As for partitioning the change - that's not possible in pure SQL: all the updates would run in a single transaction.
One possible way to do it in "pure sql" would be to install dblink, connect to the same database using dblink, and then issue a lot of updates over dblink, but it seems like overkill for such a simple task.
Usually just adding a proper WHERE clause solves the problem. If it doesn't - just partition it manually. Writing a full script is too much - you can usually do it in a simple one-liner:
perl -e '
    for (my $i = 0; $i <= 3500000; $i += 1000) {
        printf "UPDATE orders SET status = null WHERE status is not null
                AND order_id between %u and %u;\n", $i, $i + 999;
    }
'
I wrapped lines here for readability, generally it's a single line. Output of above command can be fed to psql directly:
perl -e '...' | psql -U ... -d ...
Or first to file and then to psql (in case you'd need the file later on):
perl -e '...' > updates.partitioned.sql
psql -U ... -d ... -f updates.partitioned.sql
I am by no means a DBA, but a database design where you'd frequently have to update 35 million rows might have… issues.
A simple WHERE status IS NOT NULL might speed things up quite a bit (provided you have an index on status). Not knowing the actual use case, I'm assuming that if this is run frequently, a great part of the 35 million rows might already have a NULL status.
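If NULL really is the common case, a partial index keeps that lookup proportional to the remaining work rather than the whole table (a sketch; the index name is made up):

```sql
-- Covers only the rows the UPDATE still has to touch;
-- rows already set to NULL never enter the index.
CREATE INDEX orders_status_not_null_idx ON orders (order_id)
WHERE  status IS NOT NULL;
```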
However, you can loop inside a PL/pgSQL function via the LOOP statement. I'll just cook up a small example:
CREATE OR REPLACE FUNCTION nullstatus(count INTEGER) RETURNS integer AS $$
BEGIN
    FOR i IN 0..(count/1000 + 1) LOOP
        -- >= / < so that adjacent slices don't skip the boundary rows
        UPDATE orders SET status = null
        WHERE  order_id >= i*1000 AND order_id < (i+1)*1000;
        RAISE NOTICE 'Count: % and i: %', count, i;
    END LOOP;
    RETURN 1;
END;
$$ LANGUAGE plpgsql;
It can then be run by doing something akin to:
SELECT nullstatus(35000000);
You might want to select the row count first, but beware that getting an exact row count can take a lot of time. The PostgreSQL wiki has an article about slow counting and how to avoid it.
Also, the RAISE NOTICE part is just there to keep track of how far along the script is. If you're not monitoring the notices, or do not care, it would be better to leave it out.
Are you sure this is because of locking? I don't think so, and there are many other possible reasons. To find out, you can always try doing just the locking. Try this:
BEGIN;
SELECT NOW();
SELECT * FROM orders FOR UPDATE;
SELECT NOW();
ROLLBACK;
To understand what's really happening, you should run EXPLAIN first (EXPLAIN UPDATE orders SET status ...) and/or EXPLAIN ANALYZE. Maybe you'll find out that you don't have enough memory to do the UPDATE efficiently. If so, SET work_mem TO 'xxxMB'; might be a simple solution.
Also, tail the PostgreSQL log to see whether any performance-related problems occur.
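For example (a sketch; the BEGIN/ROLLBACK matters because EXPLAIN ANALYZE actually executes the statement it is explaining):

```sql
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)   -- ANALYZE runs the UPDATE, so roll it back
UPDATE orders SET status = NULL WHERE status IS NOT NULL;
ROLLBACK;

SET work_mem TO '256MB';     -- session-local; '256MB' is just an example value
```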
I would use CTAS:
begin;
create table T as select col1, col2, ..., <new value>, colN from orders;
drop table orders;
alter table T rename to orders;
commit;
Some options that haven't been mentioned:
Use the new-table trick. Probably what you'd have to do in your case is write some triggers so that changes to the original table also get propagated to your table copy (Percona's pt-online-schema-change is an example of a tool that does it the trigger way). Another option might be the "create a new column, then replace the old one with it" trick, to avoid locks (unclear if it helps with speed).
Possibly calculate the max ID, then generate "all the queries you need" and pass them in as a single query, like update X set Y = NULL where ID < 10000 and ID >= 0; update X set Y = NULL where ID < 20000 and ID >= 10000; ... then it might not do as much locking and still be all SQL, though you do have extra logic up front to do it :(
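The query generation itself can stay in SQL; a sketch using generate_series (keeping the placeholder table X and column Y from above, with a batch size of 10000), whose output you feed back to psql:

```sql
-- Emits one UPDATE statement per 10000-id slice, up to the max ID.
SELECT format(
   'UPDATE X SET Y = NULL WHERE ID >= %s AND ID < %s;',
   lo, lo + 10000)
FROM generate_series(0, 3500000, 10000) AS lo;
```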
PostgreSQL version 11 handles this for you automatically with the Fast ALTER TABLE ADD COLUMN with a non-NULL default feature. Please do upgrade to version 11 if possible.
An explanation is provided in this blog post.
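A sketch of what becomes cheap with that feature (column name and default are made up for illustration):

```sql
-- In PostgreSQL 11+, this does not rewrite the 35M-row table: the default
-- is stored once in the catalog and applied to existing rows on read.
ALTER TABLE orders ADD COLUMN new_status text NOT NULL DEFAULT 'foo';
```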