Copy Postgres table while maintaining primary key autoincrement

I am trying to copy a table with the Postgres command below, but the primary key autoincrement feature does not copy over. Is there any quick and simple way to accomplish this? Thanks!
CREATE TABLE table2 AS TABLE table;

Here's what I'd do:
BEGIN;
LOCK TABLE oldtable;
-- LIKE ... INCLUDING ALL copies defaults, constraints and indexes
CREATE TABLE newtable (LIKE oldtable INCLUDING ALL);
INSERT INTO newtable SELECT * FROM oldtable;
-- Advance the sequence past the highest copied id
SELECT setval('the_seq_name', (SELECT max(id) FROM oldtable)+1);
COMMIT;
... though this is a moderately unusual thing to need to do and I'd be interested in what problem you're trying to solve.
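As a side note (not part of the original answer): if the sequence name isn't known, pg_get_serial_sequence can look it up from the owning column, so the setval step could be written as follows, assuming the id column above:
-- Resolve the sequence owned by oldtable.id and advance it past the copied rows
SELECT setval(pg_get_serial_sequence('oldtable', 'id'),
              (SELECT max(id) FROM oldtable) + 1);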

Related

bigint id changed back to int during table rename

I hit the int limit on a large table I use.
The table is in single user mode and has no FK constraints.
CREATE TABLE my_table_bigint (LIKE my_table INCLUDING ALL);
ALTER TABLE my_table_bigint ALTER id DROP DEFAULT;
ALTER TABLE my_table_bigint alter column id set data type bigint;
CREATE SEQUENCE my_table_bigint_id_seq;
INSERT INTO my_table_bigint SELECT * FROM my_table;
ALTER TABLE my_table_bigint ALTER id SET DEFAULT nextval('my_table_bigint_id_seq');
ALTER SEQUENCE my_table_bigint_id_seq OWNED BY my_table_bigint.id;
SELECT setval('my_table_bigint_id_seq', (SELECT max(id) FROM my_table_bigint), true);
At this point I tested that I could insert new rows without any problems. Success, I thought.
I went about renaming the tables.
ALTER TABLE my_table RENAME TO my_table_old;
ALTER TABLE my_table_bigint RENAME TO my_table;
ALTER INDEX post_comments_pkey RENAME TO post_comments_old_pkey;
ALTER INDEX post_comments_pkey_bigint RENAME TO post_comments_pkey;
Now, when I checked the schema... the table's id type had changed BACK to integer, instead of bigint.
Copying took about 3 days, so I am really, really hoping that I don't need to do this again. This is Postgres 10 on RDS.
EDIT
I'm going to take care of this problem like this:
Create a new table - call it my_table_bigint2.
Do this:
CREATE TABLE my_table_bigint2 (LIKE my_table INCLUDING ALL);
ALTER TABLE my_table_bigint2 ALTER id DROP DEFAULT;
ALTER TABLE my_table_bigint2 alter column id set data type bigint;
CREATE SEQUENCE my_table_bigint2_id_seq;
ALTER TABLE my_table_bigint2 ALTER id SET DEFAULT nextval('my_table_bigint2_id_seq');
ALTER SEQUENCE my_table_bigint2_id_seq OWNED BY my_table_bigint2.id;
And start populating that table with the new data. (This is fine given the use case.)
In the meantime, I'm going to run
ALTER TABLE post_comments alter column id set data type bigint;
And finally, once that's done, I'm going to
INSERT INTO my_table SELECT * FROM my_table_bigint2;
My follow-up question - is this allowed? Will this create some interaction between the sequences? Should I use a new sequence?

How to remove columns for real in postgresql?

I have a large system whose table schema is updated quite often. I noticed that after repeatedly dropping and re-adding columns, the error "tables can have at most 1600 columns" appears, even though only a few columns are left in information_schema.columns.
I've tried VACUUM FULL ANALYZE, but it doesn't help. Is there any way to avoid this limitation?
DO $$
DECLARE
    tbname varchar(1024);
BEGIN
    FOR i IN 1..1599 LOOP
        tbname := 'alter table vacuum_test add column test' || CAST(i AS varchar(8)) || ' int';
        EXECUTE tbname;
    END LOOP;
END $$;
alter table vacuum_test drop column test1;
VACUUM FULL ANALYZE vacuum_test;
alter table vacuum_test add column test1 int;
result:
alter table vacuum_test add column test1 int
> ERROR: tables can have at most 1600 columns
> Time: 0.054s
Unfortunately VACUUM FULL does not remove dropped columns from the table (i.e. entries that have attisdropped = true in pg_attribute). I would have expected that, but apparently it does not happen.
The only way to get rid of the hidden columns is to create a brand new table and copy the data to the new table.
Something along the lines:
create table new_table (like old_table including all);
insert into new_table
select *
from old_table;
Then drop the old table and rename the new one to the old name. Constraint and index names will be named differently, so you might want to rename them as well.
You will have to re-create all foreign keys (incoming and outgoing) manually, as they are not included when using CREATE TABLE (LIKE ...).
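A minimal sketch of that swap, reusing the old_table/new_table names above; the constraint names here are assumptions, not something Postgres guarantees:
BEGIN;
DROP TABLE old_table;
ALTER TABLE new_table RENAME TO old_table;
-- Give the copied primary key back its original name (both names assumed)
ALTER TABLE old_table RENAME CONSTRAINT new_table_pkey TO old_table_pkey;
COMMIT;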
Another option is to use pg_repack which does this transparently in the background without locking the table.

How to do PostgreSQL Bulk INSERT without Primary Key Violation [duplicate]

This question already has an answer here:
How to bulk insert only new rows in PostreSQL
(1 answer)
Closed 8 years ago.
I'm trying to achieve database abstraction in my project, but now I got stuck with doing a bulk INSERT in PostgreSQL. My project is in C# and I'm using PostgreSQL 9.3 with npgsql.dll 2.0.14.
For Microsoft SQL Server I'm doing the bulk INSERT simply by concatenating all statements and then performing an ExecuteNonQuery:
IF NOT EXISTS (SELECT id FROM table WHERE id = 1) INSERT INTO table (id) VALUES (1);
IF NOT EXISTS (SELECT id FROM table WHERE id = 2) INSERT INTO table (id) VALUES (2);
IF NOT EXISTS (SELECT id FROM table WHERE id = 3) INSERT INTO table (id) VALUES (3);
Though the IF-NOT-EXISTS clause can be substituted in PostgreSQL by a SELECT-WHERE, this approach unfortunately still doesn't work - because every single statement in PostgreSQL is committed separately.
So I googled for another solution and found the approach of using the COPY command along with NpgsqlCopySerializer/NpgsqlCopyIn to stream the bulk data efficiently. But now I'm getting primary key violation errors all the time, because the EXISTS/WHERE clause apparently cannot be used together with the COPY statement.
I would really like to avoid doing the INSERTs one by one, as that would slow down my application considerably, so I hope that someone has already solved this issue!
Generally for this type of situation I'd have a separate staging table that does not have the PK constraint, which I'd populate using COPY (assuming the data were in a format for which it makes sense to do a COPY). Then I'd do something like:
insert into table
select a.*
from staging a
where not exists (select 1
                  from table b
                  where a.id = b.id);
That approach isn't too far off from your original design.
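Sketched end to end with hypothetical names (staging and target_table standing in for your real table), assuming id is the primary key:
-- Staging table with the same columns but no primary key constraint
CREATE TABLE staging (LIKE target_table INCLUDING DEFAULTS);
-- Bulk-load the raw data into staging, e.g. via psql's COPY ... FROM STDIN
COPY staging FROM STDIN;
-- Copy over only the rows whose ids are not already present
INSERT INTO target_table
SELECT s.*
FROM staging s
WHERE NOT EXISTS (SELECT 1 FROM target_table t WHERE t.id = s.id);
DROP TABLE staging;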
I don't entirely understand this part of your question, though, and it doesn't even seem all that relevant to the actual problem:
this approach unfortunately still doesn't work - because every single
statement in postgreSQL is committed separately.
That's not true at all, not for any RDBMS. Sure, auto-commit might be enabled on your client, but that doesn't mean Postgres commits every statement separately, or that you can't disable auto-commit. This approach would work:
begin;
insert into table (id) select 1 where not exists (select 1 from table where id = 1);
insert into table (id) select 2 where not exists (select 1 from table where id = 2);
insert into table (id) select 3 where not exists (select 1 from table where id = 3);
commit;
As you pointed out, however, if you've got more than a handful of such statements you'll quickly be hitting some performance concerns.
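If there are more than a handful of rows, one option (a sketch that is not part of the original answer, reusing the same placeholder table name) is to fold all the candidate ids into a single statement:
BEGIN;
INSERT INTO table (id)
SELECT v.id
FROM (VALUES (1), (2), (3)) AS v(id)
WHERE NOT EXISTS (SELECT 1 FROM table t WHERE t.id = v.id);
COMMIT;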

Restoring only some key values using COPY STDIN in Postgres?

I accidentally ran a query on live data that deleted 5000-odd rows. I made a backup before I did this, and the backup is in this format:
COPY table (id, "position", event) FROM stdin;
529 1 5283
648 1 6473
687 1 6853
\.
The problem is, if I run it, I get:
ERROR: duplicate key value violates unique constraint "table_pkey"
Is there a way to alter this query to insert only the rows I deleted? Something like an "if exists, ignore" kind of thing? I know this normally affects many things, but because it's literally just those entries that need to be replaced, I think something like this could work; I just don't know whether it exists.
The easiest way may be to create a copy of the original table and restore to that.
Then insert into the original table from the copy where no entry exists in the original.
e.g.
create table copy_table as select * from table where 1=2;
-- change the copy statement
COPY copy_table from stdin;
...
-- Insert to original
INSERT INTO table
SELECT ct.*
FROM copy_table ct
LEFT JOIN table t2 ON t2.id = ct.id -- assuming id is primary key
WHERE t2.id IS NULL;
No, this is unfortunately not possible using the COPY command.
You need to insert all rows into a staging table, then use insert into ... select ... where not exists (...) to copy the missing rows from the staging table into the real table.

Flip flopping data tables in Postgres

I have a table of several million records which I am running a query against and inserting the results into another table which clients will query. This process takes about 20 seconds.
How can I run this query, building this new table without impacting any of the clients that might be running queries against the target table?
For instance. I'm running
BEGIN;
DROP TABLE target_table;
SELECT blah, blahX, blahY
INTO target_table
FROM source_table
GROUP BY blahX, blahY;
COMMIT;
Which is then blocking queries to:
SELECT SUM(blah)
FROM target_table
WHERE blahX > x
In the days of working with some SQL Server DBA's I recall them creating temporary tables, and then flipping these in over the current table. Is this doable/practical in Postgres?
What you want here is to minimize the lock time, which of course is not going to happen if you include a long-running query in your transaction.
In this case, I assume you're in fact refreshing that target_table, which contains the positions of the "blah" objects, every time you run your script. Is that correct?
BEGIN;
CREATE TEMP TABLE temptable AS
SELECT blah, blahX, blahY
FROM source_table
GROUP BY blahX, blahY;
COMMIT;
BEGIN;
TRUNCATE TABLE target_table;
INSERT INTO target_table(blah,blahX,blahY)
SELECT blah,blahX,blahY FROM temptable;
DROP TABLE temptable;
COMMIT;
As mentioned in the comments, it will be faster to drop the indexes before truncating and re-create them just after loading the data, to avoid the unneeded index maintenance, as sketched below.
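For example, a sketch of the second transaction with an index rebuild; the index name and definition are assumptions, not from the original post:
BEGIN;
DROP INDEX IF EXISTS target_table_blahx_idx;
TRUNCATE TABLE target_table;
INSERT INTO target_table (blah, blahX, blahY)
SELECT blah, blahX, blahY FROM temptable;
-- Re-create the index only after the bulk load
CREATE INDEX target_table_blahx_idx ON target_table (blahX);
COMMIT;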
For the full details of what is and is not possible with PostgreSQL in that regard:
http://postgresql.1045698.n5.nabble.com/ALTER-TABLE-REPLACE-WITH-td3305036i40.html
There's ALTER TABLE ... RENAME TO ...:
ALTER TABLE name
RENAME TO new_name
Perhaps you could select into an intermediate table and then drop target_table and rename the intermediate table to target_table.
I have no idea how this would interact with any queries that may be running against target_table when you try to do the rename.
You can create a table, drop a table, and rename a table in every version of SQL I've ever used.
BEGIN;
SELECT blah, blahX, blahY
INTO new_table
FROM source_table
GROUP BY blahX, blahY;
DROP TABLE target_table;
ALTER TABLE new_table RENAME TO target_table;
COMMIT;
I'm not sure off the top of my head whether you could use a temporary table for this in PostgreSQL. PostgreSQL creates temp tables in a special schema; you don't get to pick the schema. But you might be able to create it as a temporary table, drop the existing table, and move it with SET SCHEMA.
At some point, any of these will require a table lock. (Duh.) You might be able to speed things up a lot by putting the swappable table on an SSD.