There is a set of values to update. Example: table t1 has a column c1 whose value has to be updated from 1 to x. There are around 300 such old/new value pairs available in a file, and around 15 such tables with over 100k records each.
What is the optimal way of doing this?
Approaches I can think of are:
an individual UPDATE statement for each old/new value pair in all tables
programmatically reading the file and creating dynamic UPDATE statements
using MERGE INTO syntax
In one of the tables the column is a primary key, with other tables referencing it as a foreign key.
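For the first and third approaches, one way to avoid 300 hand-written statements is to bulk-load the old/new pairs from the file into a small mapping table and join against it. A rough sketch, assuming PostgreSQL-style update ... from and placeholder names and types (merge into would be the equivalent where that syntax is preferred):

-- mapping table holding the ~300 old/new pairs from the file
create table value_map (old_val text primary key, new_val text);  -- adjust types to match c1

-- bulk-load the file into value_map (e.g. with COPY), then for each of the ~15 tables:
update t1
set    c1 = m.new_val
from   value_map m
where  t1.c1 = m.old_val;

For the table where the column is the primary key, the referencing tables need the same update applied (or their foreign keys temporarily switched to on update cascade) so the references stay consistent.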
I have about 300 tables in my Postgres (PostgreSQL 9.6.5-1) database. The tables are large, each with about 6 million records. To insert the records, I created the tables without any indexes as I have found it is substantially faster to insert without any. I did not add an ID column (primary key, auto increment, unique), either.
I now need to add indices to each table, as well as a new ID column.
To do this, I use the following commands:
CREATE INDEX IF NOT EXISTS some_table_1_index ON some_table_1 (latitude, longitude, measurement_time, level, speed, altitude);
ALTER TABLE some_table_1 ADD COLUMN id SERIAL PRIMARY KEY;
I have found that it takes between 30 and 90 seconds per command, meaning that it would take about 7.5 hours to do all my tables (assuming a worst-case scenario of 90 s per command).
Is there a faster way to alter all my tables?
I am using Python and psycopg2, if that makes any difference.
First, your commands don't create four indexes; they create two, and the first is a composite index (which may not be exactly what you want, because column order matters for whether the planner will choose to use the index).
Second, are you executing the CREATE commands serially? Could you run all 300 CREATE commands in parallel?
Pseudocode, since I don't know Python well:
tableList = ['table1', 'table2', 'table3', ...]                  # all 300 table names
createSql = 'CREATE INDEX IF NOT EXISTS {0}_index ON {0} (...)'  # {0} is the table name
[executeInThread(createSql.format(table)) for table in tableList]
We have two copies of a simple application that is based on SQLite. The application has 10 tables with a variety of relations between them. We would like to merge the two databases into a single Postgres database with the same schema. We can use Talend to facilitate this; however, the issue is that there would be duplicate keys (as both source databases are independent). Is there a systematic method by which we can insert the data into Postgres with the original keys plus an offset resulting from loading the first database?
Step 1. Restore the first database.
Step 2. Change foreign keys of all tables by adding the option on update cascade.
For example, if the column table_b.a_id refers to the column table_a.id:
alter table table_b
drop constraint table_b_a_id_fkey,
add constraint table_b_a_id_fkey
foreign key (a_id) references table_a(id)
on update cascade;
Step 3. Update primary keys of the tables by adding the desired offset, e.g.:
update table_a
set id = 10000 + id;
Step 4. Restore the second database.
If you have the possibility to edit the script containing the database schema (or to do the transfer manually with your own script), you can merge steps 1 and 2 by editing the script before the restore (adding the option on update cascade to the foreign keys in the table declarations).
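A minimal sketch of what such an edited declaration could look like, using the same example tables (column types are placeholders):

create table table_a (
    id integer primary key
);

create table table_b (
    id   integer primary key,
    a_id integer references table_a(id) on update cascade  -- cascade added in the schema script
);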
Assume I have a table named tracker with columns (issue_id, ingest_date, verb, priority).
I would like to add 50 columns to this table.
The columns being (string_ch_01, string_ch_02, ..., string_ch_50), of datatype varchar.
Is there any better way to add the columns in a single statement rather than executing the following ALTER command 50 times?
ALTER TABLE tracker ADD COLUMN string_ch_01 varchar(1020);
Yes, a better way is to issue a single ALTER TABLE with all the columns at once:
ALTER TABLE tracker
ADD COLUMN string_ch_01 varchar(1020),
ADD COLUMN string_ch_02 varchar(1020),
...
ADD COLUMN string_ch_50 varchar(1020)
;
It's especially better when there are non-null DEFAULT clauses for the new columns, since each of them would rewrite the entire table, as opposed to rewriting it only once when they're grouped in a single ALTER TABLE.
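If typing out the 50 clauses is the concern, a rough sketch (assuming PostgreSQL) that builds and executes the same single ALTER TABLE dynamically:

DO $$
BEGIN
    -- builds 'ADD COLUMN string_ch_01 varchar(1020), ..., ADD COLUMN string_ch_50 varchar(1020)'
    EXECUTE 'ALTER TABLE tracker '
        || (SELECT string_agg(format('ADD COLUMN string_ch_%s varchar(1020)',
                                     lpad(i::text, 2, '0')), ', ')
            FROM generate_series(1, 50) AS i);
END
$$;

Either way, the table is only touched by one ALTER TABLE statement.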
I am trying to insert values from one table to another across two different databases.
My issue is that I have two related tables, and the first table also has an identity column.
e.g. table first(id, Name) and table second(id, address)
So both tables exist with values in one database, and I am trying to copy the values from this database to another.
When I insert values from the first database into the second, the first table will generate values for the id column by itself, so I then have to link that id to the second table.
How can I do that?
UPDATE: using MS SQL Server 2000
You can use SCOPE_IDENTITY() immediately after your insert in SQL Server 2000, which will give you the last id within the current scope, but I'm not sure how that would work with bulk inserting of data.
http://msdn.microsoft.com/en-us/library/ms190315.aspx
If this were SQL Server 2005 or later, I would suggest using the OUTPUT clause in your INSERT statement to retrieve the ids just inserted, but that is not available in SQL Server 2000.
If your data contains some column or set of columns which is unique other than the identity column, then you can query your first table based on those columns to get the ids and use that to populate your second table.
If the target tables were empty, you could use SET IDENTITY_INSERT <table> ON - this would allow you to insert the original values into the identity columns, and you would not have to update the referenced ids. Of course, if any existing ids could overlap the inserted ids, that is not the solution.
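A rough sketch of that approach, assuming the source database is reachable as sourcedb and using the example tables first(id, Name) and second(id, address):

-- keep the original ids while copying; IDENTITY_INSERT requires an explicit column list
SET IDENTITY_INSERT first ON;
INSERT INTO first (id, Name)
SELECT id, Name FROM sourcedb.dbo.first;
SET IDENTITY_INSERT first OFF;

-- second has no identity column in the example, so it can be copied as-is
INSERT INTO second (id, address)
SELECT id, address FROM sourcedb.dbo.second;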
If the names in the first table are unique, you could build a mapping between the new and old ids and perform an update something like this:
UPDATE S
SET S.id = F.id
FROM second S
INNER JOIN first_original FO ON FO.id = S.id
INNER JOIN first F ON F.name = FO.name
If the names are not unique, then the original ids should be saved in "first" in order to provide the mapping between old and new ids. This can be a temporary new column that is deleted after the ids in "second" have been updated.
Or, as Rich Andrews said, you could use SCOPE_IDENTITY(), but in this case you will have to perform the inserts one by one - declare a cursor on the source table, insert each record, get its new id, and insert the corresponding rows into the "second" table.
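A rough sketch of that row-by-row approach (SQL Server 2000 syntax; sourcedb and the column names are assumptions based on the example tables):

DECLARE @old_id int, @name varchar(100), @new_id int;

DECLARE src CURSOR FOR
    SELECT id, Name FROM sourcedb.dbo.first;
OPEN src;
FETCH NEXT FROM src INTO @old_id, @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO first (Name) VALUES (@name);     -- a new identity value is generated here
    SET @new_id = SCOPE_IDENTITY();              -- id produced by the insert in this scope
    INSERT INTO second (id, address)             -- re-link the child rows to the new id
    SELECT @new_id, address
    FROM sourcedb.dbo.second
    WHERE id = @old_id;
    FETCH NEXT FROM src INTO @old_id, @name;
END
CLOSE src;
DEALLOCATE src;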
I have one master table. Its primary key is used as a foreign key in other tables. I cannot modify the definition of the other tables, which do not have any "on update cascade", and I want to change the primary key's values, so I have to update the other tables as well...
Currently I have written a plpgsql function,
but as I have a large amount of data to process, it is slowing down performance.
Can someone help me: how can I update multiple tables in a single query, or update multiple rows, of course with different values?
Here's one option to update multiple rows in one statement:
update mytable set
mycolumn = (case myid when 1 then 'a' when 2 then 'b' ... end)
where myid in (1, 2, ...);
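Another way to express the same multi-row update, which scales better than a long CASE when there are many values, is to join against a VALUES list (a sketch with the same placeholder names, using PostgreSQL's update ... from):

update mytable as t
set    mycolumn = v.newval
from   (values (1, 'a'),
               (2, 'b')) as v(myid, newval)
where  t.myid = v.myid;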