How to clone or copy records in same table in postgres? - postgresql

How to clone or copy records in same table in PostgreSQL by creating temporary table.
trying to create clones of records from one table to the same table with changed name(which is basically composite key in that table).

You can do it all in one INSERT combined with a SELECT.
i.e. say you have the following table definition and data populated in it:
create table original
(
id serial,
name text,
location text
);
INSERT INTO original (name, location)
VALUES ('joe', 'London'),
('james', 'Munich');
And then you can INSERT doing the kind of switch you're talking about without using a TEMP TABLE, like this:
INSERT INTO original (name, location)
SELECT 'john', location
FROM original
WHERE name = 'joe';
Here's an sqlfiddle.
This should also be faster (although for tiny data sets probably not hugely so in absolute time terms), since it's doing only one INSERT and SELECT as opposed to an extra SELECT and CREATE TABLE plus an UPDATE.

Did a bit of research, came up with a logic :
Create temp table
Copy records into it
Update the records in temp table
Copy it back to original table
CREATE TEMP TABLE temporary AS SELECT * FROM ORIGINAL WHERE NAME='joe';
UPDATE TEMP SET NAME='john' WHERE NAME='joe';
INSERT INTO ORIGINAL SELECT * FROM temporary WHERE NAME='john';
Was wondering if there was any shorter way to do it.

Related

add a column to a table which just references an existing column

Is there a way to add a column alias to an existing table, which just references another existing column in the table? such that reads and writes to the new column name will go to the existing column name. Sort of how a view in postgres can act as a read / write alias:
create view temp_order_contacts as (select * from order_emails)
This will make read / write possible to order_emails table but by calling temp_order_contacts instead.
Is there something similar but for columns?
Assuming this is for backwards compatibility; you want to rename a column, but you also want to existing queries to still work.
You can rename the table and create a view with the original name.
-- Move the existing table out of the way.
alter table some_table rename to _some_table;
-- Create a view in its place.
create view some_table as (
select
*,
-- provide a column alias
some_column as some_other_column
from _some_table
);

Delete duplicates from a huge table in Postgresql

I have an unusual problem: I need to delete duplicate records from a table in Postgresql. As i have duplicate records so i dont have primary key and unique index in this table. The table conatins like 20million records and it has duplicate records in it. While i am trying the below query it is taking too long time.
'DELETE FROM temp a using temp b where a.recordid=b.recordid and a.ctid < b.ctid;'
So what should be a better approach to handle such huge table with no index in it?
Appreciate for help.
if you have enough empty space, your can copy table without duplicates, then remove old table and rename new table
like this
INSERT INTO new_table
VALUES
SELECT
DISTINCT ON (column)
*
FROM old_table
ORDER BY column ASC
Use COPY TO to dump the table.
Then Unix sort -u to de-duplicate it.
Drop or truncate the table in Postgres, use COPY FROM to read it back in.
Add a primary key column.

Insert into table on conflict do update from csv

I have a table with two columns, id that is varchar and data that is jsonb. I also have a csv-file with new IDs that I would like to insert into the table, the data I would like to assign to these IDs are identical, and if an ID already exists I would like to update the current data value with the new data. This is what I have done so far:
INSERT INTO "table" ("id", "data")
VALUES ('[IDs from CSV file]', ' {dataObject}')
ON CONFLICT (id) do UPDATE set data='{dataObject}';
I have got it working with a single ID, but I would now like to run this for every ID in my csv-file, hence the array in the example to illustrate this. Is there a way to do this using a query? I was thinking I could create a temporary table and import the IDs there, but I am still not sure how I would utilize that table with my query.
Yes, use a staging table to upload your csv into, make sure to truncate it before uploading. After uploading:
insert into prod_table
select * from csv_upload
on conflict (id) do update
set data = excluded.data;
Don't complicate the process unnecessarily.
Import csv to a temporary table T2
Update T1 where rows match in T2
Insert into T1 from T2 where rows do not match

SQL Server - How to find a records in INSERTED when the database generates a primary key

I've never had to post a question on StackOverflow before because I can always find an answer here by just searching. Only this time, I think I've got a real stumper....
I'm writing code that automates the process of moving data from one SQL Server database to another. I have some pretty standard SQL Server Databases with foreign key relationships between some of their tables. Straight forward stuff. One of my requirements is that the entire table needs to be copied in one fell swoop, without looping through rows or using a cursor. Another requirement is I have to do this in SQL, no SSIS or other external helpers.
For example:
INSERT INTO TargetDatabase.dbo.MasterTable
SELECT * FROM SourceDatabase.dbo.MasterTable
That's easy enough. Then, once the data from the MasterTable has been moved, I move the data of the child table.
INSERT INTO TargetDatabase.dbo.ChildTable
SELECT * FROM SourceDatabase.dbo.ChildTable
Of course, in reality I use more explicit SQL... like I specifically name all the fields and things like that, but this is just a simplified version. Anyway, so far everything's going alright, except ...
The problem is that the primary key of the master table is defined as an identity field. So, when I insert into the MasterTable, the primary key for the new table gets calculated by the database. So to deal with that, I tried using the OUTPUT INTO statement to get the updated values into a Temp table:
INSERT INTO TargetDatabase.dbo.MasterTable
OUPUT INSERTED.* INTO #MyTempTable
SELECT * FROM SourceDatabase.dbo.MasterTable
So here's where it all falls apart. Since the database changed the primary key, how on earth do I figure out which record in the temp table matches up with the original record in the source table?
Do you see the problem? I know what the new ID is, I just don't know how to match it with the original record reliably. The SQL server lets me output the INSERTED values, but doesn't let me output the FROM TABLE values along side the INSERTED values. I've tried it with triggers, I've tried it with an SP, always I have the same problem.
If I were just updating one record at a time, I could easily match up my INSERTED values with the original record I was trying to insert to see the old and new primary key values, but I have this requirement to do it in a batch.
Any Ideas?
PS: I'm not allowed to change the table structure of the target or source table.
You can use MERGE.
declare #Source table (SourceID int identity(1,2), SourceName varchar(50))
declare #Target table (TargetID int identity(2,2), TargetName varchar(50))
insert into #Source values ('Row 1'), ('Row 2')
merge #Target as T
using #Source as S
on 0=1
when not matched then
insert (TargetName) values (SourceName)
output inserted.TargetID, S.SourceID;
Result:
TargetID SourceID
----------- -----------
2 1
4 3
Covered in this blog post by Adam Machanic: Dr. OUTPUT or: How I Learned to Stop Worrying and Love the MERGE
To illustrate what I mentioned in the comment:
SET IDENTITY_INSERT TargetDatabase.dbo.MasterTable ON
INSERT INTO TargetDatabase.dbo.MasterTable (IdentityColumn, OtherColumn1, OtherColumn2, ...)
SELECT IdentityColumn, OtherColumn1, OtherColumn2, ...
FROM SourceDatabase.dbo.MasterTable
SET IDENTITY_INSERT TargetDatabase.dbo.MasterTable OFF
Okay, since that didn't work for you (pre-existing values in target tables), how about adding a fixed increment (offset) to the id values in both tables (use the current max id value). Assuming the identity column is "id" in both tables:
DECLARE #incr int
BEGIN TRAN
SELECT #incr = max(id)
FROM TargetDatabase.dbo.MasterTable AS m WITH (TABLOCKX, HOLDLOCK)
SET IDENTITY_INSERT TargetDatabase.dbo.MasterTable ON
INSERT INTO TargetDatabase.dbo.MasterTable (id{, othercolumns...})
SELECT id+#incr{, othercolumns...}
FROM SourceDatabase.dbo.MasterTable
SET IDENTITY_INSERT TargetDatabase.dbo.MasterTable OFF
INSERT INTO TargetDatabase.dbo.ChildTable (id{, othercolumns...})
SELECT id+#incr{, othercolumns...}
FROM SourceDatabase.dbo.ChildTable
COMMIT TRAN

Is there a way to quickly duplicate record in T-SQL?

I need to duplicate selected rows with all the fields exactly same except ID ident int which is added automatically by SQL.
What is the best way to duplicate/clone record or records (up to 50)?
Is there any T-SQL functionality in MS SQL 2008 or do I need to select insert in stored procedures ?
The only way to accomplish what you want is by using Insert statements which enumerate every column except the identity column.
You can of course select multiple rows to be duplicated by using a Select statement in your Insert statements. However, I would assume that this will violate your business key (your other unique constraint on the table other than the surrogate key which you have right?) and require some other column to be altered as well.
Insert MyTable( ...
Select ...
From MyTable
Where ....
If it is a pure copy (minus the ID field) then the following will work (replace 'NameOfExistingTable' with the table you want to duplicate the rows from and optionally use the Where clause to limit the data that you wish to duplicate):
SELECT *
INTO #TempImportRowsTable
FROM (
SELECT *
FROM [NameOfExistingTable]
-- WHERE ID = 1
) AS createTable
-- If needed make other alterations to the temp table here
ALTER TABLE #TempImportRowsTable DROP COLUMN Id
INSERT INTO [NameOfExistingTable]
SELECT * FROM #TempImportRowsTable
DROP TABLE #TempImportRowsTable
If you're able to check the duplication condition as rows are inserted, you could put an INSERT trigger on the table. This would allow you to check the columns as they are inserted instead of having to select over the entire table.