T-SQL: get distinct values from a column in the source, check the target, insert if they don't exist

I've seen several somewhat similar questions, but nothing exactly like mine. A T-SQL god should be able to answer this in a flash.
I have a source table (feeder) of employees and department codes from an HRMS system feed, plus an employees table and a departments table in my SQL Server database. I need a stored proc that first gets a distinct list of department codes from the feeder table, then checks whether those codes exist in the departments table. Any that don't should be inserted into the departments table from the feeder table. Then it should do the insert into the employees table.
Right now I have found that one of the business analysts has been getting a separate list of departments in Excel and adding them manually. That seems crazy when the data is already coming into the feeder table from HRMS. I can do the inserts, but I don't know how to loop through the feeder table to see whether the department code in each row already exists in the departments table. Thanks for any assistance.
J.

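You can use the MERGE statement in SQL Server 2008 or later. Something like this, with the source de-duplicated so that each code is inserted only once even when many feeder rows share it:
merge department as d
using (select distinct code from feeder) as f on f.code = d.code
when not matched then
insert (code)
values (f.code);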

Merge will work. Since we're just doing inserts though, this will too:
INSERT
department
(
departmentCode
)
SELECT departmentcode
FROM
(
SELECT
departmentcode
FROM
feeder
EXCEPT
SELECT
departmentcode
FROM
department
) c;
With indexes on departmentcode in both tables, this pattern usually performs well.
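Putting the pieces of the question together, here is a minimal sketch of the whole stored procedure. All names are assumptions for illustration and are not in the original post: the tables dbo.feeder, dbo.department and dbo.employee, the columns employeeId, employeeName and departmentCode, and the procedure name dbo.LoadFromFeeder. It uses NOT EXISTS instead of EXCEPT for the department step; either should work, and this is one possible arrangement rather than a definitive implementation.

```sql
-- Hypothetical names throughout: dbo.feeder, dbo.department, dbo.employee,
-- their columns, and the procedure name are assumptions for illustration.
CREATE PROCEDURE dbo.LoadFromFeeder
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;  -- any run-time error rolls back the whole transaction

    BEGIN TRANSACTION;

    -- Step 1: add department codes that are in the feed but not yet in the target.
    -- DISTINCT collapses the many feeder rows that share one department code.
    INSERT INTO dbo.department (departmentCode)
    SELECT DISTINCT f.departmentCode
    FROM dbo.feeder AS f
    WHERE f.departmentCode IS NOT NULL
      AND NOT EXISTS (SELECT 1
                      FROM dbo.department AS d
                      WHERE d.departmentCode = f.departmentCode);

    -- Step 2: add employees that have not been loaded yet.
    INSERT INTO dbo.employee (employeeId, employeeName, departmentCode)
    SELECT f.employeeId, f.employeeName, f.departmentCode
    FROM dbo.feeder AS f
    WHERE NOT EXISTS (SELECT 1
                      FROM dbo.employee AS e
                      WHERE e.employeeId = f.employeeId);

    COMMIT TRANSACTION;
END;
```

No loop over the feeder rows is needed: each step is a single set-based statement, and running the procedure twice on the same feed inserts nothing the second time.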

Related

Merging postgres data

I have data in two PostgreSQL databases that needs to be merged into one. Just to be clear, both databases have "good" data in them from a certain date that needs to be combined. This isn't merely appending the data from one into another. In other words, let's say that table foo has a serial id field. Both databases have a foo with ID=5555 and both values are valid (but different). So, the target database's foo keeps 5555 and the new record should get added with a new ID of nextval(foo_id_seq).
So, it's a big mess.
My thoughts are to create a tmp schema in the target db and to copy the needed data from the source db. Then I need to essentially "upsert" the data. New records get inserted with new ids (and foreign keys updated) and records that exist in both dbs get updated.
I don't believe there is a tool that will help me with this.
My questions.
How best to handle generating the new id? I know I could do it via selects and just leaving out the id column, but that's a lot of typing and would be slow. My thinking is to create a temporary trigger for these tables that will override the id supplied when doing an insert.
Final notes:
Both databases are offline, and I'm the only one who can get to them.
Both databases have the exact same schema.
The target database is 9.2.
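Try something like this:
-- Move the sequence past every id in both tables first, so that newly
-- generated ids cannot collide with ids that are copied over unchanged.
SELECT setval('A_seq', GREATEST((SELECT max(id) FROM A), (SELECT max(id) FROM tmp_A)));
INSERT INTO A(id, f1, f2)
SELECT nextval('A_seq'), tmp_A.f1, tmp_A.f2
FROM tmp_A
WHERE tmp_A.id IN (select A.id FROM A);
INSERT INTO A(id, f1, f2)
SELECT tmp_A.id, tmp_A.f1, tmp_A.f2
FROM tmp_A
WHERE tmp_A.id NOT IN (select A.id FROM A);
The idea: use one INSERT .. SELECT .. to insert the rows whose ids conflict with existing ids, and another INSERT .. SELECT .. to insert the rows without a conflict. Without the setval step, an id generated by the first statement could equal an id that the second statement still has to copy, and that row would then be skipped.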
Or simply generate new id for every inserted record:
INSERT INTO A(id, f1, f2)
SELECT nextval('A_seq'), tmp_A.f1, tmp_A.f2
FROM tmp_A;

SQL Server - How to find records in INSERTED when the database generates the primary key

I've never had to post a question on StackOverflow before because I can always find an answer here by just searching. Only this time, I think I've got a real stumper....
I'm writing code that automates the process of moving data from one SQL Server database to another. I have some pretty standard SQL Server databases with foreign key relationships between some of their tables. Straightforward stuff. One of my requirements is that the entire table needs to be copied in one fell swoop, without looping through rows or using a cursor. Another requirement is that I have to do this in SQL, with no SSIS or other external helpers.
For example:
INSERT INTO TargetDatabase.dbo.MasterTable
SELECT * FROM SourceDatabase.dbo.MasterTable
That's easy enough. Then, once the data from the MasterTable has been moved, I move the data of the child table.
INSERT INTO TargetDatabase.dbo.ChildTable
SELECT * FROM SourceDatabase.dbo.ChildTable
Of course, in reality I use more explicit SQL... like I specifically name all the fields and things like that, but this is just a simplified version. Anyway, so far everything's going alright, except ...
The problem is that the primary key of the master table is defined as an identity field. So, when I insert into the MasterTable, the primary key for the new table gets calculated by the database. So to deal with that, I tried using the OUTPUT INTO statement to get the updated values into a Temp table:
INSERT INTO TargetDatabase.dbo.MasterTable
OUTPUT INSERTED.* INTO #MyTempTable
SELECT * FROM SourceDatabase.dbo.MasterTable
So here's where it all falls apart. Since the database changed the primary key, how on earth do I figure out which record in the temp table matches up with the original record in the source table?
Do you see the problem? I know what the new ID is, I just don't know how to match it with the original record reliably. SQL Server lets me output the INSERTED values, but doesn't let me output the FROM table values alongside the INSERTED values. I've tried it with triggers, I've tried it with an SP, and I always have the same problem.
If I were just updating one record at a time, I could easily match up my INSERTED values with the original record I was trying to insert to see the old and new primary key values, but I have this requirement to do it in a batch.
Any Ideas?
PS: I'm not allowed to change the table structure of the target or source table.
You can use MERGE.
declare @Source table (SourceID int identity(1,2), SourceName varchar(50))
declare @Target table (TargetID int identity(2,2), TargetName varchar(50))
insert into @Source values ('Row 1'), ('Row 2')
merge @Target as T
using @Source as S
on 0=1
when not matched then
insert (TargetName) values (SourceName)
output inserted.TargetID, S.SourceID;
Result:
TargetID    SourceID
----------- -----------
2           1
4           3
Covered in this blog post by Adam Machanic: Dr. OUTPUT or: How I Learned to Stop Worrying and Love the MERGE
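Applied to the tables in the question, the same trick can capture an old-to-new key map and then use it for the child rows. This is only a sketch under assumed column names that are not in the original post: MasterTable(MasterID identity, Name) and ChildTable(ChildID identity, MasterID, Detail). The table variable @IdMap is also hypothetical. It needs SQL Server 2008 or later, since it relies on MERGE.

```sql
-- Hypothetical columns: MasterTable(MasterID identity, Name),
-- ChildTable(ChildID identity, MasterID, Detail). Adjust to the real schema.
DECLARE @IdMap TABLE (OldMasterID int PRIMARY KEY, NewMasterID int NOT NULL);

-- ON 1 = 0 never matches, so every source row takes the insert branch.
-- Unlike a plain INSERT, MERGE lets OUTPUT reference the source columns.
MERGE TargetDatabase.dbo.MasterTable AS T
USING SourceDatabase.dbo.MasterTable AS S
ON 1 = 0
WHEN NOT MATCHED THEN
    INSERT (Name) VALUES (S.Name)
OUTPUT S.MasterID, inserted.MasterID INTO @IdMap (OldMasterID, NewMasterID);

-- Copy the child rows in one statement, translating the foreign key via the map.
INSERT INTO TargetDatabase.dbo.ChildTable (MasterID, Detail)
SELECT m.NewMasterID, c.Detail
FROM SourceDatabase.dbo.ChildTable AS c
INNER JOIN @IdMap AS m
    ON m.OldMasterID = c.MasterID;
```

Neither table's structure changes, and both copies stay set-based, which seems to fit the constraints stated in the question.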
To illustrate what I mentioned in the comment:
SET IDENTITY_INSERT TargetDatabase.dbo.MasterTable ON
INSERT INTO TargetDatabase.dbo.MasterTable (IdentityColumn, OtherColumn1, OtherColumn2, ...)
SELECT IdentityColumn, OtherColumn1, OtherColumn2, ...
FROM SourceDatabase.dbo.MasterTable
SET IDENTITY_INSERT TargetDatabase.dbo.MasterTable OFF
Okay, since that didn't work for you (pre-existing values in target tables), how about adding a fixed increment (offset) to the id values in both tables (use the current max id value). Assuming the identity column is "id" in both tables:
DECLARE @incr int
BEGIN TRAN
-- ISNULL covers an empty target table, where max(id) would be NULL
SELECT @incr = ISNULL(max(id), 0)
FROM TargetDatabase.dbo.MasterTable AS m WITH (TABLOCKX, HOLDLOCK)
SET IDENTITY_INSERT TargetDatabase.dbo.MasterTable ON
INSERT INTO TargetDatabase.dbo.MasterTable (id{, othercolumns...})
SELECT id+@incr{, othercolumns...}
FROM SourceDatabase.dbo.MasterTable
SET IDENTITY_INSERT TargetDatabase.dbo.MasterTable OFF
INSERT INTO TargetDatabase.dbo.ChildTable (id{, othercolumns...})
SELECT id+@incr{, othercolumns...}
FROM SourceDatabase.dbo.ChildTable
COMMIT TRAN

Cascade new IDs to a second, related table

I have two tables, Contacts and Contacts_Detail. I am importing records into the Contacts table and need to run a SP to create a record in the Contacts_Detail table for each new record in the Contacts. There is an ID in the Contacts table and a matching ID_D in the Contacts_Detail table.
I'm using this to insert the records into Contacts_Detail, but I get the 'Subquery returned more than 1 value.' error and I can't figure out why. There are multiple records in Contacts that need to have matching records in Contacts_Detail.
Insert into Contacts_Detail (ID_D)
select id from Contacts c
left join Contacts_Detail cd
on c.id = cd.id_d
where id_d is null
I'm open to a better way...
thanks.
It sounds like you're inserting blank child records into your Contacts_Detail table, so the first question I'd ask is: why?
As for why your specific SQL isn't working, here are a few things you can check:
Contacts table: do you have any records there WHERE id is null? (Delete them, then make the id field a primary key.)
Contacts_Detail table: do you have any records there WHERE id_d is null? (Delete them, then go into your designer and create a relationship / enforce referential integrity.)
Verify that c.id is the primary key, and that cd.id_d is the correct foreign key to relate the tables.
Hope that helps.
Why not just have a trigger? This seems a little simpler than having to determine for all time which rows are missing - that seems more like something you would do periodically to correct for some anomalies, not something you should have to do after every insert. Something like this should work:
CREATE TRIGGER dbo.NewContacts
ON dbo.Contacts
FOR INSERT
AS
BEGIN
INSERT dbo.Contacts_Detail(ID_D) SELECT ID FROM inserted;
END
GO
But I suspect you have a trigger on the Contacts_Detail table that is not written to correctly handle multi-row inserts, and that's where your subquery error is coming from. Can you show the trigger on Contacts_Detail?
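For reference, this is roughly what the difference looks like. The trigger on Contacts_Detail has not been posted, so everything here is hypothetical: the trigger name, the audit table dbo.Contacts_Detail_Audit, and the idea that the trigger copies ID_D anywhere at all. It is only meant to show the shape of the problem.

```sql
-- Hypothetical example; the real trigger on Contacts_Detail is unknown.
--
-- Fragile pattern: a scalar subquery over "inserted". With a multi-row
-- insert it fails with "Subquery returned more than 1 value":
--     DECLARE @id int;
--     SET @id = (SELECT ID_D FROM inserted);
--     INSERT dbo.Contacts_Detail_Audit (ID_D) VALUES (@id);
--
-- Set-based version: treats "inserted" as a table, so one row or many
-- rows are handled by the same statement.
CREATE TRIGGER dbo.Contacts_Detail_AfterInsert
ON dbo.Contacts_Detail
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT dbo.Contacts_Detail_Audit (ID_D)
    SELECT i.ID_D
    FROM inserted AS i;
END;
```

If the existing trigger assigns a value from inserted to a variable like the commented-out lines do, rewriting it along the lines of the second form should make the multi-row insert from the question work.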

Insert into table with Identity and foreign key columns

I was trying to insert values from one table to another from two different databases.
My issue is I have two tables with a relation and the first table is having an identity column also.
eg table first(id, Name) - table second(id, address)
So now both tables exist with values in one db, and I am trying to copy those values from this db to another db.
When I insert values from the first db into the second db, the first table generates the values for the id column by itself, so I then have to link that new id to the second table.
How can I do that?
UPDATE: using MS SQL Server 2000
You can use SCOPE_IDENTITY() immediately after your insert in SQL Server 2000, which gives you the last identity value generated within the current scope, but I'm not sure how that would work when bulk inserting data.
http://msdn.microsoft.com/en-us/library/ms190315.aspx
If this were SQL Server 2005 or later I would suggest using the output clause in your insert statement to retrieve the ids just inserted, but that was not available in SQL Server 2000.
If your data contains some column or series of columns which is unique other than the identity column, then you can query your first table based on that series of columns to get the ids and use that to populate your second table.
If the target tables were empty you could use SET IDENTITY_INSERT ON, which allows you to insert the original values into identity columns, so you would not have to update the referencing ids. Of course, if any existing ids can overlap the inserted ids, that is not a solution.
If the names in the first table are unique, you could build a mapping between new and old ids and perform an update, something like this:
UPDATE S
SET S.id = F.id
FROM second S
INNER JOIN first_original FO ON FO.id = S.id
INNER JOIN first F ON F.name = FO.name
If names are not unique, then the original ids should be saved in "first" to provide a mapping between old and new ids. This can be a temporary new column that is dropped after the ids in "second" have been updated.
Or, as Rich Andrews said, you could use SCOPE_IDENTITY(), but in that case you will have to perform the inserts one by one: declare a cursor on the source table, insert each record, get its new id, and insert it into the "second" table.
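A minimal sketch of that cursor approach, written to stay within SQL Server 2000 syntax. The database names SourceDb and TargetDb, the column types, and the variable names are assumptions for illustration; the table and column names follow the question's example of first(id, Name) and second(id, address).

```sql
-- Hypothetical database names (SourceDb, TargetDb) and column types.
DECLARE @oldId int, @name varchar(100), @newId int

DECLARE src CURSOR LOCAL FAST_FORWARD FOR
    SELECT id, Name FROM SourceDb.dbo.[first]

OPEN src
FETCH NEXT FROM src INTO @oldId, @name

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Insert the parent row; the target generates a new identity value.
    INSERT INTO TargetDb.dbo.[first] (Name) VALUES (@name)
    SET @newId = SCOPE_IDENTITY()

    -- Copy this parent's child rows, pointing them at the new id.
    INSERT INTO TargetDb.dbo.[second] (id, address)
    SELECT @newId, s.address
    FROM SourceDb.dbo.[second] AS s
    WHERE s.id = @oldId

    FETCH NEXT FROM src INTO @oldId, @name
END

CLOSE src
DEALLOCATE src
```

This works row by row, so it will be slow on large tables, but it does not depend on names being unique or on adding a column to "first".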

Replace Text of a field from a different table in SQL

I have two data buckets using a cryptic naming convention.
Am I able to update the data in the fields on the main table where the record entry is equal to the primary key on the other table?
Something like this: Table1 has five columns (t1A, t1B, t1C, t1D, t1E)
and Table2 has two columns (description and Table1code).
Am I able to switch the data in Table1 with the description field in Table2?
I have tried doing a SQL UPDATE with a CASE expression, but I kept getting non-boolean errors when I ran it.
Any help would be appreciated.
You need to do an update with a join. Have a read of this http://www.bennadel.com/blog/938-Using-A-SQL-JOIN-In-A-SQL-UPDATE-Statement-Thanks-John-Eric-.htm
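As a rough sketch of what an update with a join looks like in T-SQL, assuming that column t1A of Table1 currently holds the code stored in Table2.Table1code, and that t1A is wide enough to hold the description text. Which Table1 column actually holds the code is not stated in the question, so treat t1A as a placeholder.

```sql
-- Assumption: Table1.t1A holds the code that matches Table2.Table1code.
-- Each matching row has its code replaced by the description.
UPDATE t1
SET t1.t1A = t2.description
FROM Table1 AS t1
INNER JOIN Table2 AS t2
    ON t2.Table1code = t1.t1A;
```

Rows in Table1 with no matching code in Table2 are left untouched because of the inner join. The same statement can be repeated for each of the other columns that needs translating.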