Converting Traditional IF EXIST UPDATE ELSE INSERT into MERGE is not working? - tsql

I am going to use MERGE to insert or update a table depending upon ehether it's exist or not. This is my query,
declare #t table
(
id int,
name varchar(10)
)
insert into #t values(1,'a')
MERGE INTO #t t1
USING (SELECT id FROM #t WHERE ID = 2) t2 ON (t1.id = t2.id)
WHEN MATCHED THEN
UPDATE SET name = 'd', id = 3
WHEN NOT MATCHED THEN
INSERT (id, name)
VALUES (2, 'b');
select * from #t;
The result is,
id name
1 a
I think it should be,
id name
1 a
2 b

You have your USING part slightly messed up, that's where to put what you want to match against (although in this case you're only using id)
declare #t table
(
id int,
name varchar(10)
)
insert into #t values(1,'a')
MERGE INTO #t t1
USING (SELECT 2, 'b') AS t2 (id, name) ON (t1.id = t2.id)
WHEN MATCHED THEN
UPDATE SET name = 'd', id = 3
WHEN NOT MATCHED THEN
INSERT (id, name)
VALUES (2, 'b');
select * from #t;

As Mikhail pointed out, your query in the USING clause doesn't contain any rows.
If you want to do an upsert, put the new data into the USING clause:
MERGE INTO #t t1
USING (SELECT 2 as id, 'b' as name) t2 ON (t1.id = t2.id) --This no longer has an artificial dependency on #t
WHEN MATCHED THEN
UPDATE SET name = t2.name
WHEN NOT MATCHED THEN
INSERT (id, name)
VALUES (t2.id, t2.name);

This query won't return anything:
SELECT id FROM #t WHERE ID = 2
Because where is no rows in table with ID = 2, so there is nothing to merge into table.
Besides, in MATCHED clause you are updating a field ID on which you are joining table, i think, it's forbidden.

For each DML operations you have to commit (Marks the end of a successful the transaction)Then only you will be able to see the latest data
For example :
GO
BEGIN TRANSACTION;
GO
DELETE FROM HumanResources.JobCandidate
WHERE JobCandidateID = 13;
GO
COMMIT TRANSACTION;
GO

Related

Using cte with update in db2

I have set of queries which are getting repeated in update query for db2
I need to get rid of these getting repeated
So what best I can do. Cte is not working in update query in db2.
Not able to find solution. For select query cte is working but not for update
You may use so called "SELECT from data-change-table-reference" Db2 functionality using CTE.
CREATE TABLE MYTAB (ID INT, DATA INT);
INSERT INTO MYTAB (ID, DATA) VALUES (1, NULL);
WITH
UPD (ID, DATA) AS (VALUES (1, 1), (2, 2))
SELECT COUNT (1)
FROM NEW TABLE
(
UPDATE MYTAB T
SET DATA = U.DATA
FROM UPD U
WHERE T.ID = U.ID
);
SELECT * FROM MYTAB;
ID
DATA
1
1
fiddle

The difference between PRINT ##ROWCOUNT and OUTPUT $ACTION in sql server

Apologies if my question seems to be naive:
I cannot get my head around the 2 statements below, can someone please explain the difference:
OUTPUT $ACTION, INSERTED.BuildRequestID, ..... and
PRINT ##ROWCOUNT
Apparently, they both can be used to print something on the window, with output in the example above, the records that have been inserted will be displayed. And, PRINT ##ROWCOUNT returns the number of rows affected by the last executed statement in the batch, so, if the function was insert, then it will show the inserted records?
Thank you,
In its simplest terms, OUTPUT will give you the actual records affected by a DML statement (INSERT, UPDATE, DELETE, MERGE), ##ROWCOUNT will just tell you how many rows were affected by the previous Statement (not limited to DML).
This is probably easiest understood with a working example that you can run yourself and see both in action:
IF OBJECT_ID(N'tempdb..#T', 'U') IS NOT NULL
DROP TABLE #T;
-- CHECK ##ROWCOUNT
DECLARE #RowCountFromDropTable INT = ##ROWCOUNT;
-- CREATE A TABLE
CREATE TABLE #T (ID INT NOT NULL PRIMARY KEY, Col CHAR(1) NOT NULL);
-- INSERT SOME VALUES AND CHECK THE OUTPUT
INSERT #T (ID, Col)
OUTPUT inserted.*
VALUES (1, 'A'), (2, 'B'), (3, 'C');
-- CHECK ##ROWCOUNT
DECLARE #RowCountFromInsert INT = ##ROWCOUNT;
-- DELETE A VALUE AND INSPECT THE DELETED RECORD WITH OUTPUT
DELETE #T
OUTPUT deleted.*
WHERE ID = 3;
-- CHECK ##ROWCOUNT
DECLARE #RowCountFromDelete INT = ##ROWCOUNT;
-- UPDATE A RECORD AND VIEW BEFORE AND AFTER VALUES
UPDATE #T
SET Col = 'X'
OUTPUT inserted.ID AS ID,
inserted.Col AS UpdatedTo,
deleted.Col AS UpdatedFrom
WHERE ID = 2;
-- CHECK ##ROWCOUNT
DECLARE #RowCountFromUpdate INT = ##ROWCOUNT;
-- USE MERGE, AND CAPTURE ACTION:
MERGE #T AS t
USING (VALUES (2, 'B'), (3, 'C')) AS s (ID, Col)
ON s.ID = t.ID
WHEN NOT MATCHED THEN INSERT (ID, Col) VALUES (s.ID, s.Col)
WHEN MATCHED THEN UPDATE SET Col = s.Col
WHEN NOT MATCHED BY SOURCE THEN DELETE
OUTPUT $Action AS DMLAction,
inserted.ID AS InsertedID,
inserted.Col AS InsertedCol,
deleted.ID AS DeletedID,
deleted.Col AS DeletedCol;
-- CHECK ##ROWCOUNT
DECLARE #RowCountFromMerge INT = ##ROWCOUNT;
SELECT RowCountFromDropTable = #RowCountFromDropTable,
RowCountFromInsert = #RowCountFromInsert,
RowCountFromDelete = #RowCountFromDelete,
RowCountFromUpdate = #RowCountFromUpdate,
RowCountFromMerge = #RowCountFromMerge;
The recordsets output from each of the DML are:
INSERT
ID Col
-------
1 A
2 B
3 C
DELETE
ID Col
-------
3 C
UPDATE
ID UpdatedTo UpdatedFrom
---------------------------
2 X B
MERGE
DMLAction InsertedID InsertedCol DeletedID DeletedCol
------------------------------------------------------------
INSERT 3 C NULL NULL
DELETE NULL NULL 1 A
UPDATE 2 B 2 X
INSPECT ##ROWCOUNTS
RowCountFromDropTable RowCountFromInsert RowCountFromUpdate RowCountFromMerge
--------------------------------------------------------------------------------
0 3 1 3
A quick point on some wording in the qeustion too: You cannot use OUTPUT directly to print something to the window, it returns records much like a SELECT statement. ##ROWCOUNT can be used like any scalar function, so you could use this in consecutive statements. So you could do something like this:
SELECT TOP (1) *
FROM (VALUES (1), (2), (3)) AS t (ID);
SELECT TOP (##ROWCOUNT + 1) *
FROM (VALUES (1), (2), (3)) AS t (ID);
SELECT TOP (##ROWCOUNT + 1) *
FROM (VALUES (1), (2), (3)) AS t (ID);
Which returns 1, 1,2 and 1,2,3 respectively. I have no idea why you would want to do this, but it demonstrates the scope of ##ROWCOUNT a bit better than the above, and how it can be used elsewhere.

Select value from an enumerated list in PostgreSQL

I want to select from an enumaration that is not in database.
E.g. SELECT id FROM my_table returns values like 1, 2, 3
I want to display 1 -> 'chocolate', 2 -> 'coconut', 3 -> 'pizza' etc. SELECT CASE works but is too complicated and hard to overview for many values. I think of something like
SELECT id, array['chocolate','coconut','pizza'][id] FROM my_table
But I couldn't succeed with arrays. Is there an easy solution? So this is a simple query, not a plpgsql script or something like that.
with food (fid, name) as (
values
(1, 'chocolate'),
(2, 'coconut'),
(3, 'pizza')
)
select t.id, f.name
from my_table t
join food f on f.fid = t.id;
or without a CTE (but using the same idea):
select t.id, f.name
from my_table t
join (
values
(1, 'chocolate'),
(2, 'coconut'),
(3, 'pizza')
) f (fid, name) on f.fid = t.id;
This is the correct syntax:
SELECT id, (array['chocolate','coconut','pizza'])[id] FROM my_table
But you should create a referenced table with those values.
What about creating another table that enumerate all cases, and do join ?
CREATE TABLE table_case
(
case_id bigserial NOT NULL,
case_name character varying,
CONSTRAINT table_case_pkey PRIMARY KEY (case_id)
)
WITH (
OIDS=FALSE
);
and when you select from your table:
SELECT id, case_name FROM my_table
inner join table_case on case_id=my_table_id;

Getting those records that have a match of all records in another table

Within the realm of this problem I have 3 entities:
User
Position
License
Then I have two relational (many-to-many) tables:
PositionLicense - this one connects Position with License ie. which licenses are required for a particular position
UserLicense - this one connects User with License ie. which licenses a particular user has. But with an additional complexity: user licenses have validity date range (ValidFrom and ValidTo)
The problem
These are input variables:
UserID that identifiers a particular User
RangeFrom defines the lower date range limit
RangeTo defines the upper date range limit
What I need to get? For a particular user (and date range) I need to get a list of positions that this particular user can work at. The problem is that user must have at least all licenses required by every matching position.
I'm having huge problems writing a SQL query to get this list.
If at all possible I would like to do this using a single SQL query (can have additional CTEs of course). If you can convince me that doing it in several queries would be more efficient I'm willing to listen in.
Some workable data
Copy and runs this script. 3 users, 3 positions, 6 licenses. Mark and John should have a match but not Jane.
create table [User] (
UserID int identity not null
primary key,
Name nvarchar(100) not null
)
go
create table Position (
PositionID int identity not null
primary key,
Name nvarchar(100) not null
)
go
create table License (
LicenseID int identity not null
primary key,
Name nvarchar(100) not null
)
go
create table UserLicense (
UserID int not null
references [User](UserID),
LicenseID int not null
references License(LicenseID),
ValidFrom date not null,
ValidTo date not null,
check (ValidFrom < ValidTo),
primary key (UserID, LicenseID)
)
go
create table PositionLicense (
PositionID int not null
references Position(PositionID),
LicenseID int not null
references License(LicenseID),
primary key (PositionID, LicenseID)
)
go
insert [User] (Name) values ('Mark the mechanic');
insert [User] (Name) values ('John the pilot');
insert [User] (Name) values ('Jane only has arts PhD but not medical.');
insert Position (Name) values ('Mechanic');
insert Position (Name) values ('Pilot');
insert Position (Name) values ('Doctor');
insert License (Name) values ('Mecha');
insert License (Name) values ('Flying');
insert License (Name) values ('Medicine');
insert License (Name) values ('PhD');
insert License (Name) values ('Phycho');
insert License (Name) values ('Arts');
insert PositionLicense (PositionID, LicenseID) values (1, 1);
insert PositionLicense (PositionID, LicenseID) values (2, 2);
insert PositionLicense (PositionID, LicenseID) values (2, 5);
insert PositionLicense (PositionID, LicenseID) values (3, 3);
insert PositionLicense (PositionID, LicenseID) values (3, 4);
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (1, 1, '20110101', '20120101');
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (2, 2, '20110101', '20120101');
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (2, 5, '20110101', '20120101');
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (3, 4, '20110101', '20120101');
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (3, 6, '20110101', '20120101');
Resulting solution
I've setup my resulting solution based on accepted answer which provides the most simplified solution to this problem. If you'd like to play with the query just hit edit/clone (whether you're logged in or not). What can be changed:
three variables:
two variable to set date range (#From and #To)
user ID (#User)
you can toggle commented code in the first CTE to switch code between fully overlapping user licenses or partially overlapping ones.
This makes a number of assumptions (ignores presence of time in the datetime columns, assumes fairly obvious primary keys) and skips the joins to pull in user name, position details, and the like. (And you implied that the user had to hold all the licenses for the full period specified, right?)
SELECT pl.PositionId
from PositionLicense pl
left outer join (-- All licenses user has for the entirety (sp?) of the specified date range
select LicenseId
from UserLicense
where UserId = #UserId
and #RangeFrom <= ValidFrom
and #RangeTo >= ValidTo) li
on li.LicenseId = pl.LicenseId
group by pl.PositionId
-- Where all licenses required by position are held by user
having count(pl.LicenseId) = count(li.LicenseId)
No data so I can't debug or test it, but this or something very close to it should do the trick.
Select ...
From User As U
Cross Join Position As P
Where Exists (
Select 1
From PositionLicense As PL1
Join UserLicense As UL1
On UL1.LicenseId = PL1.LicenseId
And UL1.ValidFrom <= #RangeTo
And UL1.ValidTo >= #RangeFrom
Where PL1.PositionId = P.Id
And UL1.UserId = U.Id
Except
Select 1
From PositionLicense As PL2
Left Join UserLicense As UL2
On UL2.LicenseId = PL2.LicenseId
And UL2.ValidFrom <= #RangeTo
And UL2.ValidTo >= #RangeFrom
And UL2.UserId = U.Id
Where PL2.PositionId = P.Id
And UL2.UserId Is Null
)
If the requirement is that you want users and positions that are valid across the entire range, that is trickier:
With Calendar As
(
Select #RangeFrom As [Date]
Union All
Select DateAdd(d, 1, [Date])
From Calendar
Where [Date] <= #RangeTo
)
Select ...
From User As U
Cross Join Position As P
Where Exists (
Select 1
From UserLicense As UL1
Join PositionLicense As PL1
On PL1.LicenseId = UL1.LicenseId
Where UL1.UserId = U.Id
And PL1.PositionId = P.Id
And UL1.ValidFrom <= #RangeTo
And UL1.ValidTo >= #RangeFrom
Except
Select 1
From Calendar As C1
Cross Join User As U1
Cross Join PositionLicense As PL1
Where U1.Id = U.Id
And PL1.PositionId = P.Id
And Not Exists (
Select 1
From UserLicense As UL2
Where UL2.LicenseId = PL1.LicenseId
And UL1.UserId = U1.Id
And C1.Date Between UL2.ValidFrom And UL2.ValidTo
)
)
Option ( MaxRecursion 0 );
Runnable Version Here
WITH PositionRequirements AS (
SELECT p.PositionID, COUNT(*) AS LicenseCt
FROM #Position AS p
INNER JOIN #PositionLicense AS posl
ON posl.PositionID = p.PositionID
GROUP BY p.PositionID
)
,Satisfied AS (
SELECT u.UserID, posl.PositionID, COUNT(*) AS LicenseCt
FROM #User AS u
INNER JOIN #UserLicense AS perl
ON perl.UserID = u.UserID
-- AND #Date BETWEEN perl.ValidFrom AND perl.ValidTo
AND '20110101' BETWEEN perl.ValidFrom AND perl.ValidTo
INNER JOIN #PositionLicense AS posl
ON posl.LicenseID = perl.LicenseID
-- WHERE u.UserID = #UserID -- Not strictly necessary, we can go over all people
GROUP BY u.UserID, posl.PositionID
)
SELECT PositionRequirements.PositionID, Satisfied.UserID
FROM PositionRequirements
INNER JOIN Satisfied
ON Satisfied.PositionID = PositionRequirements.PositionID
AND PositionRequirements.LicenseCt = Satisfied.LicenseCt
You could probably turn this into an inline table-valued function parameterized on effective date.

TSQL: Remove duplicates based on max(date)

I am searching for a query to select the maximum date (a datetime column) and keep its id and row_id. The desire is to DELETE the rows in the source table.
Source Data
id date row_id(unique)
1 11/11/2009 1
1 12/11/2009 2
1 13/11/2009 3
2 1/11/2009 4
Expected Survivors
1 13/11/2009 3
2 1/11/2009 4
What query would I need to achieve the results I am looking for?
Tested on PostgreSQL:
delete from table where (id, date) not in (select id, max(date) from table group by id);
There are various ways of doing this, but the basic idea is the same:
- Indentify the rows you want to keep
- Compare each row in your table to the ones you want to keep
- Delete any that don't match
DELETE
[source]
FROM
yourTable AS [source]
LEFT JOIN
yourTable AS [keep]
ON [keep].id = [source].id
AND [keep].date = (SELECT MAX(date) FROM yourTable WHERE id = [keep].id)
WHERE
[keep].id IS NULL
DELETE
[yourTable]
FROM
[yourTable]
LEFT JOIN
(
SELECT id, MAX(date) AS date FROM yourTable GROUP BY id
)
AS [keep]
ON [keep].id = [yourTable].id
AND [keep].date = [yourTable].date
WHERE
[keep].id IS NULL
DELETE
[source]
FROM
yourTable AS [source]
WHERE
[source].row_id != (SELECT TOP 1 row_id FROM yourTable WHERE id = [source].id ORDER BY date DESC)
DELETE
[source]
FROM
yourTable AS [source]
WHERE
NOT EXISTS (SELECT id FROM yourTable GROUP BY id HAVING id = [source].id AND MAX(date) != [source].date)
Because you are using SQL Server 2000, you'er not able to use the Row Over technique of setting up a sequence and to identify the top row for each unique id.
So, your proposed technique is to use a datetime column to get the top 1 row to remove duplicates. That might work, but there is a possibility that you might still get duplicates having the same datetime value. But that's easy enough to check for.
First check the assumption that all rows are unique based on the id and date columns:
CREATE TABLE #TestTable (rowid INT IDENTITY(1,1), thisid INT, thisdate DATETIME)
INSERT INTO #TestTable (thisid,thisdate) VALUES (1, '11/11/2009')
INSERT INTO #TestTable (thisid,thisdate) VALUES (1, '12/11/2009')
INSERT INTO #TestTable (thisid,thisdate) VALUES (1, '12/12/2009')
INSERT INTO #TestTable (thisid,thisdate) VALUES (2, '1/11/2009')
INSERT INTO #TestTable (thisid,thisdate) VALUES (2, '1/11/2009')
SELECT COUNT(*) AS thiscount
FROM #TestTable
GROUP BY thisid, thisdate
HAVING COUNT(*) > 1
This example returns a value of 2 - indicating that you will still end up with duplicates even after using the date column to remove duplicates. If you return 0, then you have proven that your proposed technique will work.
When de-duping production data, I think one should take some precautions and test before and after. You should create a table to hold the rows you plan to remove so you can recover them easily if you need to after the delete statement has been executed.
Also, it's a good idea to know beforehand how many rows you plan to remove so you can verify the count before and after - and you can gauge the magnitude of the delete operation. Based on how many rows will be affected, you can plan when to run the operation.
To test before the de-duping process, find the occurrences.
-- Get occurrences of duplicates
SELECT COUNT(*) AS thiscount
FROM
#TestTable
GROUP BY thisid
HAVING COUNT(*) > 1
ORDER BY thisid
That gives you the rows with more than one row with the same id. Capture the rows from this query into a temporary table and then run a query using the SUM to get the total number of rows that are not unique based on your key.
To get the number of rows you plan to delete, you need the count of rows that are duplicate based on your unique key, and the number of distinct rows based on your unique key. You subtract the distinct rows from the count of occurrences. All that is pretty straightforward - so I'll leave you to it.
Try this
declare #t table (id int, dt DATETIME,rowid INT IDENTITY(1,1))
INSERT INTO #t (id,dt) VALUES (1, '11/11/2009')
INSERT INTO #t (id,dt) VALUES (1, '11/12/2009')
INSERT INTO #t (id,dt) VALUES (1, '11/13/2009')
INSERT INTO #t (id,dt) VALUES (2, '11/01/2009')
Query:
delete from #t where rowid not in(
select t.rowid from #t t
inner join(
select MAX(dt)maxdate
from #t
group by id) X
on t.dt = X.maxdate )
select * from #t
Output:
id dt rowid
1 2009-11-13 00:00:00.000 3
2 2009-11-01 00:00:00.000 4
delete from temp where row_id not in (
select t.row_id from temp t
right join
(select id,MAX(dt) as dt from temp group by id) d
on t.dt = d.dt and t.id = d.id)
I have tested this answer..
INSERT INTO #t (id,dt) VALUES (1, '11/11/2009')
INSERT INTO #t (id,dt) VALUES (1, '11/12/2009')
INSERT INTO #t (id,dt) VALUES (1, '11/13/2009')
INSERT INTO #t (id,dt) VALUES (2, '11/01/2009')
select * from #t
;WITH T AS(
select dense_rank() over(partition by id order by dt desc)NO,DT,ID,rowid from #t )
DELETE T WHERE NO>1