Compare rows within same table in SQL Server 2012.
Given a table (see script below), what would be the best way to compare rows (individual columns) within the same table?
Example
We import data from a client, possibly many times, and we need to detect differences in column values between imports.
Each time we import the data we increment our "DataCutId" column, so I should be able to compare one data cut against another.
In my sample data, the row with DataCutId 3 has a changed Name, City, PostCode and HomePhone.
How would I report on these differences? Any SQL snippet or line of thought would help.
Many thanks
SQL script to create test environment
IF EXISTS (SELECT * FROM sys.databases WHERE name='TestDatabase')
BEGIN
ALTER DATABASE TestDatabase
SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DROP DATABASE TestDatabase
END
CREATE DATABASE TestDatabase collate Latin1_General_CI_AS
GO
ALTER DATABASE TestDatabase SET RECOVERY SIMPLE
BEGIN TRANSACTION
USE TestDatabase
IF OBJECT_ID(N'[dbo].[Customer]', 'U') IS NOT NULL
DROP TABLE [dbo].[Customer];
GO
CREATE TABLE [dbo].[Customer](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[DataCutId] [int] NOT NULL,
[CustomerId] [int] NOT NULL,
[Name] [varchar](50) NULL,
[Surname] [varchar](50) NULL,
[City] [varchar](255) NULL,
[PostCode] [varchar](10) NULL,
[HomePhone] [varchar](50) NULL,
CONSTRAINT [PK_Customer] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET IDENTITY_INSERT [dbo].[Customer] ON;
INSERT INTO [dbo].[Customer]([Id], [DataCutId],CustomerId, [Name], [Surname], [City], [PostCode], [HomePhone])
SELECT 1, 1,20, N'Jo', N'Bloggs', N'London', N'aaa 342', N'0207 3456785' UNION ALL
SELECT 2, 2, 20,N'Jo', N'Bloggs', N'London', N'aaa 342', N'0207 3456785' UNION ALL
SELECT 3, 3, 20,N'Mark', N'Bloggs', N'Londong', N'bbb d4543', N'0208 3456785'
SET IDENTITY_INSERT [dbo].[Customer] OFF;
COMMIT
Would this give you a start on solving your problem?
There may well be an easier way.
select a.*, b.DataCutId as UpdDataCutID, b.Name as UpdName, b.Surname as UpdSurname, b.City as UpdCity, b.PostCode as UpdPostCode, b.HomePhone as UpdHomePhone
from Customer a inner join Customer b on a.CustomerId = b.CustomerId
and a.Id < b.ID and a.Name <> b.Name
union
select a.*, b.DataCutId, b.Name, b.Surname, b.City, b.PostCode, b.HomePhone
from Customer a inner join Customer b on a.CustomerId = b.CustomerId
and a.Id < b.ID and a.Surname <> b.Surname
union
select a.*, b.DataCutId, b.Name, b.Surname, b.City, b.PostCode, b.HomePhone
from Customer a inner join Customer b on a.CustomerId = b.CustomerId
and a.Id < b.ID and a.City <> b.City
union
select a.*, b.DataCutId, b.Name, b.Surname, b.City, b.PostCode, b.HomePhone
from Customer a inner join Customer b on a.CustomerId = b.CustomerId
and a.Id < b.ID and a.PostCode <> b.PostCode
union
select a.*, b.DataCutId, b.Name, b.Surname, b.City, b.PostCode, b.HomePhone
from Customer a inner join Customer b on a.CustomerId = b.CustomerId
and a.Id < b.ID and a.HomePhone <> b.HomePhone
EDIT: I've pulled this all together into a single query below.
select a.*, b.DataCutId as UpdDataCutID, b.Name as UpdName, b.Surname as UpdSurname, b.City as UpdCity, b.PostCode as UpdPostCode, b.HomePhone as UpdHomePhone
from Customer a inner join Customer b on a.CustomerId = b.CustomerId
and a.Id < b.ID and
(a.Name <> b.Name OR
a.Surname <> b.Surname OR
a.City <> b.City OR
a.PostCode <> b.PostCode OR
a.HomePhone <> b.HomePhone)
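If you want one output row per changed column rather than one wide row per pair, the sketch below might help. It runs against the Customer table from the question and assumes that DataCutId goes up by exactly 1 per import for each customer, so each cut is only compared with the one before it. The aliases cur, prev and d are just illustrative names. The EXISTS ... EXCEPT test is used instead of <> so that a change to or from NULL is also reported.

```sql
-- Sketch: report each changed column between consecutive data cuts.
-- Assumption: for a given CustomerId, DataCutId increases by exactly 1 per import.
SELECT  cur.CustomerId,
        prev.DataCutId AS FromDataCutId,
        cur.DataCutId  AS ToDataCutId,
        d.ColumnName,
        d.OldValue,
        d.NewValue
FROM    dbo.Customer AS cur
JOIN    dbo.Customer AS prev
        ON  prev.CustomerId = cur.CustomerId
        AND prev.DataCutId  = cur.DataCutId - 1
-- Turn the compared columns into rows: (column name, old value, new value).
CROSS APPLY (VALUES
        ('Name',      prev.Name,      cur.Name),
        ('Surname',   prev.Surname,   cur.Surname),
        ('City',      prev.City,      cur.City),
        ('PostCode',  prev.PostCode,  cur.PostCode),
        ('HomePhone', prev.HomePhone, cur.HomePhone)
    ) AS d (ColumnName, OldValue, NewValue)
-- NULL-safe "values differ": EXCEPT treats two NULLs as equal.
WHERE   EXISTS (SELECT d.OldValue EXCEPT SELECT d.NewValue)
ORDER BY cur.CustomerId, cur.DataCutId, d.ColumnName;
```

With the sample data this should list four changes for CustomerId 20 between data cuts 2 and 3 (City, HomePhone, Name, PostCode) and nothing between cuts 1 and 2.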
Related
I need to get a tree of related nodes given a certain node, but not necessary top node. I've got a solution using two CTEs, since I am struggling to squeeze it all into one CTE :). Might somebody have a sleek solution to avoid using two CTEs? Here is some code that I was playing with:
DECLARE @temp AS TABLE (ID INT, ParentID INT)
INSERT INTO @temp
SELECT 1 ID, NULL AS ParentID
UNION ALL
SELECT 2, 1
UNION ALL
SELECT 3, 2
UNION ALL
SELECT 4, 3
UNION ALL
SELECT 5, 4
UNION ALL
SELECT 6, NULL
UNION ALL
SELECT 7, 6
UNION ALL
SELECT 8, 7
DECLARE @startNode INT = 4
;WITH TheTree (ID,ParentID)
AS (
SELECT ID, ParentID
FROM @temp
WHERE ID = @startNode
UNION ALL
SELECT t.id, t.ParentID
FROM @temp t
JOIN TheTree tr ON t.ParentID = tr.ID
)
SELECT * FROM TheTree
;WITH Up(ID,ParentID)
AS (
SELECT t.id, t.ParentID
FROM @temp t
WHERE t.ID = @startNode
UNION ALL
SELECT t.id, t.ParentID
FROM @temp t
JOIN Up c ON t.id = c.ParentID
)
--SELECT * FROM Up
,TheTree (ID,ParentID)
AS (
SELECT ID, ParentID
FROM Up
WHERE ParentID is null
UNION ALL
SELECT t.id, t.ParentID
FROM @temp t
JOIN TheTree tr ON t.ParentID = tr.ID
)
SELECT * FROM TheTree
thanks
Meh. This avoids using two CTEs, but the result is a brute force kludge that hardly qualifies as "sleek" as it won’t be efficient if your table is at all sizeable. It will:
Recursively build all possible hierarchies
As you build them, flag the target NodeId as you find it
Return only the targeted tree
I threw in column “TreeNumber” on the off-chance the TargetId appears in multiple hierarchies, or if you’d ever have multiple values to check in one pass. “Depth” was added to make the output a bit more legible.
A more complex solution like @John’s might do, and more and subtler tricks could be done with more detailed table structures.
DECLARE @startNode INT = 4
;WITH cteAllTrees (TreeNumber, Depth, ID, ParentID, ContainsTarget)
AS (
SELECT
row_number() over (order by ID) TreeNumber
,1
,ID
,ParentID
,case
when ID = @startNode then 1
else 0
end ContainsTarget
FROM @temp
WHERE ParentId is null
UNION ALL
SELECT
tr.TreeNumber
,tr.Depth + 1
,t.id
,t.ParentID
,case
when tr.ContainsTarget = 1 then 1
when t.ID = @startNode then 1
else 0
end ContainsTarget
FROM @temp t
INNER JOIN cteAllTrees tr
INNER JOIN cteAllTrees tr
ON t.ParentID = tr.ID
)
SELECT
TreeNumber
,Depth
,ID
,ParentId
from cteAllTrees
where TreeNumber in (select TreeNumber from cteAllTrees where ContainsTarget = 1)
order by
TreeNumber
,Depth
,ID
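If a strict single statement is not required, another option is to find the root with a small loop first and then use just one recursive CTE to walk down. This is only a sketch of that alternative, not a pure one-CTE answer. The variables @root and @parent are names made up for the example, and it assumes the data has no cycles and that every ParentID points at an existing row.

```sql
DECLARE @temp TABLE (ID INT, ParentID INT);
INSERT INTO @temp (ID, ParentID)
VALUES (1, NULL), (2, 1), (3, 2), (4, 3), (5, 4), (6, NULL), (7, 6), (8, 7);

DECLARE @startNode INT = 4;
DECLARE @root INT = @startNode;  -- hypothetical helper: ends up holding the top node
DECLARE @parent INT;             -- hypothetical helper: parent of the current @root

-- Climb from the start node until a row with no parent is reached.
SELECT @parent = ParentID FROM @temp WHERE ID = @root;
WHILE @parent IS NOT NULL
BEGIN
    SET @root = @parent;
    SET @parent = NULL;  -- reset so a missing parent row ends the loop
    SELECT @parent = ParentID FROM @temp WHERE ID = @root;
END;

-- A single recursive CTE then walks down from the root.
WITH TheTree (ID, ParentID) AS (
    SELECT ID, ParentID
    FROM @temp
    WHERE ID = @root
    UNION ALL
    SELECT t.ID, t.ParentID
    FROM @temp AS t
    JOIN TheTree AS tr ON t.ParentID = tr.ID
)
SELECT ID, ParentID FROM TheTree;
```

For @startNode = 4 the loop stops at root 1, so the query should return nodes 1 through 5 and leave the 6-7-8 tree out.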
Here is a technique where you can select the entire hierarchy, a specific node with all its children, and even a filtered list and how they roll.
Note: See the comments next to the DECLAREs
Declare @YourTable table (id int,pt int,name varchar(50))
Insert into @YourTable values
(1,null,'1'),(2,1,'2'),(3,1,'3'),(4,2,'4'),(5,2,'5'),(6,3,'6'),(7,null,'7'),(8,7,'8')
Declare @Top int = null --<< Sets top of Hier Try 2
Declare @Nest varchar(25) = '|-----' --<< Optional: Added for readability
Declare @Filter varchar(25) = '' --<< Empty for All or try 4,6
;with cteP as (
Select Seq = cast(1000+Row_Number() over (Order by name) as varchar(500))
,ID
,pt
,Lvl=1
,name
From @YourTable
Where IsNull(@Top,-1) = case when @Top is null then isnull(pt,-1) else ID end
Union All
Select Seq = cast(concat(p.Seq,'.',1000+Row_Number() over (Order by r.name)) as varchar(500))
,r.ID
,r.pt
,p.Lvl+1
,r.name
From @YourTable r
Join cteP p on r.pt = p.ID)
,cteR1 as (Select *,R1=Row_Number() over (Order By Seq) From cteP)
,cteR2 as (Select A.Seq,A.ID,R2=Max(B.R1) From cteR1 A Join cteR1 B on (B.Seq like A.Seq+'%') Group By A.Seq,A.ID )
Select Distinct
A.R1
,B.R2
,A.ID
,A.pt
,A.Lvl
,name = Replicate(@Nest,A.Lvl-1) + A.name
From cteR1 A
Join cteR2 B on A.ID=B.ID
Join (Select R1 From cteR1 where IIF(@Filter='',1,0)+CharIndex(concat(',',ID,','),concat(',',@Filter+','))>0) F on F.R1 between A.R1 and B.R2
Order By A.R1
This is my T-SQL
select Id,Profile,Type ,
case Profile
when 'Soft' then 'SID'
when 'Hard' then 'HID'
end as [Profile]
from ProductDetail p1
inner join [tableA or tableB] on xxxxxxxx
I want to join tableA when Profile = 'Soft' and tableB when Profile = 'Hard'. How can I do this using only T-SQL, in one batch?
Thanks
You can't do it directly, but you could achieve the same effect with outer joins:
select Id,Profile,Type ,
case Profile
when 'Soft' then 'SID'
when 'Hard' then 'HID'
end as [Profile]
from ProductDetail p1
left outer join tableA ON tableA.x = p1.x AND p1.Profile = 'Soft'
left outer join tableB ON tableB.x = p1.x AND p1.Profile = 'Hard'
where
(tableA.x IS NOT NULL and p1.Profile = 'Soft')
or (tableB.x IS NOT NULL and p1.Profile = 'Hard')
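To see the outer-join approach run end to end, here is a small self-contained sketch. The table variables, the join column x and the Detail column are all invented for the example, since the question does not show the real schemas.

```sql
-- Hypothetical stand-ins for ProductDetail, tableA and tableB.
DECLARE @ProductDetail TABLE (Id INT, Profile VARCHAR(10), x INT);
DECLARE @tableA TABLE (x INT, Detail VARCHAR(20));
DECLARE @tableB TABLE (x INT, Detail VARCHAR(20));

INSERT INTO @ProductDetail (Id, Profile, x)
VALUES (1, 'Soft', 10), (2, 'Hard', 10), (3, 'Soft', 99);
INSERT INTO @tableA (x, Detail) VALUES (10, 'from A');
INSERT INTO @tableB (x, Detail) VALUES (10, 'from B');

SELECT  p1.Id,
        p1.Profile,
        CASE p1.Profile WHEN 'Soft' THEN 'SID' WHEN 'Hard' THEN 'HID' END AS ProfileCode,
        COALESCE(a.Detail, b.Detail) AS Detail   -- whichever side actually joined
FROM    @ProductDetail AS p1
LEFT JOIN @tableA AS a ON a.x = p1.x AND p1.Profile = 'Soft'
LEFT JOIN @tableB AS b ON b.x = p1.x AND p1.Profile = 'Hard'
-- Keep only rows that matched the table chosen by their Profile,
-- which gives the two outer joins inner-join behaviour.
WHERE   a.x IS NOT NULL OR b.x IS NOT NULL;
```

Row 1 should pick up 'from A', row 2 'from B', and row 3 should drop out because tableA has no x = 99.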
Of course, you can choose different tables for inner join operation, but it must be based on some condition or variable.
For Example:
select Id,Profile,Type ,
case Profile
when 'Soft' then 'SID'
when 'Hard' then 'HID'
end as [Profile]
from ProductDetail p1
inner join tableA A
on Profile='Soft'
AND <any other Condition>
UNION
select Id,Profile,Type ,
case Profile
when 'Soft' then 'SID'
when 'Hard' then 'HID'
end as [Profile]
from ProductDetail p1
inner join tableB B
on Profile='Hard'
AND <any other Condition>
You can do this in a single statement with the same or a similar CASE expression in your join. Below is sample code using table variables that joins to 2 different reference tables merged into a single result set using a UNION ALL
DECLARE @ProductDetail TABLE (Id INT, sProfile VARCHAR(100), StID INT, HdID INT)
DECLARE @TableA TABLE (StId INT, Field1 VARCHAR(100))
DECLARE @TableB TABLE (HdId INT, Field1 VARCHAR(100))
INSERT INTO @ProductDetail (Id, sProfile, StID , HdID ) VALUES (1,'Soft',1,1)
INSERT INTO @ProductDetail (Id, sProfile, StID , HdID ) VALUES (2,'Hard',2,2)
INSERT INTO @TableA (StId,Field1) VALUES (1,'Soft 1')
INSERT INTO @TableA (StId,Field1) VALUES (2,'Soft 2')
INSERT INTO @TableB (HdId,Field1) VALUES (1,'Hard 1')
INSERT INTO @TableB (HdId,Field1) VALUES (2,'Hard 2')
SELECT
p1.Id,p1.sProfile,
CASE
WHEN p1.sProfile = 'Soft' THEN StID
WHEN p1.sProfile = 'Hard' THEN HdId
END AS [Profile]
,ReferenceTable.FieldName
FROM
@ProductDetail p1
INNER JOIN
(
SELECT StID AS id, 'Soft' AS sProfile, Field1 AS FieldName
FROM @TableA AS tableA
UNION ALL
SELECT HdID AS id, 'Hard' AS sProfile, Field1 AS FieldName
FROM @TableB AS tableB
)
AS ReferenceTable
ON
CASE
WHEN p1.sProfile = 'Soft' THEN StID
WHEN p1.sProfile = 'Hard' THEN HdID
END = ReferenceTable.Id
AND p1.sProfile = ReferenceTable.sProfile
This will return the following result set:
Id sProfile Profile FieldName
1 Soft 1 Soft 1
2 Hard 2 Hard 2
I'm working on a SQL query that should 'coalesce' the records from 2 tables, i.e. if the record exists in table2, it should take that one, otherwise it should fall back to the values in table1.
In the example, table1 and table2 have just 2 fields (id and description), but obviously in reality there could be more.
Here's a small test case:
create table table1 (id int, description nvarchar(50))
create table table2 (id int, description nvarchar(50))
insert into table1 values (1, 'record 1')
insert into table1 values (2, 'record 2')
insert into table1 values (3, 'record 3')
insert into table2 values (1, 'record 1 modified')
insert into table2 values (2, null)
The result of the query should look like this:
1, "record 1 modified"
2, null
3, "record 3"
Here's what I came up with.
select
case when table2.id is not null then
table2.id else table1.id
end as Id,
case when table2.id is not null then
table2.description
else
table1.description
end as Description
-- etc for other fields
from table1
left join table2 on table1.id = table2.id
Is there a better way to achieve what I want? I don't think I can use coalesce since that would not select a null value from table2 if the corresponding value in table1 is not null.
How about:
SELECT t2.ID, t2.Description
FROM table2 t2
UNION ALL
SELECT t1.ID, t1.Description
FROM table1 t1
WHERE NOT EXISTS (SELECT *
FROM table2 t2
WHERE t2.ID = t1.ID)
The above query gets all the records from table 2 (including the case where description is NULL but the ID is populated), and only the records from table 1 where they don't exist in table 2.
Here's an alternative:
SELECT table2.*
FROM table1
RIGHT JOIN table2
ON table1.id = table2.id
UNION
SELECT table1.*
FROM table1
FULL OUTER join table2
ON table1.id = table2.id
WHERE table1.id NOT IN (SELECT id FROM table2)
--and table2.id not in (select id from table1)
You can add in that last line if you don't want ids that are only in table2. Otherwise I guess Stuart Ainsworth's solution is better (i.e. drop all the joins)
http://sqlfiddle.com/#!3/03bab/12/0
I've been stuck on this problem for some time now.
How do I sort column A depending on the contents of column B?
I have this sample:
ID count columnA ColumnB
-----------------------------------
12 1 A B
13 2 C D
14 3 B C
I want to sort it like this:
ID count ColumnA ColumnB
-----------------------------------
12 1 A B
14 3 B C
13 2 C D
So I need to order the rows so that ColumnB of each row equals ColumnA of the next row.
I'm thinking of a loop, but I can't quite imagine how it would work...
I was thinking it would go like this (maybe):
SELECT
a.ID, a.ColumnA, a.ColumnB
FROM
TableA a WITH (NOLOCK)
LEFT JOIN
TableA b WITH (NOLOCK) ON a.ID = b.ID AND a.counts = b.counts
WHERE
a.columnB = b.ColumnA
The above code isn't working though, and I was thinking more along the lines of...
DECLARE @counts int = 1
DECLARE @done int = 0
--WHILE @done = 0
BEGIN
SELECT
a.ID, a.ColumnA, a.ColumnB
FROM
TableA a WITH (NOLOCK)
LEFT JOIN
TableA b WITH (NOLOCK) ON a.ID = b.ID AND a.counts = @counts
WHERE
a.columnB = b.ColumnA
set @counts = @counts + 1
END
If this were C code it would be easier for me, but T-SQL's syntax is making it a bit harder for a newbie like me.
Any help is greatly appreciated!
Edit: sample code
drop table tablea
create table TableA(
id int,
colA varchar(10),
colb varchar(10),
counts int
)
insert INTO TableA
(id, cola, colb, counts)
select 12, 'Bad', 'Cat', 3
insert INTO TableA
(id, cola, colb, counts)
select 13, 'Apple', 'Bad', 1
insert INTO TableA
(id, cola, colb, counts)
select 14, 'Cat', 'Dog', 2
select * FROM TableA
SELECT a.ID, a.ColA, a.ColB
FROM TableA a WITH (NOLOCK)
LEFT JOIN TableA b WITH (NOLOCK)
ON a.ID = b.ID
Where a.colB = b.ColA
ORDER BY a.ColA ASC
You just need to add an ORDER BY clause:
-- SELECT a.ID, a.ColumnA, a.ColumnB
-- FROM TableA a WITH (NOLOCK)
-- LEFT JOIN TableA b WITH (NOLOCK)
-- ON a.ID = b.ID
-- and a.counts = b.counts
-- Where a.columnB = b.ColumnA
ORDER BY a.ColumnA ASC
This is all you need. Sometimes you have to think simple:
select * from TableA
order by columnA asc
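A plain ORDER BY only works while the chain happens to follow alphabetical order. If the rows really have to be ordered by following the links (each row's colB matching the next row's colA), a recursive CTE is one way to sketch it. This uses the sample data from the question's edit in a table variable; the Chain CTE and its Position column are names made up here, and it assumes there is a single unbranched chain with no cycles.

```sql
DECLARE @TableA TABLE (id INT, colA VARCHAR(10), colB VARCHAR(10), counts INT);
INSERT INTO @TableA (id, colA, colB, counts)
VALUES (12, 'Bad', 'Cat', 3), (13, 'Apple', 'Bad', 1), (14, 'Cat', 'Dog', 2);

WITH Chain AS (
    -- Start row: its colA is not the colB of any other row.
    SELECT t.id, t.colA, t.colB, t.counts, 1 AS Position
    FROM @TableA AS t
    WHERE NOT EXISTS (SELECT 1 FROM @TableA AS p WHERE p.colB = t.colA)
    UNION ALL
    -- Next row: its colA equals the colB of the row just placed.
    SELECT n.id, n.colA, n.colB, n.counts, c.Position + 1
    FROM Chain AS c
    JOIN @TableA AS n ON n.colA = c.colB
)
SELECT id, colA, colB, counts
FROM Chain
ORDER BY Position;
```

For this data the rows should come back as 13 (Apple, Bad), 12 (Bad, Cat), 14 (Cat, Dog). Chains longer than 100 links would need an OPTION (MAXRECURSION n) hint on the final SELECT.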
I am trying to develop a T-SQL query to exclude all rows from another table "B". This other table "B" has 3 columns comprising its PK for a total of 136 rows. So I want to select all columns from table "A" minus those from table "B". How do I do this? I don't think this query is correct because I am still getting a duplicate record error:
CREATE TABLE #B (STUDENTID VARCHAR(50), MEASUREDATE SMALLDATETIME, MEASUREID VARCHAR(50))
INSERT #B
SELECT studentid, measuredate, measureid
from [J5C_Measures_Sys]
GROUP BY studentid, measuredate, measureid
HAVING COUNT(*) > 1
insert into J5C_MasterMeasures (studentid, measuredate, measureid, rit)
select A.studentid, A.measuredate, B.measurename+' ' +B.LabelName, A.score_14
from [J5C_Measures_Sys] A
join [J5C_ListBoxMeasures_Sys] B on A.MeasureID = B.MeasureID
join sysobjects so on so.name = 'J5C_Measures_Sys' AND so.type = 'u'
join syscolumns sc on so.id = sc.id and sc.name = 'score_14'
join [J5C_MeasureNamesV2_Sys] v on v.Score_field_id = sc.name
where a.score_14 is not null AND B.MEASURENAME IS NOT NULL
and (A.studentid NOT IN (SELECT studentid from #B)
and a.measuredate NOT IN (SELECT measuredate from #B)
and a.measureid NOT IN (SELECT measureid from #B))
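Use NOT EXISTS. NOT IN returns no rows at all when the subquery yields a NULL, and three separate NOT IN tests check each column on its own instead of matching the three-column key as a whole.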
insert into J5C_MasterMeasures (studentid, measuredate, measureid, rit)
select A.studentid, A.measuredate, B.measurename+' ' +B.LabelName, A.score_14
from [J5C_Measures_Sys] A
join [J5C_ListBoxMeasures_Sys] B on A.MeasureID = B.MeasureID
join sysobjects so on so.name = 'J5C_Measures_Sys' AND so.type = 'u'
join syscolumns sc on so.id = sc.id and sc.name = 'score_14'
join [J5C_MeasureNamesV2_Sys] v on v.Score_field_id = sc.name
where a.score_14 is not null AND B.MEASURENAME IS NOT NULL
AND NOT EXISTS (select 1 from #B where #b.studentid = A.studentid
and a.measuredate = #B.measuredate
and a.measureid = #B.measureid)
and not exists (select 1 from J5C_MasterMeasures z
where z.studentid = A.studentid)
Just so you know, take a look at Select all rows from one table that don't exist in another table
Basically there are at least 5 ways to select all rows from one table that are not in another table:
NOT IN
NOT EXISTS
LEFT and RIGHT JOIN
OUTER APPLY (2005+)
EXCEPT (2005+)
Here is a general solution for the difference operation using left join:
select * from FirstTable
left join SecondTable on FirstTable.ID = SecondTable.ID
where SecondTable.ID is null
Of course yours would have a more complicated join on clause, but the basic operation is the same.
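As a rough illustration of that "more complicated join", here is the same left-join difference with a three-column key. The table variables @A and @B and their rows are made up for the example; only the column names come from the question.

```sql
DECLARE @A TABLE (studentid VARCHAR(50), measuredate SMALLDATETIME, measureid VARCHAR(50));
DECLARE @B TABLE (studentid VARCHAR(50), measuredate SMALLDATETIME, measureid VARCHAR(50));

INSERT INTO @A (studentid, measuredate, measureid)
VALUES ('s1', '20240101', 'm1'),
       ('s1', '20240101', 'm2'),
       ('s2', '20240102', 'm1');
INSERT INTO @B (studentid, measuredate, measureid)
VALUES ('s1', '20240101', 'm1');

-- Rows of @A whose full (studentid, measuredate, measureid) key is not in @B.
SELECT  a.studentid, a.measuredate, a.measureid
FROM    @A AS a
LEFT JOIN @B AS b
        ON  b.studentid   = a.studentid
        AND b.measuredate = a.measuredate
        AND b.measureid   = a.measureid
WHERE   b.studentid IS NULL;   -- no matching row on the right side
```

This should return the two rows ('s1', 'm2') and ('s2', 'm1'). Three independent NOT IN tests on the same data would return nothing, because each remaining row shares at least one column value with the row in @B.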
I think you can use "NOT IN" with a subquery, but you say you have a multi-field key?
I'd be thinking about using a left outer join and then testing for null on the right...
Martin.