Merge multiple rows into one - tsql

I'm trying to merge multiple rows into one.
First, some test data:
declare @Customers table
(
    CustomerID varchar(200),
    SubscriptionId varchar(200),
    OpusNr varchar(200),
    LineType varchar(1),
    ProductFrame varchar(200),
    Frame varchar(200),
    ProductLeftLens varchar(200),
    LeftLens varchar(200),
    ProductRightLens varchar(200),
    RightLens varchar(200)
);
insert into @Customers values ('17762697', '270387', '6214005562', 'F', '304', 'GG0550O/006/5316/140', '304', 'Variview Standard M 1.6 70/75', '304', 'Variview Standard M 1.6 70/75');
insert into @Customers values ('17762697', '270387', '6214005562', 'L', '101', 'GG0550O/006/5316/140', '101', 'Variview Standard M 1.6 70/75', '101', 'Variview Standard M 1.6 70/75');
insert into @Customers values ('17762697', '270387', '6214005562', 'R', '303', 'GG0550O/006/5316/140', '303', 'Variview Standard M 1.6 70/75', '303', 'Variview Standard M 1.6 70/75');
And the result:
I would like the rows merged into one:
if LineType = 'F' then I want the values from ProductFrame and Frame
if LineType = 'L' then I want the values from ProductLeftLens and LeftLens
if LineType = 'R' then I want the values from ProductRightLens and RightLens
Here I've done the merge by hand on my demo data.
This is on SQL Server 2017.

This should do the trick:
SELECT
    c.CustomerID,
    SubscriptionId = MAX(c.SubscriptionId),
    OpusNr = MAX(c.OpusNr),
    ProductFrame = MAX(CASE c.LineType WHEN 'F' THEN c.ProductFrame END),
    Frame = MAX(c.Frame),
    ProductLeftLens = MAX(CASE c.LineType WHEN 'L' THEN c.ProductLeftLens END),
    LeftLens = MAX(c.LeftLens),
    -- The right lens comes from the 'R' row, not the 'L' row
    ProductRightLens = MAX(CASE c.LineType WHEN 'R' THEN c.ProductRightLens END),
    RightLens = MAX(c.RightLens)
FROM @Customers AS c
GROUP BY c.CustomerID, c.SubscriptionId, c.OpusNr;
Returns:
CustomerID SubscriptionId OpusNr     ProductFrame Frame                ProductLeftLens LeftLens                      ProductRightLens RightLens
---------- -------------- ---------- ------------ -------------------- --------------- ----------------------------- ---------------- -----------------------------
17762697   270387         6214005562 304          GG0550O/006/5316/140 101             Variview Standard M 1.6 70/75 303              Variview Standard M 1.6 70/75
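The query above takes Frame, LeftLens and RightLens with a plain MAX, which is fine for the sample because all three rows carry identical text. A minimal sketch, assuming those columns can differ per row and should come only from the row with the matching LineType (the reduced table @Lines and its values are hypothetical, made up for illustration):

```sql
-- Hypothetical reduced sample: Frame differs per row, so the LineType filter matters.
DECLARE @Lines TABLE
(
    CustomerID   varchar(200),
    LineType     varchar(1),
    ProductFrame varchar(200),
    Frame        varchar(200)
);

INSERT INTO @Lines VALUES
    ('17762697', 'F', '304', 'frame-from-F-row'),
    ('17762697', 'L', '101', 'frame-from-L-row');

SELECT
    l.CustomerID,
    -- Both columns are read only from the row whose LineType is 'F';
    -- every other row contributes NULL, which MAX ignores.
    ProductFrame = MAX(CASE l.LineType WHEN 'F' THEN l.ProductFrame END),
    Frame        = MAX(CASE l.LineType WHEN 'F' THEN l.Frame END)
FROM @Lines AS l
GROUP BY l.CustomerID;
```

With this data a plain MAX(l.Frame) could pick up the value from the 'L' row instead, so the CASE filter is the safer choice whenever the text columns are not guaranteed to match across rows.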

Related

Unpivot Columns with Most Recent Record

Student records are updated per subject, each with its own update date. A student can be enrolled in one or multiple subjects. I would like to get, for each student and subject, the record with the most recent update date and its status.
CREATE TABLE Student
(
StudentID int,
FirstName varchar(100),
LastName varchar(100),
FullAddress varchar(100),
CityState varchar(100),
MathStatus varchar(100),
MUpdateDate datetime2,
ScienceStatus varchar(100),
SUpdateDate datetime2,
EnglishStatus varchar(100),
EUpdateDate datetime2
);
The desired query output is shown below. I am using a CTE method but am trying to find an alternative, better way.
SELECT StudentID, FirstName, LastName, FullAddress, CityState, [SubjectStatus], UpdateDate
FROM Student
;WITH original AS
(SELECT * FROM Student)
,Math as
(
SELECT DISTINCT o.StudentID, o.FirstName, o.LastName, o.FullAddress, o.CityState,
ROW_NUMBER() OVER (PARTITION BY o.StudentID, _o.MathStatus ORDER BY _o.MUpdateDate DESC) as rn
, _o.MathStatus as SubjectStatus, _o.MUpdateDate as UpdateDate
FROM original as o
left join original as _o on o.StudentID = _o.StudentID
where _o.MathStatus is not null and _o.MUpdateDate is not null
)
)
,Science AS
(
...--Same as Math
)
,English AS
(
...--Same As Math
)
SELECT * FROM Math WHERE rn = 1
UNION
SELECT * FROM Science WHERE rn = 1
UNION
SELECT * FROM English WHERE rn = 1
First: storing data in a denormalized form is not recommended. Some data model redesign might be in order. There are multiple resources about data normalization available on the web, like this one.
Now then, I made some guesses about how your source table is populated based on the query you wrote. I generated some sample data that could show how the source data is created. Besides that I also reduced the number of columns to reduce my typing efforts. The general approach should still be valid.
Sample data
create table Student
(
StudentId int,
StudentName varchar(15),
MathStat varchar(5),
MathDate date,
ScienceStat varchar(5),
ScienceDate date
);
insert into Student (StudentID, StudentName, MathStat, MathDate, ScienceStat, ScienceDate) values
(1, 'John Smith', 'A', '2020-01-01', 'B', '2020-05-01'),
(1, 'John Smith', 'A', '2020-01-01', 'B+', '2020-06-01'), -- B for Science was updated to B+ month later
(2, 'Peter Parker', 'F', '2020-01-01', 'A', '2020-05-01'),
(2, 'Peter Parker', 'A+', '2020-03-01', 'A', '2020-05-01'), -- Spider-Man would never fail Math, fixed...
(3, 'Tom Holland', null, null, 'A', '2020-05-01'),
(3, 'Tom Holland', 'A-', '2020-07-01', 'A', '2020-05-01'); -- Tom was sick for Math, but got a second chance
Solution
Your question title already contains the word unpivot. That word actually exists in T-SQL as a keyword. You can learn about the unpivot keyword in the documentation. Your own solution already contains common table expressions, so these constructs should look familiar.
Steps:
cte_unpivot = unpivot all rows, create a Subject column and place the corresponding values (SubjectStat, Date) next to it with a case expression.
cte_recent = number the rows to find the most recent row per student and subject.
Select only those most recent rows.
This gives:
with cte_unpivot as
(
select up.StudentId,
up.StudentName,
case up.[Subject]
when 'MathStat' then 'Math'
when 'ScienceStat' then 'Science'
end as [Subject],
up.SubjectStat,
case up.[Subject]
when 'MathStat' then up.MathDate
when 'ScienceStat' then up.ScienceDate
end as [Date]
from Student s
unpivot ([SubjectStat] for [Subject] in ([MathStat], [ScienceStat])) up
),
cte_recent as
(
select cu.StudentId, cu.StudentName, cu.[Subject], cu.SubjectStat, cu.[Date],
row_number() over (partition by cu.StudentId, cu.[Subject] order by cu.[Date] desc) as [RowNum]
from cte_unpivot cu
)
select cr.StudentId, cr.StudentName, cr.[Subject], cr.SubjectStat, cr.[Date]
from cte_recent cr
where cr.RowNum = 1;
Result
StudentId StudentName Subject SubjectStat Date
----------- --------------- ------- ----------- ----------
1 John Smith Math A 2020-01-01
1 John Smith Science B+ 2020-06-01
2 Peter Parker Math A+ 2020-03-01
2 Peter Parker Science A 2020-05-01
3 Tom Holland Math A- 2020-07-01
3 Tom Holland Science A 2020-05-01
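As an alternative to the UNPIVOT keyword, the same reshaping is often written with CROSS APPLY over a VALUES list, which keeps each status next to its own date without the extra CASE expressions. A minimal sketch, assuming the sample Student table from this answer has already been created:

```sql
-- Assumes the sample Student table (StudentId, StudentName, MathStat, MathDate,
-- ScienceStat, ScienceDate) defined earlier in this answer.
WITH cte_unpivot AS
(
    SELECT s.StudentId, s.StudentName, v.[Subject], v.SubjectStat, v.[Date]
    FROM Student AS s
    CROSS APPLY (VALUES
        ('Math',    s.MathStat,    s.MathDate),
        ('Science', s.ScienceStat, s.ScienceDate)
    ) AS v ([Subject], SubjectStat, [Date])
    -- UNPIVOT drops NULL values implicitly; a VALUES list does not, so filter here.
    WHERE v.SubjectStat IS NOT NULL
),
cte_recent AS
(
    SELECT cu.StudentId, cu.StudentName, cu.[Subject], cu.SubjectStat, cu.[Date],
           ROW_NUMBER() OVER (PARTITION BY cu.StudentId, cu.[Subject]
                              ORDER BY cu.[Date] DESC) AS RowNum
    FROM cte_unpivot AS cu
)
SELECT cr.StudentId, cr.StudentName, cr.[Subject], cr.SubjectStat, cr.[Date]
FROM cte_recent AS cr
WHERE cr.RowNum = 1;
```

Adding another subject then only requires one more line in the VALUES list. For the sample data this should select the same most recent rows as the UNPIVOT version.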

Min date flag in select

I have a table with records for sales of products.
For the purpose of sales count a product should only be counted one time.
In this scenario a product can be sold and reversed several times, and we should only count it in the month with the minimum date; all the other dates should be marked N.
Example:
Product Month Sales flag
A Jan-01 Y
B Jan-01 Y
A Feb-01 N
C Feb-01 Y
How can I write a select from the table indicating as above. Any help would be appreciated.
Tried and failed.
The trick here is that ordering by "Jan-01", "Feb-01", etc. is awkward because the months are stored as text and do not sort chronologically. This is one of the uses of a calendar table or date dimension. In my solution below I'm creating an on-the-fly date dimension table with a month number you can sort by:
-- Sample data
DECLARE @table TABLE
(
    Product CHAR(1) NOT NULL,
    Mo CHAR(6) NOT NULL
);
INSERT @table VALUES
('A', 'Jan-01'),
('B', 'Jan-01'),
('A', 'Feb-01'),
('C', 'Feb-01');
-- Solution
SELECT f.Product, f.Mo, [Sales Flag] = CASE f.rnk WHEN 1 THEN 'Y' ELSE 'N' END
FROM
(
    SELECT t.Product, i.Mo, rnk = ROW_NUMBER() OVER (PARTITION BY t.Product ORDER BY i.RN)
    FROM @table AS t
    JOIN
    (
        -- On-the-fly month dimension: RN = 1..12, Mo = 'Jan-01'..'Dec-01'
        SELECT i.RN, Mo = LEFT(DATENAME(MONTH,DATEADD(MONTH, i.RN-1, '20010101')),3)+'-01'
        FROM
        (
            SELECT RN = ROW_NUMBER() OVER (ORDER BY (SELECT 1))
            FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS x(x)
        ) AS i
    ) AS i ON t.Mo = i.Mo
) AS f;
Returns:
Product Mo Sales Flag
------- ------ ----------
A Jan-01 Y
A Feb-01 N
B Jan-01 Y
C Feb-01 Y
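If building a month table feels heavy for this, the month number can also be derived directly from the three-letter abbreviation. A minimal sketch, assuming English abbreviations spelled exactly as in the sample ("Jan", "Feb", ...), that the "-01" suffix can be ignored for ordering, and a table variable mirroring the sample data:

```sql
DECLARE @table TABLE
(
    Product CHAR(1) NOT NULL,
    Mo      CHAR(6) NOT NULL
);

INSERT @table VALUES
('A', 'Jan-01'),
('B', 'Jan-01'),
('A', 'Feb-01'),
('C', 'Feb-01');

SELECT t.Product,
       t.Mo,
       -- The earliest month per product gets 'Y', every later month 'N'.
       [Sales Flag] = CASE WHEN ROW_NUMBER() OVER (PARTITION BY t.Product
                                                   ORDER BY m.MonthNo) = 1
                           THEN 'Y' ELSE 'N' END
FROM @table AS t
CROSS APPLY
(
    -- 'Jan' is found at position 1, 'Feb' at 4, ..., 'Dec' at 34;
    -- (position + 2) / 3 with integer division maps these to 1..12.
    SELECT MonthNo = (CHARINDEX(LEFT(t.Mo, 3),
                                'JanFebMarAprMayJunJulAugSepOctNovDec') + 2) / 3
) AS m;
```

An unrecognised abbreviation yields MonthNo = 0 here and would sort first, so real data may need validation before relying on this.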

Cumulative sum with group by and join

I'm struggling a little to find a clean way to do this. Assume that I have the following records in my table named Records:
|Name| |InsertDate| |Size|
john 30.06.2015 1
john 10.01.2016 10
john 12.01.2016 100
john 05.03.2016 1000
doe 01.01.2016 1
How do I get the records for the year 2016 where the month is less than or equal to 3, grouped by month (even if that month does not exist in the data, e.g. month 2 in this case), with the cumulative sum of Size up to and including that month? I want to get the following result:
|Name| |Month| |Size|
john 1 111
john 2 111
john 3 1111
doe 1 1
As other commenters have already stated, you simply need a table with dates in that you can join from to give you the dates that your source table does not have records for:
-- Build the source data table.
declare @t table(Name nvarchar(10)
                ,InsertDate date
                ,Size int
                );
insert into @t values
 ('john','20150630',1   )
,('john','20160110',10  )
,('john','20160112',100 )
,('john','20160305',1000)
,('doe' ,'20160101',1   );
-- Specify the year you want to search for by storing the first day here.
declare @year date = '20160101';
-- This derived table builds a set of dates that you can join from.
-- LEFT JOINing from here is what gives you rows for months without records in your source data.
with Dates
as
(
    select @year as MonthStart
          ,dateadd(day,-1,dateadd(month,1,@year)) as MonthEnd
    union all
    select dateadd(month,1,MonthStart)
          ,dateadd(day,-1,dateadd(month,2,MonthStart))
    from Dates
    where dateadd(month,1,MonthStart) < dateadd(yyyy,1,@year)
)
select t.Name
      ,d.MonthStart
      ,sum(t.Size) as Size
from Dates d
    left join @t t
        on(t.InsertDate <= d.MonthEnd)
where d.MonthStart <= '20160301' -- Without knowing what your logic is for specifying values only up to March, I have left this part for you to automate.
group by t.Name
        ,d.MonthStart
order by t.Name
        ,d.MonthStart;
If you have a static date reference table in your database, you don't need to do the derived table creation and can just do:
select d.DateValue
,<Other columns>
from DatesReferenceTable d
left join <Other Tables> o
on(d.DateValue = o.AnyDateColumn)
etc
Here's another approach that utilizes a tally table (aka numbers table) to create the date table. Note my comments.
-- Build the source data table.
declare @t table(Name nvarchar(10), InsertDate date, Size int);
insert into @t values
 ('john','20150630',1   )
,('john','20160110',10  )
,('john','20160112',100 )
,('john','20160305',1000)
,('doe' ,'20160101',1   );
-- A year is fine, don't need a date data type
declare @year smallint = 2016;
WITH -- dummy rows for a tally table:
E AS (SELECT E FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(e)),
dateRange(totalDays, mn, mx) AS -- Get the range and number of months to create
(
    SELECT DATEDIFF(MONTH, MIN(InsertDate), MAX(InsertDate)), MIN(InsertDate), MAX(InsertDate)
    FROM @t
),
iTally(N) AS -- Tally Oh! Create an inline Tally (aka numbers) table starting with 0
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1))-1
    FROM E a CROSS JOIN E b CROSS JOIN E c CROSS JOIN E d
),
RunningTotal AS -- perform a running total by year/month for each person (Name)
(
    SELECT
        yr = YEAR(DATEADD(MONTH, n, mn)),
        mo = MONTH(DATEADD(MONTH, n, mn)),
        Name,
        Size = SUM(Size) OVER
            (PARTITION BY Name ORDER BY YEAR(DATEADD(MONTH, n, mn)), MONTH(DATEADD(MONTH, n, mn)))
    FROM iTally
    CROSS JOIN dateRange
    LEFT JOIN @t ON MONTH(InsertDate) = MONTH(DATEADD(MONTH, n, mn))
    WHERE N <= totalDays
) -- Final output will only return rows where the year matches @year:
SELECT
    name = ISNULL(name, LAG(Name, 1) OVER (ORDER BY yr, mo)),
    yr, mo,
    size = ISNULL(Size, LAG(Size, 1) OVER (ORDER BY yr, mo))
FROM RunningTotal
WHERE yr = @year
GROUP BY yr, mo, name, size;
Results:
name yr mo size
---------- ----------- ----------- -----------
doe 2016 1 1
john 2016 1 111
john 2016 2 111
john 2016 3 1111

TSQL query to return values from a table where there are multiple rows with same ID into a single row but each unique value in a different column

I'm trying to return values from a table so that I get 1 row per purchaseID and return multiple columns with Buyers First and Last Names.
E.G
I have a table with the following Data
| PurchaseID | FirstName | LastName |
|------------|-----------|----------|
| 1          | Joe       | Smith    |
| 1          | Peter     | Pan      |
| 2          | Max       | Power    |
| 2          | Jack      | Frost    |
I'm trying to write a query that returns the values like so
| PurchaseID | Buyer1FirstName | Buyer1LastName | Buyer2FirstName | Buyer2LastName |
|------------|-----------------|----------------|-----------------|----------------|
| 1          | Joe             | Smith          | Peter           | Pan            |
| 2          | Max             | Power          | Jack            | Frost          |
I've been looking online but because I'm not sure how to explain in words what I want to do, I'm not having much luck. I'm hoping with a more visual explanation someone could point me in the right direction.
Any help would be awesome.
You can use ROW_NUMBER as shown below:
DECLARE @Tbl TABLE (PurchaseID INT, FirstName VARCHAR(50), LastName VARCHAR(50))
INSERT INTO @Tbl
VALUES
(1, 'Joe', 'Smith'),
(1, 'Peter', 'Pan'),
(2, 'Max', 'Power'),
(2, 'Jack', 'Frost'),
(2, 'Opss', 'Sspo')
;WITH CTE
AS
(
    SELECT
        *, ROW_NUMBER() OVER (PARTITION BY PurchaseID ORDER BY PurchaseID) RowId
    FROM @Tbl
)
SELECT
    A.PurchaseID,
    MIN(CASE WHEN A.RowId = 1 THEN A.FirstName END) Buyer1FirstName,
    MIN(CASE WHEN A.RowId = 1 THEN A.LastName END) Buyer1LastName,
    MIN(CASE WHEN A.RowId = 2 THEN A.FirstName END) Buyer2FirstName,
    MIN(CASE WHEN A.RowId = 2 THEN A.LastName END) Buyer2LastName,
    MIN(CASE WHEN A.RowId = 3 THEN A.FirstName END) Buyer3FirstName,
    MIN(CASE WHEN A.RowId = 3 THEN A.LastName END) Buyer3LastName,
    MIN(CASE WHEN A.RowId = 4 THEN A.FirstName END) Buyer4FirstName,
    MIN(CASE WHEN A.RowId = 4 THEN A.LastName END) Buyer4LastName
FROM
CTE A
GROUP BY
A.PurchaseID
Result:
PurchaseID Buyer1FirstName Buyer1LastName Buyer2FirstName Buyer2LastName Buyer3FirstName Buyer3LastName Buyer4FirstName Buyer4LastName
----------- ------------------- -------------------- -------------------- ------------------ ------------------- ----------------- ------------------- --------------
1 Joe Smith Peter Pan NULL NULL NULL NULL
2 Max Power Jack Frost Opss Sspo NULL NULL
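One caveat: ordering ROW_NUMBER by the same column it is partitioned by leaves the order within each purchase undefined, so which buyer ends up as "Buyer1" is not guaranteed. A minimal sketch, assuming the table had a column that records buyer order (BuyerSeq is a hypothetical column, not part of the question's table):

```sql
-- BuyerSeq is a hypothetical ordering column added for illustration.
DECLARE @Tbl TABLE (PurchaseID INT, BuyerSeq INT, FirstName VARCHAR(50), LastName VARCHAR(50));

INSERT INTO @Tbl VALUES
(1, 1, 'Joe',   'Smith'),
(1, 2, 'Peter', 'Pan'),
(2, 1, 'Max',   'Power'),
(2, 2, 'Jack',  'Frost');

WITH CTE AS
(
    -- Ordering by BuyerSeq makes the buyer numbering deterministic.
    SELECT *, ROW_NUMBER() OVER (PARTITION BY PurchaseID ORDER BY BuyerSeq) AS RowId
    FROM @Tbl
)
SELECT
    A.PurchaseID,
    MIN(CASE WHEN A.RowId = 1 THEN A.FirstName END) AS Buyer1FirstName,
    MIN(CASE WHEN A.RowId = 1 THEN A.LastName  END) AS Buyer1LastName,
    MIN(CASE WHEN A.RowId = 2 THEN A.FirstName END) AS Buyer2FirstName,
    MIN(CASE WHEN A.RowId = 2 THEN A.LastName  END) AS Buyer2LastName
FROM CTE AS A
GROUP BY A.PurchaseID;
```

Any column or combination that is unique within a PurchaseID (an identity column, a timestamp) would serve the same purpose.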

Full Outer Self Join [duplicate]

The problem is to return the rows which contain nulls as well. Below is SQL code to create table and populate it with sample data.
I'm expecting below, but query does not show the two rows with null values.
src_t1 id1_t1 id2_t1 val_t1 src_t2 id1_t2 id2_t2 val_t2
                            b      z      z      4
a      w      w      100    b      w      w      1
a      x      x      200    b      x      x      2
a      y      y      300
Data:
CREATE TABLE sample (
src VARCHAR(6)
,id1 VARCHAR(6)
,id2 VARCHAR(6)
,val FLOAT
);
INSERT INTO sample (src, id1, id2, val)
VALUES ('a', 'w', 'w', 100)
,('b', 'w', 'w', 1)
,('a', 'x', 'x', 200)
,('b', 'x', 'x', 2)
,('a', 'y', 'y', 300)
,('b', 'z', 'z', 4)
;
This is my test query. It does not show results when t1.src = 'a' and t1.id1 = 'y' or when t2.src = 'b' and t2.id1 = 'z'.
Why?
What's the correct query?
SELECT t1.src, t1.id1, t1.id2, t1.val
,t2.src as src2, t2.id1, t2.id2, t2.val
FROM sample t1 FULL OUTER JOIN sample t2
ON t1.id1 = t2.id1 AND t1.id2 = t2.id2
WHERE (t1.src = 'a' AND t2.src = 'b')
OR (t1.src IS NULL AND t1.id1 IS NULL AND t1.id2 IS NULL)
OR (t2.src IS NULL AND t2.id1 IS NULL AND t2.id2 IS NULL)
I've also tried moving the conditions in the WHERE clause to the ON clause as well.
TIA.
The WHERE clause evaluates too late, effectively converting your query into an inner join.
Instead, write your query like this using proper JOIN syntax:
SELECT t1.src, t1.id1, t1.id2, t1.val
,t2.src as src2, t2.id1, t2.id2, t2.val
FROM (
select * from sample
where src='a'
) t1 FULL OUTER JOIN (
select * from sample
where src='b'
) t2
ON t1.id1 = t2.id1 AND t1.id2 = t2.id2
yielding this result set:
src id1 id2 val src2 id1 id2 val
---- ---- ---- ----------- ---- ---- ---- -----------
a w w 100 b w w 1
a x x 200 b x x 2
NULL NULL NULL NULL b z z 4
a y y 300 NULL NULL NULL NULL
Update:
Note also the use of two sub-queries to clearly separate the source table into two distinct relvars. I missed this for a minute on my first submission.
Actually, I think the solution is a bit cleaner if a CTE is used:
WITH A AS (
select * from sample where src='a'
),
B AS (
select * from sample where src='b'
)
SELECT *
FROM A FULL OUTER JOIN B
ON A.ID1 = B.ID1 AND A.ID2 = B.ID2
;
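To see why the query in the question loses the y and z rows, it can help to look at what the unfiltered self join produces for them. A minimal sketch, assuming the sample table from the question exists (the aliases id1_2 and val2 are only there to keep the output column names distinct):

```sql
-- Assumes the "sample" table and data from the question.
SELECT t1.src, t1.id1, t1.val,
       t2.src AS src2, t2.id1 AS id1_2, t2.val AS val2
FROM sample AS t1
FULL OUTER JOIN sample AS t2
    ON t1.id1 = t2.id1 AND t1.id2 = t2.id2
WHERE t1.id1 IN ('y', 'z') OR t2.id1 IN ('y', 'z');
-- In a self join every row matches at least itself, so 'y' pairs with
-- itself (a/y with a/y) and 'z' pairs with itself (b/z with b/z).
-- Neither side is ever NULL, and neither pair has src 'a' on the left with
-- src 'b' on the right, so the WHERE clause in the question rejects both rows.
```

Filtering each side to a single src before joining, as in the sub-query and CTE versions, is what allows unmatched rows to exist and be NULL-extended.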