Omit duplicate values in one column and show the rest - select

I am starting with SQL, so my question will be easy for you.
I have one table with many columns. The most important are [Test number] and [Start time]. [Test number] contains many duplicate values, while [Start time] is always unique. I want the result to show all table columns, but for each [Test number] only the row with the maximum [Start time].
I was able to write a query that returns the maximum [Start time] per [Test number]:
SELECT MAX([Start time])
FROM [Testing_ADB_Overview].[dbo].[HTT_DB]
GROUP BY [Test number]
ORDER BY MAX([Start time])
but I do not know how I can apply it to show all the other columns in the results. Can you please help me?
MF

Try to get the [Test number] for the last [Start time]. You can do it in two ways:
1. Store the result of the max from your select in a variable (@MaxStartTime). Query the [Test number] with the where condition [Start time] = @MaxStartTime and store it in another variable (@LastTestNumber). Then retrieve the rows from your table corresponding to @LastTestNumber.
2. Use the ROW_NUMBER() function to number the rows of each [Test number] in descending order of the [Start time] column. Filtering for row_number() = 1 then gives you the latest [Start time] per [Test number], which you can join back to the table to retrieve the full rows.
I'm not that fond of the 1st solution, so I will skip its implementation, but here is an implementation of the 2nd solution using a common table expression to make it more readable:
IF OBJECT_ID('tempdb.dbo.#HTT_DB') IS NOT NULL DROP TABLE #HTT_DB
CREATE TABLE #HTT_DB
( [Test number] NVARCHAR(100)
, [Start time] NVARCHAR(100)
)
INSERT INTO #HTT_DB( [Test number], [Start time] )
VALUES
('AA' , '210525_090000')
,('BB', '210525_080000')
,('CC', '210525_070000')
,('BB', '210525_060000')
,('AA', '210525_050000')
;WITH OrderedTestNumber AS (SELECT [Test number], [Start time], ROW_NUMBER() OVER ( PARTITION BY [Test number] ORDER BY [Start time] DESC) RowNo FROM #HTT_DB)
SELECT htt.*
FROM #HTT_DB htt INNER JOIN OrderedTestNumber otm ON otm.[Test number] = htt.[Test number] AND otm.[Start time] = htt.[Start time]
WHERE otm.RowNo = 1
Output:
Test number Start time
----------- -------------
AA          210525_090000
BB          210525_080000
CC          210525_070000
I would strongly advise adding a primary key column to the table [HTT_DB], let's say PK_Column for the sake of simplicity. The query would then be simplified to:
;WITH OrderedTestNumber AS (SELECT PK_Column, ROW_NUMBER() OVER ( PARTITION BY [Test number] ORDER BY [Start time] DESC) RowNo FROM #HTT_DB)
SELECT htt.*
FROM #HTT_DB htt INNER JOIN OrderedTestNumber otm ON otm.PK_Column = htt.PK_Column
WHERE otm.RowNo = 1
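For comparison, the same ROW_NUMBER() idea can be tried without the join back to the table by selecting every column inside the CTE and filtering on RowNo directly. The following is only a minimal sketch: it uses Python's built-in sqlite3 module (SQLite 3.25 or later, for window functions) as a stand-in for SQL Server, the [Result] column is a hypothetical extra column, and it assumes the [Start time] strings sort chronologically as text.

```python
import sqlite3

# In-memory stand-in for [HTT_DB]; the [Result] column is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HTT_DB ([Test number] TEXT, [Start time] TEXT, [Result] TEXT)"
)
conn.executemany(
    "INSERT INTO HTT_DB VALUES (?, ?, ?)",
    [
        ("AA", "210525_090000", "pass"),
        ("BB", "210525_080000", "fail"),
        ("CC", "210525_070000", "pass"),
        ("BB", "210525_060000", "pass"),
        ("AA", "210525_050000", "fail"),
    ],
)

# Number the rows of each [Test number] from newest to oldest,
# then keep only the newest row (RowNo = 1) with all its columns.
latest_rows = conn.execute(
    """
    WITH Ordered AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY [Test number]
                   ORDER BY [Start time] DESC
               ) AS RowNo
        FROM HTT_DB
    )
    SELECT [Test number], [Start time], [Result]
    FROM Ordered
    WHERE RowNo = 1
    ORDER BY [Test number]
    """
).fetchall()

for row in latest_rows:
    print(row)
```

Because the CTE already carries every column, no primary key is needed for this variant; the trade-off is that the helper column RowNo has to be left out of the final select list by hand.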

Related

Min date flag in select

I have a table with records for sales of products.
For the purpose of sales count a product should only be counted one time.
In this scenario a product can be sold and reversed several times. We should only count it in the month with the minimum date, and all other dates should be flagged N.
Example:
Product Month Sales flag
A Jan-01 Y
B Jan-01 Y
A Feb-01 N
C Feb-01 Y
How can I write a select from the table that produces the flag as shown above? Any help would be appreciated.
Tried and failed.
The difficulty here is that ordering by "Jan-01", "Feb-01", etc. is awkward, because month names stored as text sort alphabetically rather than chronologically. This is one of the uses of a calendar table or date dimension. In my solution below I'm creating an on-the-fly date dimension table with a "month number" column (RN) you can sort by:
-- Sample data
DECLARE @table TABLE
(
Product CHAR(1) NOT NULL,
Mo CHAR(6) NOT NULL
)
INSERT @table VALUES
('A', 'Jan-01'),
('B', 'Jan-01'),
('A', 'Feb-01'),
('C', 'Feb-01');
-- Solution
SELECT f.Product, f.Mo, [Sales Flag] = CASE f.rnk WHEN 1 THEN 'Y' ELSE 'N' END
FROM
(
SELECT t.Product, i.Mo, rnk = ROW_NUMBER() OVER (PARTITION BY t.Product ORDER BY i.RN)
FROM @table AS t
JOIN
(
SELECT i.RN, Mo = LEFT(DATENAME(MONTH,DATEADD(MONTH, i.RN-1, '20010101')),3)+'-01'
FROM
(
SELECT RN = ROW_NUMBER() OVER (ORDER BY (SELECT 1))
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS x(x)
) AS i
) AS i ON t.Mo = i.Mo
) AS f;
Returns:
Product Mo Sales Flag
------- ------ ----------
A Jan-01 Y
A Feb-01 N
B Jan-01 Y
C Feb-01 Y
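The same idea can be sketched with an explicit month lookup table instead of a generated one. This is only an illustration using Python's built-in sqlite3 module (SQLite 3.25 or later) rather than SQL Server; the table names sales and months are hypothetical, and, like the query above, it assumes the "-01" suffix is constant so that only the month abbreviation decides the order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sample data from the question (table name "sales" is hypothetical).
conn.execute("CREATE TABLE sales (Product TEXT, Mo TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A", "Jan-01"), ("B", "Jan-01"), ("A", "Feb-01"), ("C", "Feb-01")],
)

# Small lookup table: month abbreviation -> sortable month number.
abbreviations = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                 "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
conn.execute("CREATE TABLE months (Abbr TEXT PRIMARY KEY, MonthNo INTEGER)")
conn.executemany(
    "INSERT INTO months VALUES (?, ?)",
    [(abbr, number) for number, abbr in enumerate(abbreviations, start=1)],
)

# The earliest month of each product gets 'Y', every later month gets 'N'.
flags = conn.execute(
    """
    SELECT s.Product,
           s.Mo,
           CASE ROW_NUMBER() OVER (PARTITION BY s.Product ORDER BY m.MonthNo)
                WHEN 1 THEN 'Y' ELSE 'N'
           END AS SalesFlag
    FROM sales AS s
    JOIN months AS m ON m.Abbr = substr(s.Mo, 1, 3)
    ORDER BY s.Product, m.MonthNo
    """
).fetchall()

for row in flags:
    print(row)
```

If the data spans more than one year, the lookup would have to include the year as well, otherwise a December of one year would sort after a January of the next.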

Summation at individual and overall level

I have something as
Looking for an output as
I was trying with RollUp, cube, Grouping Set but nothing seems to be fitting properly.
Here is my unsuccessful attempt:
declare @t table(
[Employee Name] varchar(50),Bucket int,
[Start Inventory No] int ,[Start Inventory Amount] int,
[No Of Promise to Pay] int,[Promise to Pay Amount] int)
insert into @t
select 'A', 0,10,10000,3,100 union all
select 'A', 1,20,20000,7,500 union all
select 'B', 0,45,90000,4,200 union all
select 'B', 1,12,70000,6,600 union all
select 'c', 0,16,19000,1,500 union all
select 'c', 1,56,9000,10,2500
select
[Employee Name]
,Bucket=case when x.rn= 11 then 'total' else Bucket end
,[Start Inventory No]= case when x.rn= 11 then sum([Start Inventory No]) else [Start Inventory No] end
from
(select
rn=ROW_NUMBER() Over(partition by [Employee Name] order by (select 1)),
*
from @t
GROUP BY
Rollup
([Employee Name] ,Bucket,[Start Inventory No],[Start Inventory Amount],[No Of Promise to Pay],
[Promise to Pay Amount]))X where x.Rn in (1,6,11)
group by [Employee Name]
,Bucket, rn
This should be done with a pivot table on the client, not on the server.
If for some reason you do want to get to the second table from the first, I would do it as
select
case when grouping(fake_column) = 1 then null else [Employee Name] end as [Employee Name],
case when grouping([Employee Name]) = 1 and grouping(fake_column) = 1 then 'Grand Total' when grouping(fake_column) = 1 then 'Total' else cast(sum(Bucket) as varchar) end as Bucket,
sum([Start Inventory No]) as [Start Inventory No],
sum([Start Inventory Amount]) as [Start Inventory Amount],
sum([No Of Promise to Pay]) as [No Of Promise to Pay],
sum([Promise to Pay Amount]) as [Promise to Pay Amount]
from
(select *, row_number() over(partition by [Employee Name] order by 1/0) as fake_column from @t) data
group by
rollup([Employee Name], fake_column)
;
The idea is that you make each row unique by introducing a fake column, and include that column in the grouping, so that the original rows come out as 'grouped' results too (each 'group' contains one row due to the unique number).
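To make the three grouping levels visible, here is a rough emulation rather than the rollup query itself. It uses Python's built-in sqlite3 module, and since SQLite has no ROLLUP, the levels that rollup([Employee Name], fake_column) yields (original rows, per-employee totals, grand total) are stitched together with UNION ALL. The table name t, the helper columns sort_name and lvl, and the restriction to a single measure column are all assumptions made to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced copy of the sample data: only one measure column is kept.
conn.execute(
    "CREATE TABLE t ([Employee Name] TEXT, Bucket INTEGER, [Start Inventory No] INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("A", 0, 10), ("A", 1, 20), ("B", 0, 45),
     ("B", 1, 12), ("c", 0, 16), ("c", 1, 56)],
)

rows = conn.execute(
    """
    SELECT [Employee Name], Bucket, [Start Inventory No]
    FROM (
        -- Level 0: the original rows (each is its own "group" in the rollup).
        SELECT [Employee Name],
               CAST(Bucket AS TEXT) AS Bucket,
               [Start Inventory No],
               [Employee Name] AS sort_name,
               0 AS lvl
        FROM t
        UNION ALL
        -- Level 1: one total row per employee.
        SELECT NULL, 'Total', SUM([Start Inventory No]), [Employee Name], 1
        FROM t
        GROUP BY [Employee Name]
        UNION ALL
        -- Level 2: the grand total over all rows.
        SELECT NULL, 'Grand Total', SUM([Start Inventory No]), NULL, 2
        FROM t
    )
    -- Grand total last; within an employee, detail rows before the total.
    ORDER BY lvl = 2, sort_name, lvl, Bucket
    """
).fetchall()

for row in rows:
    print(row)
```

On SQL Server the rollup version above does this in a single pass over the data, so the UNION ALL form is mainly useful for seeing which rows each level contributes.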

postgres - get top category purchased by customer

I have a denormalized table with the columns:
buyer_id
order_id
item_id
item_price
item_category
I would like to return something that returns 1 row per buyer_id
buyer_id, sum(item_price), item_category
-- but ONLY for the category with the highest rank of sales along that specific buyer_id.
I can't get row_number() or partition to work because I need to order by the sum of item_price relative to item_category relative to buyer. Am I overlooking anything obvious?
You need a few layers of fudging here:
SELECT buyer_id, item_sum, item_category
FROM (
SELECT buyer_id,
rank() OVER (PARTITION BY buyer_id ORDER BY item_sum DESC) AS rnk,
item_sum, item_category
FROM (
SELECT buyer_id, sum(item_price) AS item_sum, item_category
FROM my_table
GROUP BY 1, 3) AS sub2) AS sub
WHERE rnk = 1;
In sub2 you calculate the sum of 'item_price' for each 'item_category' for each 'buyer_id'. In sub you rank these with a window function by 'buyer_id', ordering by 'item_sum' in descending order (so the highest 'item_sum' comes first). In the main query you select those rows where rnk = 1.
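The layering can be tried out on a few rows. The sketch below runs the same three-level query through Python's built-in sqlite3 module (SQLite 3.25 or later) instead of PostgreSQL; the sample rows and integer prices are made up for illustration, and the GROUP BY columns are written out by name instead of by position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE my_table (
        buyer_id INTEGER, order_id INTEGER, item_id INTEGER,
        item_price INTEGER, item_category TEXT
    )
    """
)
# Hypothetical sample rows: buyer 1 spends most on books, buyer 2 on games.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, 100, 1, 10, "books"),
        (1, 100, 2, 15, "books"),
        (1, 101, 3, 20, "toys"),
        (2, 102, 4, 5, "toys"),
        (2, 103, 5, 7, "games"),
        (2, 103, 6, 1, "toys"),
    ],
)

top_categories = conn.execute(
    """
    SELECT buyer_id, item_sum, item_category
    FROM (
        SELECT buyer_id, item_sum, item_category,
               rank() OVER (PARTITION BY buyer_id ORDER BY item_sum DESC) AS rnk
        FROM (
            -- Innermost layer: total spend per buyer and category.
            SELECT buyer_id, sum(item_price) AS item_sum, item_category
            FROM my_table
            GROUP BY buyer_id, item_category
        ) AS sub2
    ) AS sub
    WHERE rnk = 1
    ORDER BY buyer_id
    """
).fetchall()

for row in top_categories:
    print(row)
```

Note that rank() returns every tied category when two categories share the highest sum for a buyer; swapping in row_number() would force exactly one row per buyer_id, at the cost of picking one of the tied categories arbitrarily.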

Aggregates in Visual Studio Reports not showing

I'm trying to create a report that UNIONs two datasets. It takes (1) a bunch of orders for a specific customer within a date range, and UNIONs it with (headers and) (2) the method of shipping followed by the average of the time interval between the order being placed and the order being sent.
The screenshot below shows that, in SQL Server, the query works perfectly. However, when I run this exact same query in Visual Studio 2008 to create a report for this, the actual value of the average turnaround time is empty.
As far as I can tell, in SQL Server the query works perfectly for whatever parameters I give it. I just can't figure out why in the report the average turnaround time is always blank.
The query I'm running is:
DECLARE @turnaroundInfo TABLE
(
[Owner Reference] VARCHAR(48),
[Project] VARCHAR(48),
[Carrier Type] VARCHAR(48),
[Created Date] DATETIME,
[Shipped Date] DATETIME,
[Turnaround Time (hours)] INT
)
INSERT INTO @turnaroundInfo
SELECT orders.ownerReference AS [Owner Reference], p.name AS [Project], types.name AS [Carrier Type], orders.createdSysDateTime AS [Created Date], shipments.shippedDate AS [Shipped Date], DATEDIFF(HOUR, orders.createdSysDateTime, shipments.shippedDate) AS [Turnaround Time (hours)]
FROM datex_footprint.Orders orders
INNER JOIN datex_footprint.Projects p ON orders.projectId = p.id
INNER JOIN datex_footprint.CarrierServiceTypes types ON orders.preferredCarrierServiceTypeId = types.id
INNER JOIN datex_footprint.OrderLines lines ON orders.id = lines.orderId
INNER JOIN datex_footprint.Shipments shipments ON lines.shipmentId = shipments.id
WHERE p.name IN (@project) AND types.name IN(@carrier)
-- Get only the type and date-ranged turnaround info we want
DECLARE @orders TABLE
(
[Owner Reference] VARCHAR(48),
[Project] VARCHAR(48),
[Carrier Type] VARCHAR(48),
[Created Date] DATETIME,
[Shipped Date] DATETIME,
[Turnaround Time (hours)] INT
)
INSERT INTO @orders
SELECT *
FROM @turnaroundInfo
WHERE [Turnaround Time (hours)] >= 0 AND [Created Date] BETWEEN @startDate AND @endDate
ORDER BY [Turnaround Time (hours)], [Carrier Type] ;
-- UNION the relevant turnaround info with headers
SELECT * FROM @orders o /* All the orders in the date range for this project and the selected carrier(s) */
UNION ALL
SELECT 'Carrier' AS [Carrier Type], 'Avg Turnaround Time' AS [Average Turnaround], NULL AS Column3, NULL AS Column4, NULL AS Colummn5, NULL AS Column6
UNION ALL
SELECT o.[Carrier Type], CAST(AVG(o.[Turnaround Time (hours)]) AS NVARCHAR(24)) AS [Average Turnaround], NULL AS Column3, NULL AS Column4, NULL AS Colummn5, NULL AS Column6
FROM @orders o
GROUP BY o.[Carrier Type];
Does anybody know or see what I might be missing?
Any help would be appreciated!
It's not blank, it just might not be in the column you expected - I can see the value '24' in your screenshot.
I figured out what my mistake was.
The column above the header and the value 24 was sized differently from the column holding the value 24 itself. SQL Server didn't seem to care, but Visual Studio saw the size difference and dropped the whole column from the display.
After I adjusted the average value column to VARCHAR(48), which is what the column above it was sized to, it displayed properly again.

Is it possible to have "not in/exists" in one table then in another table?

I'm now maintaining a big app somebody else wrote mining some data from some big government legacy systems. Basically I need a single query result to populate a gridview that takes each part number from a Tech Order and counts matching part numbers in the Fedlog table. If none found then look in the "commercial" table. The existing query currently only looks in the Fedlog table and reads as follows:
select p.*,
(select case when count(*) > 0 then 'Y' else 'N' end as SL
from tbl_fedlog where [Part Number] = p.[Part Number]) as SL
from tbl_pcms p
where p.[Tech Order] = '0B-E0C-9' order by p.Figure, p.[Index], p.Indenture
When the result is 'N', I've got to look in the commercial table. Could I have some suggestions on the best way to go about this?
What should the result look like for the 'N' scenario - just another 'Y'/'N' answer? If so, you should be able to simply replace the 'N' expression with another scalar query against the "commercial" table.
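As a rough illustration of that suggestion, the fallback can be written as a CASE with two EXISTS checks: branches are evaluated in order, so the commercial table is only consulted when nothing was found in Fedlog. The sketch uses Python's built-in sqlite3 module as a stand-in for SQL Server; the part numbers P1 to P3 are made-up sample data, and the tables are reduced to the columns the check needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_pcms ([Part Number] TEXT, [Tech Order] TEXT);
    CREATE TABLE tbl_fedlog ([Part Number] TEXT);
    CREATE TABLE tbl_commercial ([Part Number] TEXT);

    -- Hypothetical sample data: P1 is in Fedlog, P2 only in commercial,
    -- P3 in neither table.
    INSERT INTO tbl_pcms VALUES
        ('P1', '0B-E0C-9'), ('P2', '0B-E0C-9'), ('P3', '0B-E0C-9');
    INSERT INTO tbl_fedlog VALUES ('P1');
    INSERT INTO tbl_commercial VALUES ('P2');
    """
)

rows = conn.execute(
    """
    SELECT p.[Part Number],
           CASE
               -- First look in Fedlog.
               WHEN EXISTS (SELECT 1 FROM tbl_fedlog f
                            WHERE f.[Part Number] = p.[Part Number]) THEN 'Y'
               -- Only reached when Fedlog had no match.
               WHEN EXISTS (SELECT 1 FROM tbl_commercial c
                            WHERE c.[Part Number] = p.[Part Number]) THEN 'Y'
               ELSE 'N'
           END AS SL
    FROM tbl_pcms p
    WHERE p.[Tech Order] = ?
    ORDER BY p.[Part Number]
    """,
    ("0B-E0C-9",),
).fetchall()

for row in rows:
    print(row)
```

If the grid also needs to show which table supplied the match, the two 'Y' literals could be replaced with distinct values such as a source name; that is a design choice the question leaves open.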
Here is a UNION ALL between a Fedlog query and a commercial query, that checks each for existence:
SELECT p.*
, SL = 'Y'
, part_count = (SELECT COUNT(*) FROM tbl_fedlog WHERE [Part Number] = p.[Part Number])
from tbl_pcms p
where p.[Tech Order] = '0B-E0C-9'
AND EXISTS(SELECT 1 FROM tbl_fedlog WHERE [Part Number] = p.[Part Number])
UNION ALL
SELECT p.*
, SL = 'Y'
, part_count = (SELECT COUNT(*) FROM tbl_commercial WHERE [Part Number] = p.[Part Number])
from tbl_pcms p
where p.[Tech Order] = '0B-E0C-9'
AND EXISTS(SELECT 1 FROM tbl_commercial WHERE [Part Number] = p.[Part Number])
AND NOT EXISTS(SELECT 1 FROM tbl_fedlog WHERE [Part Number] = p.[Part Number])
I would need more info to address the 'N' scenario mentioned in the other answer, though.
If the logic is to display 'Y' when the part exists in either the Fedlog or the Commercial table and 'N' otherwise, then you could try grouping & aggregating those tables separately and (outer-)joining the aggregated result sets to tbl_pcms, like this:
SELECT
p.*,
CASE WHEN COALESCE(f.PartCount, c.PartCount) IS NULL THEN 'N' ELSE 'Y' END AS SL
FROM tbl_pcms p
LEFT JOIN (
SELECT
[Part Number],
COUNT(*) AS PartCount
FROM tbl_fedlog
GROUP BY [Part Number]
) f ON p.[Part Number] = f.[Part Number]
LEFT JOIN (
SELECT
[Part Number],
COUNT(*) AS PartCount
FROM tbl_commercial
GROUP BY [Part Number]
) c ON p.[Part Number] = c.[Part Number]
WHERE p.[Tech Order] = '0B-E0C-9'
ORDER BY
p.Figure,
p.[Index],
p.Indenture