Need better summation select statement in postgres function - postgresql

I've got two tables in my database. One of them, 'orders', contains a set of columns with an integer which represents what the order should contain (like 5 of A and 15 of B). The second table, 'production_work', contains those same order columns, and a date, so whenever somebody completes part of an order, I track it.
So now i need a fast way to know which orders are completed, and I'm hoping to avoid a 'completed' table on the first column as orders are editable and it's just more logic to keep correct.
This query works, but it's horribly written. What's a better way to do this? There are actually 12 of these columns that go into this query...I'm just showing 3 of them for the example.
SELECT *
FROM orders o
WHERE ud = (SELECT SUM(ud) FROM production_work WHERE order_id = o.ident)
AND dp = (SELECT SUM(dp) FROM production_work WHERE order_id = o.ident)
AND swrv = (SELECT SUM(swrv) FROM production_work WHERE order_id = o.ident)

select o.*
from
orders o
inner join
(
select order_id, sum(ud) as ud, sum(dp) as dp, sum(swrv) as swrv
from production_work
group by order_id
) pw on o.ident = pw.order_id
where
o.ud = pw.ud
and o.dp = pw.dp
and o.swrv = pw.swrv

Related

SQL left join on maximum date

I have two tables: contracts and contract_descriptions.
On contract_descriptions there is a column named contract_id which is equal on contracts table records.
I am trying to join the latest record on contract_descriptions:
SELECT *
FROM contracts c
LEFT JOIN contract_descriptions d ON d.contract_id = c.contract_id
AND d.date_description =
(SELECT MAX(date_description)
FROM contract_descriptions t
WHERE t.contract_id = c.contract_id)
It works, but is it the performant way to do it? Is there a way to avoid the second SELECT?
You could also alternatively use DISTINCT ON:
SELECT * FROM contracts c LEFT JOIN (
SELECT DISTINCT ON (cd.contract_id) cd.* FROM contract_descriptions cd
ORDER BY cd.contract_id, cd.date_description DESC
) d ON d.contract_id = c.contract_id
DISTINCT ON selects only one row per contract_id while the sort clause cd.date_description DESC ensures that it is always the last description.
Performance depends on many values (for example, table size). In any case, you should compare both approaches with EXPLAIN.
Your query looks okay to me. One typical way to join only n rows by some order from the other table is a lateral join:
SELECT *
FROM contracts c
CROSS JOIN LATERAL
(
SELECT *
FROM contract_descriptions cd
WHERE cd.contract_id = c.contract_id
ORDER BY cd.date_description DESC
FETCH FIRST 1 ROW ONLY
) cdlast;

Delete duplicate rows based on columns

I have a table called Aircraft and there are many records. The problem is that some are duplicates. I know how to select the duplicates and their counts:
SELECT flight_id, latitude, longitude, altitude, call_sign, measurement_time, COUNT(*)
FROM Aircraft
GROUP BY flight_id, latitude, longitude, altitude, call_sign, measurement_time
HAVING COUNT(*) > 1;
This returns something like:
Now, what I need to do is remove the duplicates, leaving just one each so that when I run the query again, all counts become 1.
I know that I can use the DELETE keyword, but I'm not sure how to delete it from the SELECT.
I'm sure I am missing an easy step, but I do not want to ruin my DB being a newbie.
How do I do this?
SELECT
flight_id, latitude, longitude, altitude, call_sign, measurement_time
FROM Aircraft a
WHERE EXISTS (
SELECT * FROM Aircraft x
WHERE x.flight_id = a.flight_id
AND x.latitude = a.latitude
AND x.longitude = a.longitude
AND x.altitude = a.altitude
AND x.call_sign = a.call_sign
AND x.measurement_time = a.measurement_time
AND x.id < a.id
)
;
If the query above returns thecorrect rows (to be deleted)
you can change it into a delete statement:
DELETE
FROM Aircraft a
WHERE EXISTS (
SELECT * FROM Aircraft x
WHERE x.flight_id = a.flight_id
AND x.latitude = a.latitude
AND x.longitude = a.longitude
AND x.altitude = a.altitude
AND x.call_sign = a.call_sign
AND x.measurement_time = a.measurement_time
AND x.id < a.id
)
;
I have always used the CTE method in SQL SERVER. This allows you to define columns that you want to compare, once you have established what columns make up a duplicate, then you can assign a CTE value to it and then go back and cleanup the CTE values that are greater than 1. This is an example of duplicate checking that I do.
WITH CTE AS
(select d.UID
,d.LotKey
,d.SerialNo
,d.HotWeight
,d.MarketValue
,RN = ROW_NUMBER()OVER(PARTITION BY d.HotWeight, d.serialNo, d.MarketValue order by d.SerialNo)
from LotDetail d
where d.LotKey = ('1~20161019~305')
)
DELETE FROM CTE WHERE RN <> 1
In my example I am looking at the LotDetail table where the d.hotweight and d.serial no are matching. if there is a match then the original gets CTE 1 and any duplicates get CTE 2 or greater depending on the amount of duplicates. Then you use the last DELETE statement to clear the entries that come up as duplicate. THis is really flexible so you should be able to adapt it to your issue.
Here is an example tailored to your situation.
WITH CTE AS
(select d.Flight_ID
,d.Latitude
,d.Longitude
,d.Altitude
,d.Call_sign
,d.Measurement*
,RN = ROW_NUMBER()OVER(PARTITION BY d.Flight_ID, d.Latitude, d.Longitude, d.Altitude, d.Call_Sign, d.Measurement* order by d.SerialNo)
from Aircraft d
where d.flight_id = ('**INSERT VALUE HERE')
)
DELETE FROM CTE WHERE RN <> 1
If it's a one-time operation you can create a temp table with the same schema and then copy unique rows over like so:
insert into Aircraft_temp
select distinct on (flight_id, measurement_time) Aircraft.* from Aircraft
Then swap them out by renaming, or truncate Aircraft and copy the temp contents back (truncate Aircraft; insert into Aircraft select * from Aircraft_temp;).
Safer to rename Aircraft to Aircraft_old and Aircraft_temp to Aircraft so you keep your original data until you are sure things are correct. Or at least check that the number of rows in your count query above match the count of rows in the temp table before doing the truncate.
Update2: With a separate valid primary key (assuming it is called id) you can do a DELETE based on a self join like this:
delete from Aircraft using (
select a1.id
from Aircraft a1
left join (select flight_id, measurement_time, min(id) as id from Aircraft group by 1,2) a2
on a1.id = a2.id
where a2.id is null
) as d
where Aircraft.id=d.id
This finds the minimum id (could do max too for the "latest") for each flight and identifies all the records from the full set having an id that is not the minimum (no match in the join). The unmatched ids are deleted.

UPDATE column from one table to another

I need to update a column in a table to the latest date/time combination from another table. How can I get the latest date/time combination from the one table and then update a column with that date in another table?
The two tables I am using are called dbo.DD and dbo.PurchaseOrders. The JOIN between the two tables are dbo.DueDate.XDORD = dbo.PurchaseOrders.PBPO AND dbo.DueDate.XDLINE = dbo.PurchaseOrders.PBSEQ. The columns from dbo.DueDate that I need the latest date/time from are dbo.DueDate.XDCCTD and dbo.DueDate.XDCCTT.
I need to set dbo.PurchaseOrders.PBDUE = dbo.DueDate.XDCURDT.I can't use an ORDER BY statement in the UPDATE statement, so I'm not sure how to do this. I know row_number sometimes works in these situations, but I'm unsure of how to implement.
The general pattern is:
;WITH s AS
(
SELECT
key, -- may be multiple columns
date_col,
rn = ROW_NUMBER() OVER
(
PARTITION BY key -- again, may be multiple columns
ORDER BY date_col DESC
)
FROM dbo.SourceTable
)
UPDATE d
SET d.date_col = s.date_col
FROM dbo.DestinationTable AS d
INNER JOIN s
ON d.key = s.key -- one more time, may need multiple columns here
WHERE s.rn = 1;
I didn't try to map your table names and columns because (a) I didn't get from your word problem which table was the source and which was the destination and (b) those column names look like alphabet soup and I would have screwed them up anyway.
Did seem though that the OP got this specific code working:
;WITH s AS
(
SELECT
XDORD, XDLINE,
XDCURDT,
rn = ROW_NUMBER() OVER
(
PARTITION BY XDORD, XDLINE
ORDER BY XDCCTD DESC, XDCCTT desc
)
FROM dbo.DueDate
)
UPDATE d
SET d.PBDUE = s.XDCURDT
FROM dbo.PurchaseOrders AS d
INNER JOIN s
ON d.PBPO = s.XDORD AND d.PBSEQ = s.XDLINE
WHERE s.rn = 1;

Updating with Nested Select Statements

I have a table that holds 3 fields of data: Acct#, YMCode, and EmployeeID. The YMCode is an Int that is formatted 201308, 201307, etc. For each Acct#, I need to select the EmployeedID used for the YMCode 201308 and then update all of the other YMCodes for the Acct# to the EmployeedID used in 201308.
so for each customer account in the table...
Update MyTable
Set EmployeeID = EmployeeID used in YMCode 201308
Having a hard time with it.
Put it in a transaction and look at the results before committing, but I think this is what you want:
UPDATE b
SET EmployeeID = a.EmployeeID
FROM MyTable a
INNER JOIN MyTable b
ON a.[Acct#] = b.[Acct#]
where a.YMCode =
(SELECT MAX(YMCode) from MyTable)
To get max YMCode, just add select statement at the end.

T-SQL Problem converting a Cursor into a SET based operation

Basically I have this cursor that was not written by me but is taking some time to process and I was wanting to try and improve it by getting rid of the cursor all together.
Here is the code:
DECLARE #class_id int, #title_code varchar(30)
DECLARE title_class CURSOR FOR
SELECT DISTINCT title_code FROM tmp_business_class_titles (NOLOCK)
OPEN title_class
FETCH title_class INTO #title_code
WHILE ##FETCH_STATUS = 0
BEGIN
SELECT TOP 1 #class_id = bc1.categoryid
FROM tmp_business_class_titles bct,
dbo.Categories bc1 (nolock)
join dbo.Categories bc2 (nolock) on bc2.categoryid = bc1.highercategoryid
join dbo.Categories bc3 (nolock) on bc3.categoryid = bc2.highercategoryid
WHERE bc1.categoryid = bct.class_id
AND title_code = #title_code
ORDER BY Default_Flag DESC
UPDATE products
SET subcategoryid = #class_id
WHERE ccode = #title_code
AND spdisplaytype = 'Table'
UPDATE products
SET subcategoryid = #class_id
WHERE highercatalogid IN (
SELECT catalogid FROM products (nolock)
WHERE ccode = #title_code AND spdisplaytype = 'Table')
FETCH title_class INTO #title_code
END
CLOSE title_class
DEALLOCATE title_class
The table tmp_business_class_titles looks like this:
class_id,title_code,Default_flag
7,101WGA,0
7,10315,0
29,8600,0
The default flag can always be 0 but if it is 1 then the logic should automatically pick the default class_id for that title_id.
So the current logic loops through the above table in a cursor and then selects the top 1 class id for each title, ordered by the the default flag (so the class_id with a default_flag of 1 should always be returned first.) and applies the default class_id to the products table.
This code takes around 1:20 to run and I am trying to convert this into one or 2 update statements but I have exhausted my brain in doing so.
Any TSQL Guru's have any ideas if this is possible or should I re-evaluate the entire logic on how the default flag works?
cheers for any help.
I don't have quite enough information to work with, so the following query is likely to fail. I particularly need more information on the products table to make this work, but assuming that you have SQL Server 2005 or higher, this might be enough to get you started in the right direction. It utilizes common table expressions along with the RANK function. I highly recommend learning about them, and in all likelihood, it will greatly improve the efficiency of the query.
;WITH cteTitle As (
SELECT
sequence = RANK() OVER (PARTITION BY bct.title_code ORDER BY Default_Flag desc)
,bct.title_code
,bc1.categoryid
FROM
tmp_business_class_titles bct
join Categories bc1 ON bc1.categoryid = bct.class_id
join Categories bc2 ON bc2.categoryid = bc1.highercategoryid
join Categories bc3 ON bc3.categoryid = bc2.highercategoryid
)
UPDATE
prod
SET
subcategoryid = ISNULL(t.categoryid,t2.categoryid)
FROM
products prod
LEFT join products subprod ON subprod.catalogid = prod.highercatalogid
LEFT join cteTitle t ON prod.ccode = t.title_code AND t.sequence = 1 AND prod.spdisplaytype = 'Table'
LEFT join cteTitle t2 ON subprod.ccode = t2.title_code And t2.sequence = 1 AND subprod.spdisplaytype = 'Table'
WHERE
t2.categoryid IS NOT NULL