Update table with SUM from another table in DB2

I have two files, FILE1 and FILE2. FILE1 has no duplicate records, but FILE2 does. For the duplicate records in FILE2, the quantity (field 2 in FILE2) should be summed, and the summed value should be updated in FILE1.
File1= itnum, qtyavl
File2= itmnum, qtybln
Here I have tried using MERGE INTO; it works perfectly, but I don't want to use it because MERGE is only available from version 7.1 of IBM i.
I want to write the statement without using MERGE.
MERGE INTO file1 AS T
USING (SELECT itmnum, sum(qtybln) AS balance
FROM file2 GROUP BY itmnum) AS S
ON S.itmnum = T.itnum and qtyavl <> s.balance
WHEN MATCHED THEN UPDATE SET qtyavl = s.balance

Since 7.1 has been out of service for years, it is not unreasonable to require at least v7.1. Even v7.2 is effectively out of service, and only receiving bug fixes as of right now. But:
update file1 t
set qtyavl = (select sum(qtybln) from file2 where itmnum = t.itnum)
where t.itnum in (select itmnum from file2)
should work for you at any release. Note that the WHERE clause in the UPDATE statement only affects which records are updated. Unless qtyavl can contain a null value, you only want to update the rows where select sum(qtybln) from file2 where itmnum = t.itnum returns a non-null value. Alternatively, you could wrap the sub-select in a COALESCE() and assign a value of 0 where the sub-select returns null.
Edit: If you only want to update the rows where qtyavl needs to be changed, use this:
update file1 t
set qtyavl = (select sum(qtybln) from file2 where itmnum = t.itnum)
where t.itnum in (select itmnum from file2)
and qtyavl <> (select sum(qtybln) from file2 where itmnum = t.itnum)
In DB2 you cannot combine the row filter (the WHERE clause) with the values to be assigned in a single clause. This leads to what may look like duplication, but the SQL optimizer is good at rewriting the SQL for performance.
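The same correlated-subquery pattern works on most SQL engines. Here is a minimal sketch using Python's bundled sqlite3 module (the table and column names come from the question; the sample data is invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE file1 (itnum TEXT, qtyavl INTEGER);
CREATE TABLE file2 (itmnum TEXT, qtybln INTEGER);
INSERT INTO file1 VALUES ('A', 0), ('B', 5), ('C', 9);
INSERT INTO file2 VALUES ('A', 3), ('A', 4), ('B', 5);
""")

# Correlated subquery: sum the duplicates in file2, update file1.
# The outer WHERE keeps items absent from file2 (like 'C') untouched,
# which also avoids assigning NULL from an empty sub-select.
con.execute("""
UPDATE file1
SET qtyavl = (SELECT SUM(qtybln) FROM file2 WHERE itmnum = file1.itnum)
WHERE itnum IN (SELECT itmnum FROM file2)
""")

print(con.execute("SELECT itnum, qtyavl FROM file1 ORDER BY itnum").fetchall())
# [('A', 7), ('B', 5), ('C', 9)]
```

'A' gets the summed 7 (3+4), 'B' is set to its single balance, and 'C' keeps its original value because the outer WHERE excludes it.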

Related

How to update counts by date in table A with the counts by date returned from join of table B and table C

I can do this using a temporary table. Is it possible to do these two steps in a single update query?
All possible dates already exist in the TargetTable (no inserts are necessary).
I'm hoping to make this more efficient since it is run often as batches of data periodically pour into table T2.
Table T1: list of individual dates inserted or updated in this batch
Table T2: datetime2(3) field followed by several data fields, may be thousands for any particular date
Goal: update TargetTable (a date field followed by an int field) to hold the total records by date; the records may have just arrived in T2, or may be additional records appended to records already in T2.
select T1.date as TargetDate, count(*) as CountF1
into #Temp
from T1 inner join T2
on T1.date = cast(T2.DateTime as date)
group by T1.date
update TargetTable
set TargetField1 = CountF1
from #Temp inner join TargetTable
on TargetDate = TargetTable.Date
I agree with the recommendation of Zohar Peled. Use a common table expression (CTE), which can replace the temporary table in your scenario. You write a CTE using the WITH keyword; remember that in many cases you will need a semicolon before the WITH keyword (or at the end of the previous statement, if you prefer). The solution then looks like this:
;WITH CTE AS
(
SELECT T1.date AS TargetDate, Count(*) AS CountF1
FROM T1 INNER JOIN T2
ON T1.date = Cast(T2.DateTime AS DATE)
GROUP BY T1.date
)
UPDATE TargetTable
SET TargetField1 = CTE.CountF1
FROM CTE INNER JOIN TargetTable
ON CTE.TargetDate = TargetTable.Date;
Here is more information on Common Table Expressions:
https://learn.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql
After having done this, then another thing you might benefit from is to add a new column to table T2, with the datatype DATE. This new column could have the value of Cast(T2.DateTime AS DATE). It might even be a (persisted) computed column. Then add an index on that new column. If you then join on the new column (instead of joining on the expression Cast(...) ) it might run faster depending on the distribution of the data. The only way to tell if it runs faster is to try it out.
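The count-then-update step can be tried in miniature with Python's sqlite3. Since UPDATE ... FROM is T-SQL (and only recent SQLite), the joined CTE is expressed here as an equivalent correlated subquery; the table and column names come from the question, and the sample data is invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE T1 (date TEXT);
CREATE TABLE T2 (DateTime TEXT, payload TEXT);
CREATE TABLE TargetTable (Date TEXT, TargetField1 INTEGER);
INSERT INTO T1 VALUES ('2023-05-01');
INSERT INTO T2 VALUES ('2023-05-01 08:00:00.000', 'a'),
                      ('2023-05-01 09:30:00.000', 'b'),
                      ('2023-05-02 10:00:00.000', 'c');
INSERT INTO TargetTable VALUES ('2023-05-01', 0), ('2023-05-02', 0);
""")

# Count T2 rows per date, but only for dates in this batch (T1),
# mirroring the CTE-join version above.
con.execute("""
UPDATE TargetTable
SET TargetField1 = (SELECT COUNT(*)
                    FROM T2
                    WHERE date(T2.DateTime) = TargetTable.Date)
WHERE Date IN (SELECT date FROM T1)
""")

print(con.execute("SELECT * FROM TargetTable ORDER BY Date").fetchall())
# [('2023-05-01', 2), ('2023-05-02', 0)]
```

Only the batch date 2023-05-01 is touched; 2023-05-02 is not in T1, so its count is left alone, just as the inner join restricts the CTE version.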

PostgreSQL: Deleting records to keep the one with the latest timestamps

Let's say I have a table with a column of timestamps and a column of IDs (numeric). For each ID, I'm trying to delete all the rows except the one with the latest timestamp.
Here is the code I have so far:
DELETE FROM table_name t1
WHERE EXISTS (SELECT * FROM table_name t2
WHERE t2."ID" = t1."ID"
AND t2."LOCAL_DATETIME_DTE" > t1."LOCAL_DATETIME_DTE")
This code seems to work, but my question is: why is it a > sign and not a < sign in the timestamp comparison? Is this not selecting for deletion all the rows with a later timestamp than another row? I thought this code would keep only the rows with the earliest timestamps for each ID.
You're using the EXISTS operator to delete every record for which a record with a larger timestamp (hence the >) can be found. For the newest record, no record with a higher timestamp exists, so the WHERE clause doesn't resolve to true and that record is kept.
You can also use the "record" pseudo-type to match tuples:
DELETE FROM table_name
WHERE (ID,LOCAL_DATETIME_DTE) not in
(SELECT ID,max(LOCAL_DATETIME_DTE) FROM table_name group by id);
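The EXISTS variant can be checked quickly with Python's sqlite3; the table and column names come from the question, and the sample data is invented. Note that the row with the latest timestamp per ID can never be deleted, so the result is the same regardless of scan order:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE table_name ("ID" INTEGER, "LOCAL_DATETIME_DTE" TEXT);
INSERT INTO table_name VALUES
  (1, '2023-01-01'), (1, '2023-03-01'), (1, '2023-02-01'),
  (2, '2023-05-05');
""")

# Delete every row for which a newer row with the same ID exists;
# the latest row per ID has no such match, so it survives.
con.execute("""
DELETE FROM table_name
WHERE EXISTS (SELECT 1 FROM table_name t2
              WHERE t2."ID" = table_name."ID"
                AND t2."LOCAL_DATETIME_DTE" > table_name."LOCAL_DATETIME_DTE")
""")

print(con.execute('SELECT * FROM table_name ORDER BY "ID"').fetchall())
# [(1, '2023-03-01'), (2, '2023-05-05')]
```

Exactly one row per ID remains, the one with the latest timestamp.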

How to drop oldest partition in PostgreSQL?

I can list the partitions with
SELECT
child.relname AS child_schema
FROM pg_inherits
JOIN pg_class child ON pg_inherits.inhrelid = child.oid ;
Is it guaranteed that they are listed in creation order? Because then only an additional LIMIT 1 would be required. Otherwise, this will print the oldest, the one with the lowest number in its name (my partitions are named name_1, name_2, name_3, ...):
SELECT
MIN ( trim(leading 'name_' from child.relname)::int ) AS child_schema
FROM pg_inherits
JOIN pg_class child ON pg_inherits.inhrelid = child.oid ;
Then I need to create a script which uses the result to execute DROP TABLE? Is there no easier way?
Is it guaranteed that they are listed in creation order?
No. It is likely as long as the query uses sequential scans and no tables have been dropped, but if you change the query and the plan changes, you could get rather unexpected ordering. I would also expect that once free space is reused, the ordering may change.
Your current trim query is the best way. Stick with it.
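Since a script is needed to run the DROP anyway, the number extraction can live in that script instead of in SQL. A sketch in Python (the name_N naming convention comes from the question; the input list stands in for the relnames returned by the pg_inherits/pg_class query above):

```python
def drop_oldest_partition_sql(relnames):
    """Given child partition names like 'name_3', return the DROP TABLE
    statement for the one with the lowest number (the oldest)."""
    oldest = min(relnames, key=lambda r: int(r.rsplit("_", 1)[1]))
    return f"DROP TABLE {oldest};"

# The list would come from querying pg_inherits joined to pg_class.
print(drop_oldest_partition_sql(["name_3", "name_1", "name_2"]))
# DROP TABLE name_1;
```

Parsing the number in the client also sidesteps the trim()-in-SQL approach entirely, at the cost of a round trip for the name list.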

Purging records on AS/400

I have to delete some records that are considered to be useless.
There is an address file and an order history file. The company sells consumer products, and it gets many product inquiries or starts of sale that never become a sale.
Each inquiry gets a record in the address file, keyed by customer number. The order history file has the same customer number plus a suffix field, which starts at 000 and increments with each new order; the bulk of the business is in fact a recurring model.
A customer who has only '000' records (there could be multiple 000s) never bought anything, and we wish to purge them from these files.
I am thinking of a simple RPG program, but I am also interested in just using SQL, if that is possible, or other methods.
At this stage, we would not actually be deleting, but copying the proposed records for purging to an output file, which will be reviewed and also stored in case we need to revert.
F Addressfile IF E
F OrderHistory IF E
**** create 2 output file clones but with altered name for now.
F Zaddressfile O E
F ZorderHistory O E
*inlr doweq *off
Read Addressfile lr
*inlr ifeq *off
move *off Flg000
exsr Chk000
Flg000 ifeq *on
iter
else exsr purge
endif
endif
enddo
Chk000 begsr
**** basically: SETLL to a different logical on the order history file,
**** then READE for as long as we have a matching customer number;
**** if there is a suffix not equal to '000', turn on the flag and get out.
The purge subroutine will have to read through the order history file again to get the records to purge, using the same customer number that is still current in the address file read, because I would not be sure what value the subroutine holds for customer and I don't want to store it. Then it would write to the new files, including the address file record, and then we can iterate to read the next customer in the address file.
Also, we cannot assume that if someone did buy, they have a 001 record; maybe it got deleted over the years. If we could assume that, I would simply CHAIN on it.
That is a lot of steps in RPG. This can be done more simply in SQL in a variety of ways; SQL is adept at processing and analyzing groups of records across an entire file at once.
CREATE TABLE zaddresses AS
( SELECT *
FROM addressFile
WHERE cust IN (SELECT cust
FROM orderHistory
GROUP BY cust
HAVING max(sufix)='000'
)
)
WITH DATA
NOT LOGGED INITIALLY;
CREATE TABLE zorderHst AS
( SELECT *
FROM orderHistory
WHERE cust IN (SELECT cust
FROM zaddresses
)
)
WITH DATA
NOT LOGGED INITIALLY;
There, you've defined your holding tables and populated them in one single statement each. It does have some nested logic, but it is nonetheless only two statements.
To purge them:
DELETE FROM addressfile
WHERE cust IN (SELECT cust FROM zaddresses);
DELETE FROM orderHistory
WHERE cust IN (SELECT cust FROM zaddresses);
A grand total of four SQL statements. (I won't even ask how many you'd have in your RPG program.)
Once you understand SQL, you can think about processing entire files, not just record-by-record instructions. It's much simpler to get things done, and it's almost always faster when done well.
(You may hear arguments about performance under particular circumstances, but most often the complainers simply aren't using SQL as well as they should. If you write poor RPG, it performs badly too. ;-)
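The four statements above can be exercised end to end with Python's sqlite3 (the DB2-specific WITH DATA and NOT LOGGED INITIALLY clauses are omitted, since SQLite's CREATE TABLE ... AS SELECT populates the table directly; the sample data is invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE addressFile (cust TEXT);
CREATE TABLE orderHistory (cust TEXT, sufix TEXT);
INSERT INTO addressFile VALUES ('C1'), ('C2');
INSERT INTO orderHistory VALUES ('C1', '000'), ('C1', '000'),
                                ('C2', '000'), ('C2', '001');
""")

# Customers whose highest suffix is still '000' never ordered:
# snapshot them, then purge them from both live files.
con.executescript("""
CREATE TABLE zaddresses AS
  SELECT * FROM addressFile
  WHERE cust IN (SELECT cust FROM orderHistory
                 GROUP BY cust HAVING MAX(sufix) = '000');
CREATE TABLE zorderHst AS
  SELECT * FROM orderHistory
  WHERE cust IN (SELECT cust FROM zaddresses);
DELETE FROM addressFile
  WHERE cust IN (SELECT cust FROM zaddresses);
DELETE FROM orderHistory
  WHERE cust IN (SELECT cust FROM zaddresses);
""")

print(con.execute("SELECT cust FROM zaddresses").fetchall())   # [('C1',)]
print(con.execute("SELECT cust FROM addressFile").fetchall())  # [('C2',)]
```

C1 (only '000' records) lands in the holding tables and is purged; C2, who has a '001' order, is untouched.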
I would use SQL.
-- Save only the rows to be deleted
CREATE TABLE ZADDRESSFILE AS
(SELECT *
FROM ADDRESSFILE af
WHERE NOT EXISTS
(SELECT 1
FROM ADDRESSFILE sub
WHERE sub.CUSTNO = af.CUSTNO
AND sub.SUFFIX <> '000' -- (or <> 0 if numeric)
)
)
-- If ZADDRESSFILE exists and you want to add the rows
-- to ZADDRESSFILE instead....
INSERT INTO ZADDRESSFILE
(SELECT *
FROM ADDRESSFILE af
WHERE NOT EXISTS
(SELECT 1
FROM ADDRESSFILE sub
WHERE sub.CUSTNO = af.CUSTNO
AND sub.SUFFIX <> '000' -- (or <> 0 if numeric)
)
AND NOT EXISTS
(SELECT 1
FROM ZADDRESSFILE sub
WHERE sub.CUSTNO = af.CUSTNO
)
)
-- Get number of rows to be deleted
SELECT COUNT(*)
FROM ADDRESSFILE af
WHERE NOT EXISTS
(SELECT 1
FROM ADDRESSFILE sub
WHERE sub.CUSTNO = af.CUSTNO
AND sub.SUFFIX <> '000'
)
-- Delete 'em
DELETE
FROM ADDRESSFILE af
WHERE NOT EXISTS
(SELECT 1
FROM ADDRESSFILE sub
WHERE sub.CUSTNO = af.CUSTNO
AND sub.SUFFIX <> '000'
)
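The NOT EXISTS count-and-delete pair can also be tried with Python's sqlite3. Note the answer above keeps the SUFFIX column on ADDRESSFILE itself, so this sketch makes the same assumption; the sample data is invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ADDRESSFILE (CUSTNO TEXT, SUFFIX TEXT);
INSERT INTO ADDRESSFILE VALUES ('C1', '000'), ('C1', '000'),
                               ('C2', '000'), ('C2', '001');
""")

# First count the rows of customers with no non-'000' record at all...
count = con.execute("""
SELECT COUNT(*) FROM ADDRESSFILE af
WHERE NOT EXISTS (SELECT 1 FROM ADDRESSFILE sub
                  WHERE sub.CUSTNO = af.CUSTNO AND sub.SUFFIX <> '000')
""").fetchone()[0]
print(count)  # 2 -- both of C1's rows qualify

# ...then delete exactly those rows.
con.execute("""
DELETE FROM ADDRESSFILE
WHERE NOT EXISTS (SELECT 1 FROM ADDRESSFILE sub
                  WHERE sub.CUSTNO = ADDRESSFILE.CUSTNO
                    AND sub.SUFFIX <> '000')
""")
print(con.execute("SELECT DISTINCT CUSTNO FROM ADDRESSFILE").fetchall())
# [('C2',)]
```

Running the COUNT first, as the answer suggests, gives a sanity check on how many rows the DELETE will remove before anything is actually purged.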

select distinct from 2 columns but only 1 is duplicate

select a.subscriber_msisdn, war.created_datetime from
(
select distinct subscriber_msisdn from wiz_application_response
where application_item_id in
(select id from wiz_application_item where application_id=155)
and created_datetime between '2012-10-07 00:00' and '2012-11-15 00:00:54'
) a
left outer join wiz_application_response war on (war.subscriber_msisdn=a.subscriber_msisdn)
The sub-select returns 11 rows, but when joined it returns 18 (with duplicates). The objective of this query is only to add the date column to the 11 rows of the sub-select.
Based on your description, it stands to reason that there are multiple created_datetime values for some of the subscriber_msisdn values, which is what prompted you to use DISTINCT in the subquery to begin with. By joining the subquery back to the original table you are defeating this. A cleaner way to write the query would be:
SELECT
war.subscriber_msisdn
, war.created_datetime
FROM
wiz_application_response war
LEFT JOIN wiz_application_item wai
ON war.application_item_id = wai.id
AND wai.application_id = 155
WHERE
war.created_datetime BETWEEN '2012-10-07 00:00' AND '2012-11-15 00:00:54'
This should return only the rows from the war table that satisfy the criteria based on the wai table. It should not be an outer join unless you wanted to return all the rows from the war table that satisfied the created_datetime parameter regardless of the application_item_id parameter.
This is my best guess based on the limited information I have about your tables and what I’m assuming you’re trying to accomplish. If this doesn’t get you what you are after, I will continue to offer other ideas based on additional information you could provide. Hope this works.
This can most probably be simplified to:
SELECT DISTINCT ON (1)
r.subscriber_msisdn, r.created_datetime
FROM wiz_application_item i
JOIN wiz_application_response r ON r.application_item_id = i.id
WHERE i.application_id = 155
AND i.created_datetime BETWEEN '2012-10-07 00:00' AND '2012-11-15 00:00:54'
ORDER BY 1, 2 DESC -- to pick the latest created_datetime
Details depend on missing information.
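DISTINCT ON is PostgreSQL-specific; on engines without it, the same "latest row per key" result can be approximated with GROUP BY and MAX when only those two columns are needed. A sketch with Python's sqlite3 (table and column names from the question; the sample rows are invented, and the date-range filter is omitted for brevity):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE wiz_application_item (id INTEGER, application_id INTEGER);
CREATE TABLE wiz_application_response
  (subscriber_msisdn TEXT, application_item_id INTEGER, created_datetime TEXT);
INSERT INTO wiz_application_item VALUES (1, 155), (2, 99);
INSERT INTO wiz_application_response VALUES
  ('555-0001', 1, '2012-10-08 10:00'),
  ('555-0001', 1, '2012-10-09 10:00'),
  ('555-0002', 1, '2012-10-10 10:00'),
  ('555-0003', 2, '2012-10-11 10:00');
""")

# One row per subscriber with the latest created_datetime,
# restricted to items belonging to application 155.
rows = con.execute("""
SELECT r.subscriber_msisdn, MAX(r.created_datetime)
FROM wiz_application_item i
JOIN wiz_application_response r ON r.application_item_id = i.id
WHERE i.application_id = 155
GROUP BY r.subscriber_msisdn
ORDER BY r.subscriber_msisdn
""").fetchall()
print(rows)
# [('555-0001', '2012-10-09 10:00'), ('555-0002', '2012-10-10 10:00')]
```

Unlike DISTINCT ON, which can carry any columns from the chosen row, the GROUP BY/MAX form only yields the key and the aggregated timestamp; pulling additional columns from the latest row would need a join back or a window function.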