Please help transform T-SQL "implicit joins" into explicit ones

Sorry, I am pretty much an SQL noob. This has to work in MSFT SQL, Oracle as well as Sybase. In the following snippet I need to change an inner join between IJ and KL on IJ.PO_id = KL.PO_id into a left join also on IJ.PO_id = KL.PO_id. So, I believe I have to re-factor this. Well, implicit joins are not the most readable, at least in my co-worker's eyes. I guess I will agree until I develop my own taste. Sorry, I mangled the table and field names just in case.
/* #IJ_id is an input stored proc parameter. */
from AB,
CD,
EF,
GH,
IJ,
KL
where
EF.EF_id = IJ.EF_id and
IJ.EF_id = AB.EF_id and
EF.ZY_id = IJ.ZY_id and
IJ.ZY_id = AB.ZY_id and
IJ.IJ_id = AB.IJ_id and
IJ.IJ_id = #IJ_id and
EF.XW_id = GH.GH_id and
AB.VU_code = CD.VU_code and
IJ.TS > 0 and
IJ.RQ = 0 and
EF.RQ = 0 and
AB.RQ = 0 and
IJ.PO_id = KL.PO_id;
Now, my difficulty is that there is a lot going on in the where clause. Things that do not look like a.b = c.d will remain in the where clause, but not everything that does look like a.b = c.d looks easy to convert into an explicit join. The difficult part is that ideally the conditions would be between neighbors (AB+CD, CD+EF, EF+GH, GH+IJ, IJ+KL), but they are not that organized right now. I could re-order some, but ultimately I do not want to forget my goal: I want the new query to be no slower and no less readable. It seems that I might be better off changing just the part that I need to change and leaving the rest mostly the same, but I am not sure if I can do that.
If you understood my intent, please suggest a better query. If you did not, then please tell me how I can improve the question. Thanks.

I think it should be something like this:
FROM AB
JOIN CD ON AB.VU_code = CD.VU_code
JOIN IJ ON IJ.EF_id = AB.EF_id AND IJ.ZY_id = AB.ZY_id AND IJ.IJ_id = AB.IJ_id
JOIN EF ON EF.EF_id = IJ.EF_id AND EF.ZY_id = IJ.ZY_id
JOIN GH ON EF.XW_id = GH.GH_id
JOIN KL ON IJ.PO_id = KL.PO_id
WHERE
IJ.IJ_id = #IJ_id AND
IJ.TS > 0 AND
IJ.RQ = 0 AND
EF.RQ = 0 AND
AB.RQ = 0
I have tried to arrange the tables such that the following rules hold:
Every join condition mentions the new table that is being joined on one side.
No table is mentioned in a join condition if that table has not been joined yet.
Conditions where one of the operands is a constant are left as a WHERE condition.
The last rule is a difficult one - it is not possible to tell from your mangled names whether a condition ought to be part of a join or part of the where clause. Both will give the same result for an INNER JOIN. Whether the condition should be part of the join or part of the where clause depends on the semantics of the relationship between the tables.
You need to consider each condition on a case-by-case basis:
Does it define the relationship between the two tables? Put it in the JOIN.
Is it a filter on the results? Put it in the WHERE clause.
Some guidelines:
A condition that includes a parameter from the user is unlikely to be something that should be moved to a join.
Inequalities are not usually found in join conditions.
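To make the point about INNER JOIN concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables `orders` and `po` and the `active` flag are made up for illustration and are not the asker's schema. It suggests that moving a filter between ON and WHERE changes nothing for an inner join, but does change the result once the join becomes a LEFT JOIN, which is what the question is heading towards.

```python
import sqlite3

# Hypothetical two-table schema, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, po_id INTEGER);
    CREATE TABLE po (po_id INTEGER PRIMARY KEY, active INTEGER);
    INSERT INTO orders VALUES (1, 10), (2, 20), (3, NULL);
    INSERT INTO po VALUES (10, 1), (20, 0);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# INNER JOIN: the filter gives the same rows in ON and in WHERE.
inner_on = ids("SELECT o.id FROM orders o "
               "JOIN po p ON p.po_id = o.po_id AND p.active = 1 ORDER BY o.id")
inner_where = ids("SELECT o.id FROM orders o "
                  "JOIN po p ON p.po_id = o.po_id WHERE p.active = 1 ORDER BY o.id")

# LEFT JOIN: a filter in ON keeps unmatched orders,
# the same filter in WHERE throws them away again.
left_on = ids("SELECT o.id FROM orders o "
              "LEFT JOIN po p ON p.po_id = o.po_id AND p.active = 1 ORDER BY o.id")
left_where = ids("SELECT o.id FROM orders o "
                 "LEFT JOIN po p ON p.po_id = o.po_id WHERE p.active = 1 ORDER BY o.id")

print(inner_on, inner_where, left_on, left_where)
```

So if a condition on KL has to survive the switch to a LEFT JOIN, it probably belongs in the ON clause of that join rather than in WHERE.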

It couldn't possibly get any less readable than the example you gave...
from AB a
join CD c on a.VU_Code = c.VU_Code
join EF e on a.EF_id = e.EF_id and e.RQ = 0
join GH g on e.XW_id = g.GH_id
join IJ i on a.IJ_id = i.IJ_id and e.EF_id = i.EF_id
and a.EF_id = i.EF_id and e.ZY_id = i.ZY_id
and a.ZY_id = i.ZY_id and i.TS > 0 and i.RQ = 0
LEFT join KL k on i.PO_id = k.PO_id
where
i.IJ_id = #IJ_id and
a.RQ = 0

Use:
FROM AB t1
JOIN CD t2 ON t2.VU_code = t1.VU_code
JOIN IJ t5 ON t5.ZY_id = t1.ZY_id
AND t5.IJ_id = t1.IJ_id
AND t5.EF_id = t1.EF_id
AND t5.IJ_id = #IJ_id
AND t5.TS > 0
AND t5.RQ = 0
JOIN EF t3 ON t3.EF_id = t5.EF_id
AND t3.ZY_id = t5.ZY_id
AND t3.RQ = 0
JOIN GH t4 ON t4.GH_id = t3.XW_id
JOIN KL t6 ON t6.PO_id = t5.PO_id -- Add LEFT before JOIN for LEFT JOIN
WHERE t1.RQ = 0
They're aliased in the sequence of the original ANSI-89 syntax, but the order is adjusted due to alias reference - can't reference a table alias before it's been defined.
This is ANSI-92 JOIN syntax - there's no performance benefit, but it does mean that OUTER join syntax is consistent. Just have to add LEFT before the "JOIN KL ..." to turn that into a LEFT JOIN.
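As a rough check of the claim above, the following sketch (Python with the built-in sqlite3 module; the two tables are cut-down stand-ins for IJ and KL, and the `label` column is invented) compares the comma-style join with the explicit JOIN and then adds LEFT.

```python
import sqlite3

# Cut-down stand-ins for IJ and KL; the label column is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ij (ij_id INTEGER, po_id INTEGER);
    CREATE TABLE kl (po_id INTEGER, label TEXT);
    INSERT INTO ij VALUES (1, 100), (2, 200);
    INSERT INTO kl VALUES (100, 'first');
""")

# ANSI-89 style: tables separated by commas, join condition in WHERE.
implicit = conn.execute(
    "SELECT ij.ij_id, kl.label FROM ij, kl "
    "WHERE ij.po_id = kl.po_id ORDER BY ij.ij_id").fetchall()

# ANSI-92 style: the same condition moved into an explicit JOIN ... ON.
explicit = conn.execute(
    "SELECT ij.ij_id, kl.label FROM ij "
    "JOIN kl ON ij.po_id = kl.po_id ORDER BY ij.ij_id").fetchall()

# Adding LEFT keeps the ij row that has no partner in kl (label is NULL).
left = conn.execute(
    "SELECT ij.ij_id, kl.label FROM ij "
    "LEFT JOIN kl ON ij.po_id = kl.po_id ORDER BY ij.ij_id").fetchall()

print(implicit, explicit, left)
```

The first two queries return the same rows here; only the LEFT JOIN version also returns the row with `ij_id = 2`.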

How to perform WHERE in with multiple columns in postgres

I want to do something like this:
SELECT * FROM product p
JOIN product_version pv ON p.id = pv.product_id
where (p.code, pv.product_version) in (("FF6",1), ("FF12", 1));
But this gives an error at the IN clause.
Can someone provide the correct syntax?
You have not provided any information about the actual error or about the column types.
That said, those double quotes look wrong: in Postgres, strings are quoted with single quotes ('), not double quotes (").
Try:
SELECT *
FROM product p
JOIN product_version pv ON (p.id = pv.product_id)
where
(p.code, pv.product_version) in (('FF6',1), ('FF12', 1))
;
Apart from that, your query looks syntactically "correct", unless there is some kind of type mismatch that we cannot foresee without more information.
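For anyone who wants to try the row-value comparison without a Postgres instance, here is a small sketch using Python's built-in sqlite3 module. The sample rows are invented. One assumption to be aware of: SQLite (3.15 or later) wants the right-hand side written as `IN (VALUES ...)`, whereas Postgres also accepts the plain list shown in the answer above; as far as I know the VALUES form works in Postgres too.

```python
import sqlite3

# Invented sample data matching the column names used in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE product_version (product_id INTEGER, product_version INTEGER);
    INSERT INTO product VALUES (1, 'FF6'), (2, 'FF12'), (3, 'FF7');
    INSERT INTO product_version VALUES (1, 1), (1, 2), (2, 1), (3, 1);
""")

# Row-value IN: both columns must match one of the listed pairs.
# String literals use single quotes.
rows = conn.execute("""
    SELECT p.code, pv.product_version
    FROM product p
    JOIN product_version pv ON p.id = pv.product_id
    WHERE (p.code, pv.product_version) IN (VALUES ('FF6', 1), ('FF12', 1))
    ORDER BY p.id
""").fetchall()

print(rows)
```

Only the pairs ('FF6', 1) and ('FF12', 1) come back; ('FF6', 2) and ('FF7', 1) are filtered out because the comparison is on the pair, not on each column separately.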
If the row-value form of IN does not work for you, then depending on your goal you can do something like:
where (p.code = 'FF6' and pv.product_version = 1) or
(p.code = 'FF12' and pv.product_version = 1)
or, if the logic above was not what you meant, maybe:
where p.code IN ('FF6', 'FF12') AND pv.product_version IN (1)
or
where p.code IN ('FF6', 'FF12') OR pv.product_version IN (1)
This code should work for you:
SELECT * FROM product p
JOIN product_version pv ON p.id = pv.product_id
where p.code in ('FF6', 'FF12') and pv.product_version = 1

Optimizing Postgres query with timestamp filter

I have a query:
SELECT DISTINCT ON (analytics_staging_v2s.event_type, sent_email_v2s.recipient, sent_email_v2s.sent)
sent_email_v2s.id,
sent_email_v2s.user_id,
analytics_staging_v2s.event_type,
sent_email_v2s.campaign_id,
sent_email_v2s.recipient,
sent_email_v2s.sent,
sent_email_v2s.stage,
sent_email_v2s.sequence_id,
people.role,
people.company,
people.first_name,
people.last_name,
sequences.name as sequence_name
FROM "sent_email_v2s"
LEFT JOIN analytics_staging_v2s ON sent_email_v2s.id = analytics_staging_v2s.sent_email_v2_id
JOIN people ON sent_email_v2s.person_id = people.id
JOIN sequences on sent_email_v2s.sequence_id = sequences.id
JOIN users ON sent_email_v2s.user_id = users.id
WHERE "sent_email_v2s"."status" = 1
AND "people"."person_type" = 0
AND (sent_email_v2s.sequence_id = 1888) AND (sent_email_v2s.sent >= '2016-03-18')
AND "users"."team_id" = 1
When I run EXPLAIN ANALYZE on it, I get:
Then, if I remove just the (sent_email_v2s.sent >= '2016-03-18') condition, the query becomes:
SELECT DISTINCT ON (analytics_staging_v2s.event_type, sent_email_v2s.recipient, sent_email_v2s.sent)
sent_email_v2s.id,
sent_email_v2s.user_id,
analytics_staging_v2s.event_type,
sent_email_v2s.campaign_id,
sent_email_v2s.recipient,
sent_email_v2s.sent,
sent_email_v2s.stage,
sent_email_v2s.sequence_id,
people.role,
people.company,
people.first_name,
people.last_name,
sequences.name as sequence_name
FROM "sent_email_v2s"
LEFT JOIN analytics_staging_v2s ON sent_email_v2s.id = analytics_staging_v2s.sent_email_v2_id
JOIN people ON sent_email_v2s.person_id = people.id
JOIN sequences on sent_email_v2s.sequence_id = sequences.id
JOIN users ON sent_email_v2s.user_id = users.id
WHERE "sent_email_v2s"."status" = 1
AND "people"."person_type" = 0
AND (sent_email_v2s.sequence_id = 1888) AND "users"."team_id" = 1
when I run EXPLAIN ANALYZE on this query, the results are:
EDIT:
The results above from today are about what I expected. When I ran this last night, however, including the timestamp filter made the query about 100x slower (0.5s -> 59s). The EXPLAIN ANALYZE output from last night attributed all of the extra time to the first unique/sort operation in the query plan above.
Could there be some kind of caching issue here? I am worried now that there might be something else going on (transiently) that might make this query take 100x longer since it happened at least once.
Any thoughts are appreciated!

How to create variables in derived table using different conditions

I want to generate a table and all the variables I need are from a derived temporary table. Now, I need to calculate the time differences under different conditions, and I need to deal with 2 questions:
1. how to create variables in derived table.
2. how to split a variable into 2 variables using different conditions.
Please note that the SQL statement below is simplified; the comments in it should help you understand the question.
Thanks in advance for any tips.
Here is the SQL statement:
Select id, name, offline_time, Process_time from
## in the derived table, use the difference of createdon in tables a and b to calculate the offline_time and Process_time ##
## when {a.id = st.id AND a.activityid = 5008 and a.sessiontype = 7} a.createdon = a.createdon1 ##
## when {b.activityid = 5011} b.createdon = b.createdon1 ##
(select
(UNIX_TIMESTAMP(MAX(a.createdon)) - UNIX_TIMESTAMP(MAX(b.createdon))) as 'Offline_Time',
(UNIX_TIMESTAMP(MAX(a.createdon1)) - UNIX_TIMESTAMP(MAX(b.createdon1))) as 'Process_Time',
q.id,
q.name,
a.sessiontype
FROM tv_sessiontimer st
LEFT JOIN sessionactivity_log a ON (a.id = st.id AND a.activityid in (5004,5008) and a.sessiontype IN (3,4,5,6,7))
LEFT JOIN sessionactivity_log b ON (a.sessionId = b.sessionId AND a.id = b.id AND b.activityid in (5003,5011))
INNER JOIN tv_offline_request q ON q.id = a.sessionid
LEFT JOIN tv_subject sub ON (sub.id = q.subjectid AND sub.SortKey > 10800 AND sub.sortkey < 30200)
)

Cannot get view to work on SQL Server

I am trying to create a view that includes columns from several tables.
This is what it looks like:
And this is my query:
SELECT
Billing.WebPortalBilling.WebPortalBillingId,
Billing.WebPortalBilling.CorporationId,
Billing.WebPortalBilling.TokenId,
Billing.WebPortalBilling.GatewaySupportFee,
Billing.WebPortalBilling.GatewayPerTransactionFee,
Billing.WebPortalBilling.PortalPerCustomerFee,
Billing.WebPortalBilling.PortalSupportFee,
Customer.Account.AccountNumber,
Billing.WebPortalBilling.IsActive,
Customer.Customer.Name,
Customer.Customer.TaxCode,
Company.CorporationStructure.Branch
FROM
Company.CorporationStructure
RIGHT OUTER JOIN
Customer.Account ON Company.CorporationStructure.CorporationStructureId = Customer.Account.CorporationStructureId
RIGHT OUTER JOIN
Customer.Customer ON Company.CorporationStructure.Branch = Customer.Customer.Branch
RIGHT OUTER JOIN
Billing.WebPortalBilling ON Customer.Account.CorporationId = Billing.WebPortalBilling.CorporationId
WHERE
(Billing.WebPortalBilling.IsActive = 1)
It's only returning 1 record, which is not correct. I'm trying to tie the Customer's name back to the WebPortalBilling table along with the account number and branch in the other two tables.
I'm new to SQL, so be kind.
Thanks!
As noted in the comments, the WHERE clause is killing the outer join.
Try:
SELECT
Billing.WebPortalBilling.WebPortalBillingId,
Billing.WebPortalBilling.CorporationId,
Billing.WebPortalBilling.TokenId,
Billing.WebPortalBilling.GatewaySupportFee,
Billing.WebPortalBilling.GatewayPerTransactionFee,
Billing.WebPortalBilling.PortalPerCustomerFee,
Billing.WebPortalBilling.PortalSupportFee,
Customer.Account.AccountNumber,
Billing.WebPortalBilling.IsActive,
Customer.Customer.Name,
Customer.Customer.TaxCode,
Company.CorporationStructure.Branch
FROM
Company.CorporationStructure
RIGHT OUTER JOIN
Customer.Account ON Company.CorporationStructure.CorporationStructureId = Customer.Account.CorporationStructureId
RIGHT OUTER JOIN
Customer.Customer ON Company.CorporationStructure.Branch = Customer.Customer.Branch
RIGHT OUTER JOIN Billing.WebPortalBilling
ON Customer.Account.CorporationId = Billing.WebPortalBilling.CorporationId
AND Billing.WebPortalBilling.IsActive = 1
Try this, I think left joins are clearer.
SELECT
B.WebPortalBillingId,
B.CorporationId,
B.TokenId,
B.GatewaySupportFee,
B.GatewayPerTransactionFee,
B.PortalPerCustomerFee,
B.PortalSupportFee,
A.AccountNumber,
B.IsActive,
C.Name,
C.TaxCode,
CS.Branch
FROM Customer.Customer C
LEFT JOIN Company.CorporationStructure CS ON CS.Branch = C.Branch
LEFT JOIN Customer.Account A ON CS.CorporationStructureId = A.CorporationStructureId
LEFT JOIN Billing.WebPortalBilling B ON A.CorporationId = B.CorporationId
WHERE B.IsActive = 1

DB2 V9 ZOS - Performance tuning

Background
I am currently using DB2 V9. One of my stored procedures is taking a long time to execute. I looked at BMC apptune and found the following SQL.
The query uses three tables:
The ACCOUNT table has 3413 records
EXCHANGE_RATE has 1267K records
BALANCE has 113M records
Someone recently added the following piece of code to the query. I think this is what caused the problem.
AND (((A.ACT <> A.EW_ACT)
AND (A.EW_ACT <> ' ')
AND (C.ACT = A.EW_ACT))
OR (C.ACT = A.ACT))
Query
SELECT F1.CLO_LED
INTO :H :H
FROM (SELECT A.ACT, A.BNK, A.ACT_TYPE,
CASE WHEN :H = A.CUY_TYPE THEN DEC(C.CLO_LED, 21, 2)
ELSE DEC(MULTIPLY_ALT(C.CLO_LED, COALESCE(B.EXC_RATE, 0)), 21, 2)
END AS CLO_LED
FROM ACCOUNT A
LEFT OUTER JOIN EXCHANGE_RATE B
ON B.EFF_DATE = CURRENT DATE - 1 DAY
AND B.CURCY_FROM = A.CURNCY_TYPE
AND B.CURCY_TO = :H
AND B.STA_TYPE = 'A'
, BALANCE C
WHERE A.CUSR_ID = :DCL.CUST-ID
AND A.ACT = :DCL.ACT
AND A.EIG_RTN = :WS-BNK-ID
AND A.ACT_TYPE = :DCL.ACT-TYPE
AND A.ACT_CAT = :DCL.ACT-CAT
AND A.STA_TYPE = 'A'
AND (((A.ACT <> A.EW_ACT)
AND (A.EW_ACT <> ' ')
AND (C.ACT = A.EW_ACT))
OR (C.ACT = A.ACT))
AND C.BNK = :WS-BNK-ID
AND C.ACT_TYPE = :DCL.ACT-TYPE
AND C.BUS_DATE = :WS-DATE-FROM) F1
WITH UR
There are a number of weird things going on in this query. The most twitchy of them is mixing explicit joins with the implicit-join syntax; frankly, I'm not certain how the system interprets it. You also appear to be using the same host variable for both input and output; please don't.
Also, why are your column names so short? DB2 (that version, at least) supports column names that are much longer. Please save people's sanity, if at all possible.
We can't completely say why things are slow - we may need to see access plans. In the meantime, here's your query, restructured to what may be a faster form:
SELECT CASE WHEN :inputType = a.cuy_type THEN DEC(b.clo_led, 21, 2)
ELSE DEC(MULTIPLY_ALT(b.clo_led, COALESCE(c.exc_rate, 0)), 21, 2) END
INTO :amount :amountIndicator -- if you get results, do you need the indicator?
FROM Account as a
JOIN Balance as b -- This is assumed to not be a 'left', given coalesce not used
ON b.bnk = a.eig_rtn
AND b.act_type = a.act_type
AND b.bus_date = :ws-date-from
AND ((a.act <> a.ew_act -- something feels wrong here, but
AND a.ew_act <> ' ' -- without knowing the data, I don't
AND b.act = a.ew_act) -- want to muck with it.
OR b.act = a.act)
LEFT JOIN Exchange_Rate as c
ON c.eff_date = current_date - 1 day
AND c.curcy_from = a.curncy_type
AND c.sta_type = a.sta_type
AND c.curcy_to = :destinationCurrency
WHERE a.cusr_id = :dcl.cust-id
AND a.act = :dcl.act
AND a.eig_rtn = :ws-bnk-id
AND a.act_type = :dcl.act-type
AND a.act_cat = :dcl.act-cat
AND a.sta_type = 'A'
WITH UR
FETCH FIRST 1 ROW ONLY
A few other notes:
Only specify exactly those columns needed - under certain circumstances, this permits index-only access, where otherwise a followup table-access may be needed. However, this probably won't help here.
COALESCE(c.exc_rate, 0) feels off somehow - if no exchange rate is present, you return an amount of 0, which could otherwise be a valid amount. You may need to return some sort of indicator, or make it a normal join, not an outer one.
Also, try both this version and a version where the host variables are specified in addition to the conditions between tables. The optimizer should be able to propagate the values across the join conditions automatically, but may not under some conditions (implementation detail).
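To illustrate the COALESCE note above, here is a small sketch using Python's built-in sqlite3 module rather than DB2. The tables are simplified stand-ins with invented columns and data, and the `rate_missing` flag is just one possible way to expose the problem, not something from the original query.

```python
import sqlite3

# Simplified stand-ins for ACCOUNT and EXCHANGE_RATE with invented data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (act TEXT, currency TEXT, balance INTEGER);
    CREATE TABLE exchange_rate (curcy_from TEXT, rate INTEGER);
    INSERT INTO account VALUES ('A1', 'EUR', 100), ('A2', 'XXX', 50), ('A3', 'EUR', 0);
    INSERT INTO exchange_rate VALUES ('EUR', 2);
""")

# COALESCE(rate, 0) turns "no exchange rate found" into an amount of 0,
# which looks exactly like a genuine zero balance. The extra flag column
# (hypothetical) lets the caller tell the two cases apart.
rows = conn.execute("""
    SELECT a.act,
           a.balance * COALESCE(r.rate, 0) AS converted,
           r.rate IS NULL AS rate_missing
    FROM account a
    LEFT JOIN exchange_rate r ON r.curcy_from = a.currency
    ORDER BY a.act
""").fetchall()

print(rows)
```

Accounts A2 and A3 both come back with a converted amount of 0, but only A2 is missing a rate. Without the flag (or an inner join instead of the outer one) the caller cannot distinguish them.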