I have a query:
SELECT DISTINCT ON (analytics_staging_v2s.event_type, sent_email_v2s.recipient, sent_email_v2s.sent) sent_email_v2s.id, sent_email_v2s.user_id, analytics_staging_v2s.event_type, sent_email_v2s.campaign_id, sent_email_v2s.recipient, sent_email_v2s.sent, sent_email_v2s.stage, sent_email_v2s.sequence_id, people.role, people.company, people.first_name, people.last_name, sequences.name as sequence_name
FROM "sent_email_v2s"
LEFT JOIN analytics_staging_v2s ON sent_email_v2s.id = analytics_staging_v2s.sent_email_v2_id
JOIN people ON sent_email_v2s.person_id = people.id
JOIN sequences on sent_email_v2s.sequence_id = sequences.id
JOIN users ON sent_email_v2s.user_id = users.id
WHERE "sent_email_v2s"."status" = 1
AND "people"."person_type" = 0
AND (sent_email_v2s.sequence_id = 1888) AND (sent_email_v2s.sent >= '2016-03-18')
AND "users"."team_id" = 1
When I run EXPLAIN ANALYZE on it, I get:
Then, if I change the query by removing just the (sent_email_v2s.sent >= '2016-03-18') filter, as follows:
SELECT DISTINCT ON (analytics_staging_v2s.event_type, sent_email_v2s.recipient, sent_email_v2s.sent) sent_email_v2s.id, sent_email_v2s.user_id, analytics_staging_v2s.event_type, sent_email_v2s.campaign_id, sent_email_v2s.recipient, sent_email_v2s.sent, sent_email_v2s.stage, sent_email_v2s.sequence_id, people.role, people.company, people.first_name, people.last_name, sequences.name as sequence_name
FROM "sent_email_v2s"
LEFT JOIN analytics_staging_v2s ON sent_email_v2s.id = analytics_staging_v2s.sent_email_v2_id
JOIN people ON sent_email_v2s.person_id = people.id
JOIN sequences on sent_email_v2s.sequence_id = sequences.id
JOIN users ON sent_email_v2s.user_id = users.id
WHERE "sent_email_v2s"."status" = 1
AND "people"."person_type" = 0
AND (sent_email_v2s.sequence_id = 1888) AND "users"."team_id" = 1
When I run EXPLAIN ANALYZE on this query, the results are:
EDIT:
The results above from today are about as I expected. When I ran this last night, however, including the timestamp filter made the query about 100x slower (0.5s -> 59s). The EXPLAIN ANALYZE from last night attributed all of the extra time to the first unique/sort operation in the query plan above.
Could there be some kind of caching issue here? I am now worried that something else might be going on (transiently) that makes this query take 100x longer, since it happened at least once.
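In case it helps with the caching question, I can re-run with buffer statistics; something like this (a cut-down sketch of the query above, keeping only the filter I suspect) should show whether the slow run was doing physical reads:

EXPLAIN (ANALYZE, BUFFERS)
SELECT sent_email_v2s.id
FROM sent_email_v2s
WHERE sent_email_v2s.sequence_id = 1888
  AND sent_email_v2s.sent >= '2016-03-18';

High shared read counts on the first run and near-zero reads on the second would point at cold cache rather than the plan itself.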
Any thoughts are appreciated!
I have this MERGE query in Oracle and it was working fine. Now we are migrating to Postgres 10 and trying to find an equivalent for it in Postgres.
MERGE INTO s.act_pack C
USING ((SELECT A.jid, A.pid, B.pcode, B.mc, A.md, A.hd
        FROM s.act_pack A
        INNER JOIN s.act_pack B
          ON A.pid = B.pid
         AND A.pcode = B.mc
         AND (A.hd <> B.hd OR A.md <> B.md))
       ORDER BY A.upd_ts DESC) D
   ON (C.pid = D.pid AND C.pcode = D.pcode AND C.jid = D.jid)
 WHEN MATCHED THEN UPDATE SET C.md = D.md,
                              C.hd = D.hd;
I see some forums on the web saying Postgres doesn't support MERGE and to use INSERT ... ON CONFLICT instead,
but with no background in Postgres, I am not able to understand how this complex query can be written using that.
Others say Postgres 9.5 and above support the MERGE statement. Since we are using Postgres 10, I tried the same Oracle query in Postgres but received ERROR: syntax error at or near "MERGE".
Any help is highly appreciated.
You don't need an "UPSERT" as you are not doing an INSERT, so a regular UPDATE is enough:
update s.act_pack c
SET md = d.md,   -- in Postgres the SET list must not qualify columns with the target alias
    hd = d.hd
from (
    SELECT A.jid, A.pid, B.pcode, B.mc, A.md, A.hd
    FROM s.act_pack A
    INNER JOIN s.act_pack B
        ON A.pid = B.pid
        AND A.pcode = B.mc
        AND (A.hd <> B.hd OR A.md <> B.md)
) d
where c.pid = d.pid
AND c.pcode = d.pcode
AND c.jid = d.jid
This is a direct "translation" of your code. The fact that the same table is used three times is a bit strange, but without more information it's hard to say where exactly this could be made more efficient.
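For completeness, if you ever also needed to insert rows that don't exist yet, an INSERT ... ON CONFLICT sketch could look roughly like the following. This assumes a unique constraint or index on (pid, pcode, jid), which ON CONFLICT requires as its target, and assumes those are the only columns that need to be supplied on insert:

INSERT INTO s.act_pack (jid, pid, pcode, md, hd)
SELECT A.jid, A.pid, B.pcode, A.md, A.hd
FROM s.act_pack A
INNER JOIN s.act_pack B
    ON A.pid = B.pid
    AND A.pcode = B.mc
    AND (A.hd <> B.hd OR A.md <> B.md)
ON CONFLICT (pid, pcode, jid)          -- requires the assumed unique constraint
DO UPDATE SET md = EXCLUDED.md,
              hd = EXCLUDED.hd;

Note that ON CONFLICT DO UPDATE will raise an error if the SELECT produces the same (pid, pcode, jid) more than once in a single statement, so this is only a sketch, not a drop-in replacement.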
SELECT clm.CLCL_PAYEE_PR_ID, clm.SBSB_CK, clm.CLCL_ID, clm.clcl_id_adj_to,clm.clcl_id_adj_from, clm.CLCL_PAID_DT
FROM ODW.DW.fac_cmc_clcl_claim CLM
INNER JOIN ODW.DW.fac_cmc_meme_member MEME ON MEME.meme_ck = CLM.meme_ck
INNER JOIN ODW.DW.fac_cmc_mepe_prcs_elig MEPE ON MEPE.meme_ck = MEME.meme_ck
INNER JOIN ODW.DW.fac_cmc_mepr_prim_prov MEPR ON MEPE.meme_ck = MEPR.meme_ck AND CLM.clcl_prpr_id_pcp = MEPR.prpr_id
INNER JOIN ODW.DW.fac_cmc_sbsb_subsc SBSB ON MEME.sbsb_ck = SBSB.sbsb_ck
INNER JOIN ODW.DW.fac_cmc_prpr_prov PROV ON MEPR.prpr_id = PROV.prpr_id AND PROV.prpr_mctr_prty = 'RISK'
INNER JOIN ODW.DW.fac_cmc_prer_relation PRER ON PRER.prpr_id = MEPR.prpr_id
INNER JOIN ODW.DW.fac_cmc_plds_plan_desc PLDS ON MEPE.cspi_id = PLDS.cspi_id
INNER JOIN ODW.DW.fac_cmc_pdds_prod_desc PDDS ON MEPE.pdpd_id = PDDS.pdpd_id
WHERE CLM.clcl_paid_dt BETWEEN '2019-12-24 00:00:00.000' AND '2019-12-30 23:59:59.997'
AND CLM.clcl_cur_sts = '02'
AND CLM.clcl_cl_type = 'M'
AND CLM.clcl_cl_sub_type = 'H'
AND CLM.grgr_ck IN (46)
AND MEPR.grgr_ck IN (46)
AND MEPE.grgr_ck IN (46)
AND MEPE.mepe_elig_ind = 'Y'
AND CLM.clcl_low_svc_dt BETWEEN MEPE.mepe_eff_dt AND MEPE.mepe_term_dt
AND CLM.clcl_low_svc_dt BETWEEN MEPR.mepr_eff_dt AND MEPR.mepr_term_dt
AND SBSB.grgr_ck IN (46)
AND PRER.prer_prpr_entity = 'I'
AND PRER.prer_prpr_id IN ('64456546')
AND (PLDS.plds_desc LIKE '%risk%' OR PDDS.pdds_desc LIKE '%risk%');
This query runs in PROD with variables substituted for the hard-coded values. It runs around 100 times per day in PROD, and on some days some of the runs fail with this error:
The multi-part identifier "PDDS.pdds_desc" could not be bound
Please note that all the joins are being done on views.
When I re-run the failed process, it succeeds the second time with no changes to the underlying query.
Can anyone suggest what the issue could be? Also, any performance optimization suggestions for this query will be appreciated.
Thanks!
Background
Currently I am using DB2 V9. One of my stored procedures is taking a long time to execute. I looked in BMC APPTUNE and found the following SQL.
Three tables are used in the query:
ACCOUNT has 3,413 records
EXCHANGE_RATE has 1,267K records
BALANCE has 113M records
Someone recently added the following piece of code to the query; I think this is the cause of the problem.
AND (((A.ACT <> A.EW_ACT)
AND (A.EW_ACT <> ' ')
AND (C.ACT = A.EW_ACT))
OR (C.ACT = A.ACT))
Query
SELECT F1.CLO_LED
INTO :H :H
FROM (SELECT A.ACT, A.BNK, A.ACT_TYPE,
CASE WHEN :H = A.CUY_TYPE THEN DEC(C.CLO_LED, 21, 2)
ELSE DEC(MULTIPLY_ALT(C.CLO_LED, COALESCE(B.EXC_RATE, 0)), 21, 2)
END AS CLO_LED
FROM ACCOUNT A
LEFT OUTER JOIN EXCHANGE_RATE B
ON B.EFF_DATE = CURRENT DATE - 1 DAY
AND B.CURCY_FROM = A.CURNCY_TYPE
AND B.CURCY_TO = :H
AND B.STA_TYPE = 'A'
, BALANCE C
WHERE A.CUSR_ID = :DCL.CUST-ID
AND A.ACT = :DCL.ACT
AND A.EIG_RTN = :WS-BNK-ID
AND A.ACT_TYPE = :DCL.ACT-TYPE
AND A.ACT_CAT = :DCL.ACT-CAT
AND A.STA_TYPE = 'A'
AND (((A.ACT <> A.EW_ACT)
AND (A.EW_ACT <> ' ')
AND (C.ACT = A.EW_ACT))
OR (C.ACT = A.ACT))
AND C.BNK = :WS-BNK-ID
AND C.ACT_TYPE = :DCL.ACT-TYPE
AND C.BUS_DATE = :WS-DATE-FROM) F1
WITH UR
There are a number of weird things going on in this query, the most twitchy of which is mixing explicit joins with implicit-join syntax; frankly, I'm not certain how the system interprets it. You also appear to be using the same host variable for both input and output; please don't.
Also, why are your column names so short? DB2 (that version, at least) supports column names that are much longer. Please save people's sanity, if at all possible.
We can't completely say why things are slow - we may need to see access plans. In the meantime, here's your query, restructured to what may be a faster form:
SELECT CASE WHEN :inputType = a.cuy_type THEN DEC(b.clo_led, 21, 2)
ELSE DEC(MULTIPLY_ALT(b.clo_led, COALESCE(c.exc_rate, 0)), 21, 2) END
INTO :amount :amountIndicator -- if you get results, do you need the indicator?
FROM Account as a
JOIN Balance as b -- This is assumed to not be a 'left', given coalesce not used
ON b.bnk = a.eig_rtn
AND b.act_type = a.act_type
AND b.bus_date = :ws-date-from
AND ((a.act <> a.ew_act -- something feels wrong here, but
AND a.ew_act <> ' ' -- without knowing the data, I don't
AND b.act = a.ew_act) -- want to muck with it.
OR b.act = a.act)
LEFT JOIN Exchange_Rate as c
ON c.eff_date = current_date - 1 day
AND c.curcy_from = a.curncy_type
AND c.sta_type = a.sta_type
AND c.curcy_to = :destinationCurrency
WHERE a.cusr_id = :dcl.cust-id
AND a.act = :dcl.act
AND a.eig_rtn = :ws-bnk-id
AND a.act_type = :dcl.act-type
AND a.act_cat = :dcl.act-cat
AND a.sta_type = 'A'
FETCH FIRST 1 ROW ONLY
WITH UR
A few other notes:
Only specify exactly those columns needed - under certain circumstances, this permits index-only access, where otherwise a follow-up table access may be needed. However, this probably won't help here.
COALESCE(c.exc_rate, 0) feels off somehow - if no exchange rate is present, you return an amount of 0, which could otherwise be a valid amount. You may need to return some sort of indicator, or make it a normal join, not an outer one.
Also, try both this version, and possibly a version where host variables are specified in addition to the conditions between tables. The optimizer should be able to automatically commute the values, but may not under some conditions (implementation detail).
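As a sketch of that second suggestion, here is just the Balance join from the rewritten query with the host variables repeated next to the table-to-table conditions (the extra predicates are assumed to be redundant, since the Account filters already pin those values; this is only the join fragment, not a complete statement):

JOIN Balance as b
  ON b.bnk = :ws-bnk-id            -- host variable repeated ...
 AND b.bnk = a.eig_rtn             -- ... next to the join condition
 AND b.act_type = :dcl.act-type
 AND b.act_type = a.act_type
 AND b.bus_date = :ws-date-from
 AND ((a.act <> a.ew_act
       AND a.ew_act <> ' '
       AND b.act = a.ew_act)
      OR b.act = a.act)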
I have a SQL Server 2008 R2 database where some of the tables have a full-text index defined. I'd like to know how to determine the size of the full-text index of a specific table, in order to control and predict its growth.
Is there a way of doing this?
The catalog view sys.fulltext_index_fragments keeps track of the size of each fragment, regardless of catalog, so you can take the SUM this way. This assumes the limitation of one full-text index per table is going to remain the case. The following query will get you the size of each full-text index in the database, again regardless of catalog, but you could use the WHERE clause if you only care about a specific table.
SELECT
[table] = OBJECT_SCHEMA_NAME(table_id) + '.' + OBJECT_NAME(table_id),
size_in_KB = CONVERT(DECIMAL(12,2), SUM(data_size/1024.0))
FROM sys.fulltext_index_fragments
-- WHERE table_id = OBJECT_ID('dbo.specific_table_name')
GROUP BY table_id;
Also note that if the count of fragments is high you might consider a reorganize.
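For example, to check the fragment count for one table and then reorganize its catalog if the count is high (the catalog name below is a placeholder for your own):

SELECT COUNT(*) AS fragment_count
FROM sys.fulltext_index_fragments
WHERE table_id = OBJECT_ID('dbo.specific_table_name');

ALTER FULLTEXT CATALOG YourCatalogName REORGANIZE;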
If you are after a specific catalogue, use SSMS:
- Click on [Database] and expand the objects
- Click on [Storage]
- Right-click on {Specific Catalogue}
- Choose Properties
In the General tab you will find the catalogue size = 'nn'
I use something similar to this (which will also calculate the size of XML indexes, etc., if present):
SELECT S.name,
SO.name,
SIT.internal_type_desc,
rows = CASE WHEN GROUPING(SIT.internal_type_desc) = 0 THEN SUM(SP.rows)
END,
TotalSpaceGB = SUM(SAU.total_pages) * 8 / 1048576.0,
UsedSpaceGB = SUM(SAU.used_pages) * 8 / 1048576.0,
UnusedSpaceGB = SUM(SAU.total_pages - SAU.used_pages) * 8 / 1048576.0,
TotalSpaceKB = SUM(SAU.total_pages) * 8,
UsedSpaceKB = SUM(SAU.used_pages) * 8,
UnusedSpaceKB = SUM(SAU.total_pages - SAU.used_pages) * 8
FROM sys.objects SO
INNER JOIN sys.schemas S ON S.schema_id = SO.schema_id
INNER JOIN sys.internal_tables SIT ON SIT.parent_object_id = SO.object_id
INNER JOIN sys.partitions SP ON SP.object_id = SIT.object_id
INNER JOIN sys.allocation_units SAU ON (SAU.type IN (1, 3)
AND SAU.container_id = SP.hobt_id)
OR (SAU.type = 2
AND SAU.container_id = SP.partition_id)
WHERE S.name = 'schema'
--AND SO.name IN ('TableName')
GROUP BY GROUPING SETS(
(S.name,
SO.name,
SIT.internal_type_desc),
(S.name, SO.name), (S.name), ())
ORDER BY S.name,
SO.name,
SIT.internal_type_desc;
This will generally give numbers higher than sys.fulltext_index_fragments, but when combined with the sys.partitions of the table, it adds up to the numbers returned by EXEC sys.sp_spaceused @objname = N'schema.TableName';
Tested with SQL Server 2016, but documentation says it should be present since 2008.
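If you want the base-table numbers to combine with the query above, a sketch along the same lines (reusing the 'schema' / 'TableName' placeholders) would be:

SELECT S.name,
       SO.name,
       UsedSpaceKB = SUM(SAU.used_pages) * 8,
       TotalSpaceKB = SUM(SAU.total_pages) * 8
FROM sys.objects SO
INNER JOIN sys.schemas S ON S.schema_id = SO.schema_id
INNER JOIN sys.partitions SP ON SP.object_id = SO.object_id
INNER JOIN sys.allocation_units SAU ON (SAU.type IN (1, 3)
                                        AND SAU.container_id = SP.hobt_id)
                                    OR (SAU.type = 2
                                        AND SAU.container_id = SP.partition_id)
WHERE S.name = 'schema'
  AND SO.name = 'TableName'
GROUP BY S.name, SO.name;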
Sorry, I am pretty much an SQL noob. This has to work in MS SQL Server, Oracle, and Sybase. In the following snippet I need to change the inner join between IJ and KL on IJ.PO_id = KL.PO_id into a left join, also on IJ.PO_id = KL.PO_id, so I believe I have to re-factor this. Implicit joins are not the most readable, at least in my co-worker's eyes; I guess I will agree until I develop my own taste. Sorry, I mangled the table and field names just in case.
/* #IJ_id is an input stored proc parameter. */
from AB,
CD,
EF,
GH,
IJ,
KL
where
EF.EF_id = IJ.EF_id and
IJ.EF_id = AB.EF_id and
EF.ZY_id = IJ.ZY_id and
IJ.ZY_id = AB.ZY_id and
IJ.IJ_id = AB.IJ_id and
IJ.IJ_id = #IJ_id and
EF.XW_id = GH.GH_id and
AB.VU_code = CD.VU_code and
IJ.TS > 0 and
IJ.RQ = 0 and
EF.RQ = 0 and
AB.RQ = 0 and
IJ.PO_id = KL.PO_id;
Now, my difficulty is that there is a lot going on in the where clause. Things that do not look like a.b = c.d will remain in the where clause, but not all the things that do look like a.b = c.d are easy to convert into an explicit join. The difficult part is that ideally the conditions would be between neighbors - AB+CD, CD+EF, EF+GH, GH+IJ, IJ+KL - but they are not that organized right now. I could re-order some, but ultimately I do not want to forget my goal: I want the new query to be no slower, and I want it to be no less readable. It seems that I might be better off hacking just the part that I need to change and leaving the rest mostly the same. I am not sure if I can do that.
If you understood my intent, please suggest a better query. If you did not, then please tell me how I can improve the question. Thanks.
I think it should be something like this:
FROM AB
JOIN CD ON AB.VU_code = CD.VU_code
JOIN IJ ON IJ.EF_id = AB.EF_id AND IJ.ZY_id = AB.ZY_id AND IJ.IJ_id = AB.IJ_id
JOIN EF ON EF.EF_id = IJ.EF_id AND EF.ZY_id = IJ.ZY_id
JOIN GH ON EF.XW_id = GH.GH_id
LEFT JOIN KL ON IJ.PO_id = KL.PO_id -- left join, as you require for KL
WHERE
IJ.IJ_id = #IJ_id AND
IJ.TS > 0 AND
IJ.RQ = 0 AND
EF.RQ = 0 AND
AB.RQ = 0
I have tried to arrange the tables such that the following rules hold:
Every join condition mentions the new table that it is joining on one side.
No table is mentioned in a join condition if that table has not been joined yet.
Conditions where one of the operands is a constant are left as a WHERE condition.
The last rule is a difficult one - it is not possible to tell from your mangled names whether a condition ought to be part of a join or part of the where clause. Both will give the same result for an INNER JOIN. Whether the condition should be part of the join or part of the where clause depends on the semantics of the relationship between the tables.
You need to consider each condition on a case-by-case basis:
Does it define the relationship between the two tables? Put it in the JOIN.
Is it a filter on the results? Put it in the WHERE clause.
Some guidelines:
A condition that includes a parameter from the user is unlikely to be something that should be moved to a join.
Inequalities are not usually found in join conditions.
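To illustrate why the placement matters once the KL join becomes a LEFT JOIN (which is your goal), suppose KL had a flag column such as KL.RQ - hypothetical here, mirroring the RQ columns on the other tables:

-- Condition in the ON clause: every IJ row is kept; the KL columns are NULL
-- where no KL row with RQ = 0 matches
SELECT IJ.IJ_id, KL.PO_id
FROM IJ
LEFT JOIN KL ON IJ.PO_id = KL.PO_id AND KL.RQ = 0;

-- Condition in the WHERE clause: IJ rows with no matching KL row are filtered
-- out (KL.RQ is NULL there), silently turning the LEFT JOIN back into an inner join
SELECT IJ.IJ_id, KL.PO_id
FROM IJ
LEFT JOIN KL ON IJ.PO_id = KL.PO_id
WHERE KL.RQ = 0;

For the INNER JOINs above, either placement returns the same rows, which is why these can only be guidelines.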
It couldn't possibly get any less readable than the example you gave...
from AB a
join CD c on a.VU_Code = c.VU_Code
join EF e on a.EF_id = e.EF_id and e.RQ = 0
join GH g on e.XW_id = g.GH_id
join IJ i on a.IJ_id = i.IJ_id and e.EF_id = i.EF_id
and a.EF_id = i.EF_id and e.ZY_id = i.ZY_id
and a.ZY_id = i.ZY_id and i.TS > 0 and i.RQ = 0
LEFT join KL k on i.PO_id = k.PO_id
where
i.IJ_id = #IJ_id and
a.RQ = 0
Use:
FROM AB t1
JOIN CD t2 ON t2.VU_code = t1.VU_code
JOIN IJ t5 ON t5.ZY_id = t1.ZY_id
          AND t5.IJ_id = t1.IJ_id
          AND t5.EF_id = t1.EF_id
          AND t5.IJ_id = #IJ_id
          AND t5.TS > 0
          AND t5.RQ = 0
JOIN EF t3 ON t3.ef_id = t5.ef_id
          AND t3.zy_id = t5.zy_id
          AND t3.RQ = 0
JOIN GH t4 ON t4.gh_id = t3.xw_id
JOIN KL t6 ON t6.po_id = t5.po_id -- Add LEFT before JOIN for LEFT JOIN
WHERE t1.RQ = 0
They're aliased in the sequence of the original ANSI-89 syntax, but the order is adjusted due to alias reference - can't reference a table alias before it's been defined.
This is ANSI-92 JOIN syntax - there's no performance benefit, but it does mean that OUTER join syntax is consistent. Just have to add LEFT before the "JOIN KL ..." to turn that into a LEFT JOIN.