Optimizing PgSQL Query function performance - postgresql

I really need advice on the below, trying to use DB function getpreviousorders..
SELECT o1.* FROM "sample"."order" o1 JOIN "sample".patient p1 ON o1.patient_id = p1.id
WHERE o1.sample_id != _sampleId AND p1.mrn = _mrn
AND ((o1.collection_date is not null AND o1.collection_date >= _createdDate)
OR (o1.collection_date is null AND o1.receipt_date is not null
AND o1.receipt_date >= _createdDate))
UNION
(SELECT o2.* FROM "sample"."order" o2 JOIN "sample".patient p2 ON o2.patient_id = p2.id
WHERE (o2.status = 1 or o2.status = 2 OR o2.status = 3) AND o2.sample_id != _sampleId
AND p2.mrn = _mrn
AND ((o2.collection_date is not null AND o2.collection_date < _createdDate)
OR (o2.collection_date is null AND o2.receipt_date is not null
AND o2.receipt_date < _createdDate)) ORDER by created_date DESC LIMIT 1)
ORDER by collection_date DESC, receipt_date DESC, created_date DESC;
Total time to get 199 previous order is 2628ms, and all most time taken for funciton findPreviousOrders: 2535ms
Analyze Query & Code Snippets !!!

Related

Optimize Query contains join and SubQuery

I need to run this query but it takes so long and I got timeout exception.
would you please help me how can I decrease the execution time of this query or how how can I make it simpler?
here is my Postgres Query:
select
AR1.patient_id,
CONCAT(Ac."firstName", ' ', Ac."lastName") as doctor_full_name,
to_json(Ac.expertise::json->0->'id')::text as expertise_id,
to_json(Ac.expertise::json->0->'title')::text as expertise_title,
AP."phoneNumbers" as mobile,
AC.account_id as account_id,
AC.city_id
from
tb1 as AR1
LEFT JOIN tb2 as AA
on AR1.appointment_id = AA.id
LEFT JOIN tb3 as AC
on AC.account_id = AA.appointment_owner_id
LEFT JOIN tb4 as AP
on AP.id = AR1.patient_id
where AR1.status = 'canceled'
and AR1.updated_at >= '2022-12-30 00:00:00'
and AR1.updated_at < '2022-12-30 23:59:59'
and AP."phoneNumbers" <> ''
and patient_id not in (
select
AR2.patient_id
from
tb1 as AR2
LEFT JOIN tb2 as AA2
on AR2.appointment_id = AA2.id
LEFT JOIN tb3 as AC2
on AC2.account_id = AA2.appointment_owner_id
where AR2.status = 'submited'
and AR2.created_at >= '2022-12-30 00:00:00'
and ( to_json(Ac2.expertise::json->0->'id')::text = to_json(Ac.expertise::json->0->'id')::text or ac2.account_id = ac.account_id )
)
Try creating an index on tb1 to handle the WHERE clauses you use in your outer query.
CREATE INDEX status_updated ON tb1
(status, updated_at, patient_id);
And, create this index to handle your inner query.
CREATE INDEX status_created ON tb1
(status, created_at, patient_id);
These work because the query planner can random access these BTREE indexes to the the first eligible row by status and date, and then sequentially scan the index until the last eligible row.
The comments about avoiding f(column) expressions in WHERE and ON conditions are correct. You want those conditions to be sargable whenever possible.
And, by the way, you want this for a datestamp range
and AR1.updated_at >= '2022-12-30 00:00:00'
and AR1.updated_at < '2022-12-31 00:00:00'
You have
and AR1.updated_at >= '2022-12-30 00:00:00'
and AR1.updated_at < '2022-12-30 23:59:59'
which excludes, rather than includes, rows from the last moment of 2022-12-30. In can be very hard to figure out what went wrong if you exclude a row improperly with a date-range off-by-one-error. (Ask how I know this sometime :-)

Convert query from MySQL to PostgreSQL: unlucky employees

I am trying to get this query for a database question( Unlucky wmployees) to Postgres. I am having a hard time getting it right
CREATE PROCEDURE unluckyEmployees()
BEGIN
SET #rn =0;
SELECT dep_name, emp_number, total_salary FROM
(SELECT dep_name, emp_number, total_salary, (#rn := #rn + 1) as seqnum FROM
(SELECT name AS dep_name, IF(e.id IS NULL, 0, COUNT(*)) AS emp_number, IFNULL(SUM(salary), 0) AS total_salary
FROM Department d LEFT JOIN Employee e ON e.Department = d.id
GROUP BY d.id HAVING COUNT(*) < 6 ORDER BY SUM(salary) DESC, COUNT(*) DESC, d.id) t )tt WHERE mod(seqnum, 2) = 1;
END
I am hoping to get the query in postgres. Tried setting #rn to row_number() but still not working.

How to append CTE results to main query output?

I've created a TSQL query that pulls from two sets of tables in my database. The tables in the Common Table Expression are different from the tables in the main query. I'm joining on MRN and need the end result to contain accounts from both sets of tables. I've written the following query to this end:
with cteHosp as(
select Distinct p.EncounterNumber, p.MRN, p.AdmitAge
from HospitalPatients p
inner join Eligibility e on p.MRN = e.MRN
inner join HospChgDtl c on p.pt_id = c.pt_id
inner join HospitalDiagnoses d on p.pt_id = d.pt_id
where p.AdmitAge >=12
and d.dx_cd in ('G89.4','R52.1','R52.2','Z00.129')
)
Select Distinct a.AccountNo, a.dob, DATEDIFF(yy, a.dob, GETDATE()) as Age
from RHCCPTDetail c
inner join RHCAppointments a on c.ClaimID = a.ClaimID
inner join Eligibility e on c.hl7Id = e.MRN
full outer join cteHosp on e.MRN = cteHosp.MRN
where DATEDIFF(yy, a.dob, getdate()) >= 12
and left(c.PriDiag,7) in ('G89.4','R52.1','R52.2', 'Z00.129')
or (
DATEDIFF(yy, a.dob, getdate()) >= 12
and LEFT(c.DiagCode2,7) in ('G89.4','R52.1','R52.2','Z00.129')
)
or (
DATEDIFF(yy, a.dob, getdate()) >= 12
and LEFT(c.DiagCode3,7) in ('G89.4','R52.1','R52.2','Z00.129')
)
or (
DATEDIFF(yy, a.dob, getdate()) >= 12
and LEFT(c.DiagCode4,7) in ('G89.4','R52.1','R52.2','Z00.129')
)
order by AccountNo
How do I merge together the output of both the common table expression and the main query into one set of results?
Merge performs inserts, updates or deletes. I believe you want to join the cte. If so, here is an example.
Notice the cteBatch is joined to the Main query below.
with
cteBatch (BatchID,BatchDate,Creator,LogID)
as
(
select
BatchID
,dateadd(day,right(BatchID,3) -1,
cast(cast(left(BatchID,4) as varchar(4))
+ '-01-01' as date)) BatchDate
,Creator
,LogID
from tblPriceMatrixBatch b
unpivot
(
LogID
for Logs in (LogIDISI,LogIDTG,LogIDWeb)
)u
)
Select
0 as isCurrent
,i.InterfaceID
,i.InterfaceName
,b.BatchID
,b.BatchDate
,case when isdate(l.start) = 0 and isdate(l.[end]) = 0 then 'Scheduled'
when isdate(l.start) = 1 and isdate(l.[end]) = 0 then 'Running'
when isdate(l.start) = 1 and isdate(l.[end]) = 1 and isnull(l.haserror,0) = 1 then 'Failed'
when isdate(l.start) = 1 and isdate(l.[end]) = 1 and isnull(l.haserror,0) != 1 then 'Success'
else 'idunno' end as stat
,l.Start as StartTime
,l.[end] as CompleteTime
,b.Creator as Usr
from EOCSupport.dbo.Interfaces i
join EOCSupport.dbo.Logs l
on i.InterfaceID = l.InterfaceID
join cteBatch b
on b.logid = l.LogID

count using subqueries in T-sql

The following is my query. I need to get the count of the doctor's visits for each patient in the query. The count isn't right and it's printing 2 rows for each patient.
SELECT
pf.PatientId
, p.Visit
, pf.first
, pf.last
, df.first
, df.last
, doc.reconcile_status
, doc.orderid
, count(p.visit)
FROM [CentricityPS].[dbo].[PatientVisit] p
, [CentricityPS].[dbo].[document] doc
, [CentricityPS].[dbo].[Patientprofile] pf
, [CentricityPS].[dbo].[doctorfacility] df
where df.pvid in ('1507023132004420', '1725527248154950', '1406648461000690')
and p.doctorid = df.DoctorFacilityId
and p.patientprofileid = pf.patientprofileid
and pf.pid = doc.pid
and pf.patientstatusmid = '-900'
and pf.PatientProfileId = p.PatientProfileId
-- and pf.PatientId = '8145'
-- and p.visit >= '2016-01-01' and p.visit <= '2016-07-01'
and not exists (select * from [CentricityPS].[dbo].[PatientVisit] p
where (p.visit > '2013-01-01' and p.visit < = '2016-01-01')
and p.patientprofileid = pf.patientprofileid and pf.patientstatusmid not in (-901) )
and not exists (select * from [CentricityPS].[dbo].[PatientVisit] p
where p.visit <= '2013-01-01'
and p.patientprofileid = pf.patientprofileid and pf.patientstatusmid not in (-901) )
-- and pf.patientid = '100293'
group by df.DoctorFacilityId, pf.PatientId, p.visit, pf.first, pf.last, df.first, df.last, doc.RECONCILE_STATUS, doc.ORDERID, p.PatientProfileId
order by df.doctorfacilityid, pf.patientid, p.visit desc
What am I doing wrong?
Help!!!
You're grouping on too many fields. If you just need the count of doctor's visits for each patient, only SELECT PatientProfile fields along with count(p.visit) and just include the same PatientProfile fields in the GROUP BY.

Replace Subselect for something more efficient

I have this query which takes a long time, partly because the number of records in the table excedd 500 000 records, but the join I have to use slows it down quite a lot, at least to my beliefs
SELECT TOP (10) PERCENT H1.DateCompteur, CASE WHEN (h1.cSortie - h2.cSortie > 0)
THEN h1.cSortie - h2.cSortie ELSE 0 END AS Compte, H1.IdMachine
FROM dbo.T_HistoriqueCompteur AS H1 INNER JOIN
dbo.T_HistoriqueCompteur AS H2 ON H1.IdMachine = H2.IdMachine AND H2.DateCompteur =
(SELECT MAX(DateCompteur) AS Expr1
FROM dbo.T_HistoriqueCompteur AS HS
WHERE (DateCompteur < H1.DateCompteur) AND (H1.IdMachine = IdMachine))
ORDER BY H1.DateCompteur DESC
The order by is important since I need only the most recent informations. I tried using the ID field in my sub select since they are ordred by date but could not detect any significant improvement.
SELECT TOP (10) PERCENT H1.DateCompteur, CASE WHEN (h1.cSortie - h2.cSortie > 0)
THEN h1.cSortie - h2.cSortie ELSE 0 END AS Compte, H1.IdMachine
FROM dbo.T_HistoriqueCompteur AS H1 INNER JOIN
dbo.T_HistoriqueCompteur AS H2 ON H1.IdMachine = H2.IdMachine AND H2.ID =
(SELECT MAX(ID) AS Expr1
FROM dbo.T_HistoriqueCompteur AS HS
WHERE (ID < H1.ID) AND (H1.IdMachine = IdMachine))
ORDER BY H1.DateCompteur DESC
the table I use look a little like this (I got much more columns but they are unused in this query).
ID bigint
IdMachine bigint
cSortie bigint
DateCompteur datetime
I think that if I could get rid of the sub select, my query would run much faster but I can't really find a way to do so. What I really want to do is to find the previous row with the same IdMachine so that I can calculate the difference between the two cSortie values. The case in the query is because something it's reseted to 0 and in this case, I want to return 0 instead of a negative value.
So my question is this : Can I do better than what I already have ??? I plan to put this in a view if that makes a difference.
Try this query
WITH T as
(
SELECT TOP (10) PERCENT H1.DateCompteur, H1.cSortie as cSortie1, H1.IdMachine,
(
SELECT TOP 1 H2.cSortie
FROM dbo.T_HistoriqueCompteur H2
WHERE (H2.DateCompteur < H1.DateCompteur) AND (H1.IdMachine = H2.IdMachine)
ORDER BY H2.DateCompteur DESC
) as cSortie2
FROM dbo.T_HistoriqueCompteur AS H1
ORDER BY H1.DateCompteur DESC
)
select DateCompteur,
CASE WHEN (cSortie1 - cSortie2 > 0)
THEN cSortie1 - cSortie2
ELSE 0 END
AS Compte,
IdMachine
FROM T
You could also try CTE's (common table expressions) with windowing functions (ROW_NUMBER):
;WITH CTE AS
(
SELECT ID,IdMachine,cSortie,ROW_NUMBER() OVER(PARTITION BY h.IdMachine ORDER BY ID ASC) AS [ROW]
FROM T_HistoriqueCompteur h
)
SELECT
TOP (10) PERCENT
H1.DateCompteur,
CASE WHEN (h1.cSortie - h2.cSortie > 0) THEN h1.cSortie - h2.cSortie
ELSE 0
END AS Compte,
H1.IdMachine
FROM dbo.T_HistoriqueCompteur AS H1
INNER JOIN CTE cte on cte.idmachine = h1.idmachine and cte.id = h1.id
INNER JOIN CTE h2 on h2.idmachine = cte.idmachine and h2.row + 1 = cte.row
ORDER BY H1.DateCompteur DESC