N1QL query dropping records after join with a subquery

The query below drops records when I join two N1QL subqueries.
We are using Couchbase with N1QL queries.
Full query:
select
t3.appName,
t3.uuid_proj as uuid,
t3.description,
t3.env,
t3.productStatus
from
( select
t1.uuid as uuid_proj ,
t1.appName as appName ,
t1.description as description,
t2.env as env,
t2.productStatus as productStatus
from
(
select
api_external.uuid ,
api_external.data.appName ,
api_external.data.description
from `api_external`
where type = 'partnerApp'
and data.companyId = '70a149da27cc425da86cba890bf5b143' )t1
join
(
select
api_external.data.env,
api_external.data.productStatus,
api_external.data.partnerAppId
from
`api_external`
where type = 'integration' )t2
on t1.uuid = t2.partnerAppId
) as t3
join (
select t4.uuid as uuid_agg , min(t5.env) as env
from
(select api_external.uuid from `api_external` where type = 'partnerApp' and data.companyId = '70a149da27cc425da86cba890bf5b143' )as t4 join
(select api_external.data.env, api_external.data.partnerAppId from `api_external` where type = 'integration' ) as t5
on t4.uuid = t5.partnerAppId
group by t4.uuid
) as t6
on
t3.uuid_proj = t6.uuid_agg and t3.env = t6.env
As you can see, it joins two subqueries.
The first subquery, grouped by UUID, returns 16 records:
select
t1.uuid as uuid_proj
from
(
select
api_external.uuid ,
api_external.data.appName ,
api_external.data.description
from `api_external`
where type = 'partnerApp'
and data.companyId = '70a149da27cc425da86cba890bf5b143' )t1
join
(
select
api_external.data.env,
api_external.data.productStatus,
api_external.data.partnerAppId
from
`api_external`
where type = 'integration' )t2
on t1.uuid = t2.partnerAppId
group by t1.uuid
The other subquery also returns 16 records:
select t4.uuid as uuid_agg , min(t5.env) as env
from
(select api_external.uuid from `api_external` where type = 'partnerApp' and data.companyId = '70a149da27cc425da86cba890bf5b143' )as t4 join
(select api_external.data.env, api_external.data.partnerAppId from `api_external` where type = 'integration' ) as t5
on t4.uuid = t5.partnerAppId
group by t4.uuid
Logically, joining the two queries on the same grain (UUID) should also return 16 records, but it returns only 1.
What am I doing wrong? Please help.

The query nests many subqueries, which is what triggers the issue.
Try the following simplified version:
CREATE INDEX ix1 ON api_external(data.companyId, uuid, data.appName, data.description) WHERE type = "partnerApp";
CREATE INDEX ix2 ON api_external(data.partnerAppId, data.env, data.productStatus) WHERE type = "integration";
WITH ct3 AS (SELECT t1.uuid, t1.data.appName, t1.data.description,
t2.data.env, t2.data.productStatus
FROM api_external AS t1
JOIN api_external AS t2 ON t1.uuid = t2.data.partnerAppId
WHERE t1.type = "partnerApp"
AND t1.data.companyId = "70a149da27cc425da86cba890bf5b143"
AND t2.type = "integration"
AND t2.data.partnerAppId IS NOT NULL),
ct6 AS ( SELECT t4.uuid AS uuid_agg , MIN(t5.data.env) AS env
FROM api_external AS t4
JOIN api_external AS t5 ON t4.uuid = t5.data.partnerAppId
WHERE t4.type = "partnerApp"
AND t4.data.companyId = "70a149da27cc425da86cba890bf5b143"
AND t5.type = "integration"
AND t5.data.partnerAppId IS NOT NULL
GROUP BY t4.uuid)
SELECT t3.*
FROM ct3 AS t3
JOIN ct6 AS t6 ON t3.uuid = t6.uuid_agg and t3.env = t6.env;
If that returns the same results, the following also works. After the JOIN, it returns all the fields of the record with the MIN env in each group:
SELECT m[1].*
FROM api_external AS t4
JOIN api_external AS t5 ON t4.uuid = t5.data.partnerAppId
WHERE t4.type = "partnerApp"
AND t4.data.companyId = '70a149da27cc425da86cba890bf5b143'
AND t5.type = "integration"
AND t5.data.partnerAppId IS NOT NULL
GROUP BY t4.uuid
LETTING m = MIN([t5.data.env, {t4.uuid, t4.data.appName, t4.data.description,
t5.data.env, t5.data.productStatus}]);
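The last query relies on N1QL comparing arrays element by element, so MIN over [env, {...}] picks the pair with the smallest env, and m[1] is the object that travelled with it. A minimal Python sketch of that idea, next to the aggregate-then-join-back form of the first query. The rows are invented for illustration and are not taken from the real bucket:

```python
# Hypothetical joined rows (partnerApp x integration); all values are made up.
joined = [
    {"uuid": "app-1", "appName": "Alpha", "env": "prod", "productStatus": "live"},
    {"uuid": "app-1", "appName": "Alpha", "env": "dev", "productStatus": "draft"},
    {"uuid": "app-2", "appName": "Beta", "env": "test", "productStatus": "draft"},
]


def min_env_by_join(rows):
    """Aggregate MIN(env) per uuid, then join back on (uuid, env)."""
    min_env = {}
    for row in rows:
        current = min_env.get(row["uuid"])
        if current is None or row["env"] < current:
            min_env[row["uuid"]] = row["env"]
    return [row for row in rows if row["env"] == min_env[row["uuid"]]]


def min_env_single_pass(rows):
    """Keep the whole row that carries the smallest env for each uuid."""
    best = {}
    for row in rows:
        current = best.get(row["uuid"])
        if current is None or row["env"] < current["env"]:
            best[row["uuid"]] = row
    return list(best.values())


by_join = min_env_by_join(joined)
single_pass = min_env_single_pass(joined)
print(by_join == single_pass)
```

The two forms agree as long as each uuid has a single row with the minimum env. With ties, the join-back form returns every tied row, while the single-pass form keeps one row per uuid.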

Related

Prisma ORM Generated SQL Performance

I have a query that I've written in Prisma. It basically connects 7 tables, each of which only has maybe 2-4 rows. Yes, 2-4 rows, that's it.
The query that Prisma generates can take upwards of 2-3 minutes to run.
When I write the query directly in SQL, it runs in 1ms.
So, the query I write looks like :
SELECT *
FROM rubricCellAssessment
INNER JOIN rubricAssessment ON rubricCellAssessment.rubricAssessmentId = rubricAssessment.id
INNER JOIN rubricCell ON rubricCellAssessment.rubricCellId = rubricCell.id
INNER JOIN criteriaDescriptor ON rubricCell.criteriaDescriptorId = criteriaDescriptor.id
INNER JOIN criteria ON criteriaDescriptor.criteriaId = criteria.id
INNER JOIN skill ON criteria.skillId = skill.id
INNER JOIN user AS student ON rubricAssessment.studentId = student.id
INNER JOIN user AS assessor ON rubricAssessment.assessorId = assessor.id
LEFT JOIN studentContentBase ON rubricAssessment.studentContentBaseId = studentContentBase.id
WHERE student.userTypeId = 1
AND assessor.userTypeId IN (2, 1000, 3)
AND skill.id = 'bd47bfff-760c-4fec-86d2-1063bbd89d63'
AND criteriaDescriptor.criteriaId = criteria.id
AND rubricCell.criteriaDescriptorId = criteriaDescriptor.id;
The query prisma produces looks like this:
select
`lp-mvp`.`RubricCellAssessment`.`id`,
`lp-mvp`.`RubricCellAssessment`.`createdAt`,
`lp-mvp`.`RubricCellAssessment`.`updatedAt`,
`lp-mvp`.`RubricCellAssessment`.`assessmentValue`,
`lp-mvp`.`RubricCellAssessment`.`rubricCellId`,
`lp-mvp`.`RubricCellAssessment`.`rubricAssessmentId`
from
`lp-mvp`.`RubricCellAssessment`
where
((`lp-mvp`.`RubricCellAssessment`.`id`) in (
select
`t0`.`id`
from
`lp-mvp`.`RubricCellAssessment` as `t0`
inner join `lp-mvp`.`RubricAssessment` as `j0` on
(`j0`.`id`) = (`t0`.`rubricAssessmentId`)
where
((`j0`.`id`) in (
select
`t1`.`id`
from
`lp-mvp`.`RubricAssessment` as `t1`
inner join `lp-mvp`.`User` as `j1` on
(`j1`.`id`) = (`t1`.`studentId`)
where
(`j1`.`userTypeId` = 1
and `t1`.`id` is not null))
and `t0`.`id` is not null))
and (`lp-mvp`.`RubricCellAssessment`.`id`) in (
select
`t0`.`id`
from
`lp-mvp`.`RubricCellAssessment` as `t0`
inner join `lp-mvp`.`RubricCell` as `j0` on
(`j0`.`id`) = (`t0`.`rubricCellId`)
where
((`j0`.`id`) in (
select
`t1`.`id`
from
`lp-mvp`.`RubricCell` as `t1`
inner join `lp-mvp`.`CriteriaDescriptor` as `j1` on
(`j1`.`id`) = (`t1`.`criteriaDescriptorId`)
where
((`j1`.`id`) in (
select
`t2`.`id`
from
`lp-mvp`.`CriteriaDescriptor` as `t2`
inner join `lp-mvp`.`Criteria` as `j2` on
(`j2`.`id`) = (`t2`.`criteriaId`)
where
((`j2`.`id`) in (
select
`t3`.`id`
from
`lp-mvp`.`Criteria` as `t3`
inner join `lp-mvp`.`Skill` as `j3` on
(`j3`.`id`) = (`t3`.`skillId`)
where
(`j3`.`id` = "bd47bfff-760c-4fec-86d2-1063bbd89d63"
and `t3`.`id` is not null))
and `t2`.`id` is not null))
and `t1`.`id` is not null))
and `t0`.`id` is not null))
and (`lp-mvp`.`RubricCellAssessment`.`id`) in (
select
`t0`.`id`
from
`lp-mvp`.`RubricCellAssessment` as `t0`
inner join `lp-mvp`.`RubricAssessment` as `j0` on
(`j0`.`id`) = (`t0`.`rubricAssessmentId`)
where
((`j0`.`id`) in (
select
`t1`.`id`
from
`lp-mvp`.`RubricAssessment` as `t1`
inner join `lp-mvp`.`User` as `j1` on
(`j1`.`id`) = (`t1`.`assessorId`)
where
(`j1`.`userTypeId` in (2,1000,3)
and `t1`.`id` is not null))
and `t0`.`id` is not null))
and (`lp-mvp`.`RubricCellAssessment`.`id`) in (
select
`t0`.`id`
from
`lp-mvp`.`RubricCellAssessment` as `t0`
inner join `lp-mvp`.`RubricCell` as `j0` on
(`j0`.`id`) = (`t0`.`rubricCellId`)
where
((`j0`.`id`) in (
select
`t1`.`id`
from
`lp-mvp`.`RubricCell` as `t1`
inner join `lp-mvp`.`CriteriaDescriptor` as `j1` on
(`j1`.`id`) = (`t1`.`criteriaDescriptorId`)
where
((`j1`.`id`) in (
select
`t2`.`id`
from
`lp-mvp`.`CriteriaDescriptor` as `t2`
inner join `lp-mvp`.`Criteria` as `j2` on
(`j2`.`id`) = (`t2`.`criteriaId`)
where
((`j2`.`id`) in (
select
`t3`.`id`
from
`lp-mvp`.`Criteria` as `t3`
inner join `lp-mvp`.`Skill` as `j3` on
(`j3`.`id`) = (`t3`.`skillId`)
where
(`j3`.`id` = "bd47bfff-760c-4fec-86d2-1063bbd89d63"
and `t3`.`id` is not null))
and `t2`.`id` is not null))
and `t1`.`id` is not null))
and `t0`.`id` is not null)))
Is there any way to optimize this (other than use RAWsql?)? I'm surprised Prisma's code generation is this bad, but I'd rather not move away from it at this point if possible.
All help gratefully accepted.

View query without sub selecting T-SQL

I'm trying to build a view query, but I keep failing when using only joins, so I ended up with this deformed query. Any tips on how I can write it without the 6 subselects?
FeeSum and PaymentSum can be NULL; ideally I don't want those rows in my result set, and I also don't want rows where FeeSum and PaymentSum are equal.
Quick note: client is the table where the clients' information is stored (name, address, etc.),
customer has an FK on client and is a kind of shell table that stores more information for the client,
payment is a list of all payments a customer made,
order is a list of all orders a customer placed.
The goal is to get a list where we can track which customer has open fees to pay, based on the orders. It's a legacy project so don't ask why people can order before paying :)
SELECT
cu.Id as [CustomerId]
, CASE
WHEN cl.IsPerson = 1
THEN cl.[AdditionalName] + ' ' + cl.[Name]
ELSE cl.AdditionalName
END as [Name]
, cl.CustomerNumber
, (SELECT SUM(o.Fee) FROM [publication].[Order] o WHERE o.[State] = 2 AND o.CustomerId = cu.Id) as [FeeSum]
, (SELECT SUM(p.Amount) FROM [publication].[Payment] p WHERE p.CustomerId = cu.Id) as [PaymentSum]
, (SELECT MAX(o.OrderDate) FROM [publication].[Order] o WHERE o.[State] = 2 AND o.CustomerId = cu.Id) as [LastOrderDate]
, (SELECT MAX(p.PaymentDate) FROM [publication].[Payment] p WHERE p.CustomerId = cu.Id) as [LastPaymentDate]
, (SELECT MAX(f.Created) FROM [client].[File] f WHERE f.TemplateName = 'Reminder' AND f.ClientId = cl.Id) as [LastReminderDate]
, (SELECT MAX(f.Created) FROM [client].[File] f WHERE f.TemplateName = 'Warning' AND f.ClientId = cl.Id) as [LastWarningDate]
FROM
[publication].[Customer] cu
JOIN
[client].[Client] cl
ON cl.Id = cu.ClientId
WHERE
cu.[Type] = 0
Thanks in advance and I hope I didn't do anything wrong.
Kind regards

You could rewrite the correlated subqueries to instead use joins:
SELECT
cu.Id AS [CustomerId],
CASE WHEN cl.IsPerson = 1
THEN cl.[AdditionalName] + ' ' + cl.[Name]
ELSE cl.AdditionalName END AS [Name],
cl.CustomerNumber,
o.FeeSum,
p.PaymentSum,
o.LastOrderDate,
p.LastPaymentDate,
f.LastReminderDate,
f.LastWarningDate
FROM [publication].[Customer] cu
INNER JOIN [client].[Client] cl
ON cl.Id = cu.ClientId
INNER JOIN
(
SELECT CustomerId, SUM(Fee) AS [FeeSum], MAX(OrderDate) AS [LastOrderDate]
FROM [publication].[Order]
WHERE [State] = 2
GROUP BY CustomerId
) o
ON o.CustomerId = cu.Id
INNER JOIN
(
SELECT CustomerId, SUM(Amount) AS [PaymentSum], MAX(PaymentDate) AS [LastPaymentDate]
FROM [publication].[Payment]
GROUP BY CustomerId
) p
ON p.CustomerId = cu.Id
LEFT JOIN
(
SELECT ClientId,
MAX(CASE WHEN TemplateName = 'Reminder' THEN Created END) AS [LastReminderDate],
MAX(CASE WHEN TemplateName = 'Warning' THEN Created END) AS [LastWarningDate]
FROM [client].[File]
GROUP BY ClientId
) f
ON f.ClientId = cl.Id
WHERE
cu.[Type] = 0;
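To check that pre-aggregated derived tables give the same numbers as the correlated subqueries, here is a small sketch that runs both forms against an in-memory SQLite database. The table layout and data are simplified assumptions, not the real schema, and LEFT JOIN is used so that customers without orders or payments still appear with NULLs, as they do in the original query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical stand-ins for Customer, Order and Payment.
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY);
CREATE TABLE orders (customer_id INTEGER, state INTEGER, fee REAL);
CREATE TABLE payment (customer_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1), (2), (3);
INSERT INTO orders VALUES (1, 2, 10), (1, 2, 5), (1, 1, 99), (2, 2, 20);
INSERT INTO payment VALUES (1, 15), (2, 5), (2, 5);
""")

# Original shape: one correlated scalar subquery per output column.
correlated = conn.execute("""
SELECT cu.id,
       (SELECT SUM(o.fee) FROM orders o
         WHERE o.state = 2 AND o.customer_id = cu.id),
       (SELECT SUM(p.amount) FROM payment p
         WHERE p.customer_id = cu.id)
FROM customer cu
ORDER BY cu.id
""").fetchall()

# Rewritten shape: aggregate each child table once per key, then join.
joined = conn.execute("""
SELECT cu.id, o.fee_sum, p.payment_sum
FROM customer cu
LEFT JOIN (SELECT customer_id, SUM(fee) AS fee_sum
           FROM orders WHERE state = 2 GROUP BY customer_id) o
       ON o.customer_id = cu.id
LEFT JOIN (SELECT customer_id, SUM(amount) AS payment_sum
           FROM payment GROUP BY customer_id) p
       ON p.customer_id = cu.id
ORDER BY cu.id
""").fetchall()

# Rows the question asks for: both sums present and not equal.
open_fees = [row for row in joined
             if row[1] is not None and row[2] is not None and row[1] != row[2]]
print(correlated == joined, open_fees)
```

Aggregating before joining matters: joining the raw order and payment rows first and summing afterwards would multiply rows and inflate both sums. Switching the LEFT JOINs to INNER JOINs drops customers that have no orders or no payments, which may be what the view wants.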

Avoiding Order By in T-SQL

The sample query below is part of my main query. I found that the SORT operator in it accounts for 30% of the cost.
Avoiding the SORT would require creating indexes. Is there any other way to optimize this code?
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA
WHERE ID = r.ID
AND Status = 3
AND TableA_ID >ISNULL((
SELECT TOP 1 TableA_ID
FROM TableA
WHERE ID = r.ID
AND Status <> 3
ORDER BY T_Date DESC
), 0)
ORDER BY T_Date ASC

It looks like you can use NOT EXISTS rather than the sorts. I think you'll probably get a better performance boost by using a CTE or derived table instead of a scalar subquery.
select *
from r ... left outer join
(
select ID, min(t_date) as min_date from TableA t1
where status = 3 and not exists (
select 1 from TableA t2
where t2.ID = t1.ID
and t2.status <> 3 and t2.t_date > t1.t_date
)
group by ID
) as md on md.ID = r.ID ...
or
select *
from r ... left outer join
(
select t1.ID, min(t1.t_date) as min_date
from TableA t1 left outer join TableA t2
on t2.ID = t1.ID and t2.status <> 3 and t2.t_date > t1.t_date
where t1.status = 3 and t2.ID is null
group by t1.ID
) as md on md.ID = r.ID ...
It also appears that you're relying on an identity column but it's not clear what those values mean. I'm basically ignoring it and using the date column instead.
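As a rough check of the date-based approach, the sketch below runs a NOT EXISTS query and a LEFT JOIN ... IS NULL anti-join against an in-memory SQLite table. The rows are invented, and T_Date is stored as ISO text so that string comparison orders the dates correctly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; column names follow the question.
conn.executescript("""
CREATE TABLE TableA (
    TableA_ID INTEGER PRIMARY KEY,
    ID INTEGER, Status INTEGER, T_Date TEXT);
INSERT INTO TableA (ID, Status, T_Date) VALUES
    (1, 3, '2020-01-01'),
    (1, 1, '2020-01-05'),
    (1, 3, '2020-01-10'),
    (1, 3, '2020-01-20'),
    (2, 3, '2020-02-01'),
    (2, 2, '2020-02-02');
""")

# Earliest Status = 3 row per ID that has no later row with another status.
not_exists = conn.execute("""
SELECT t1.ID, MIN(t1.T_Date) AS min_date
FROM TableA t1
WHERE t1.Status = 3
  AND NOT EXISTS (
        SELECT 1 FROM TableA t2
        WHERE t2.ID = t1.ID
          AND t2.Status <> 3
          AND t2.T_Date > t1.T_Date)
GROUP BY t1.ID
ORDER BY t1.ID
""").fetchall()

# Same filter as an anti-join: keep t1 rows that found no matching t2 row.
anti_join = conn.execute("""
SELECT t1.ID, MIN(t1.T_Date) AS min_date
FROM TableA t1
LEFT JOIN TableA t2
       ON t2.ID = t1.ID
      AND t2.Status <> 3
      AND t2.T_Date > t1.T_Date
WHERE t1.Status = 3 AND t2.ID IS NULL
GROUP BY t1.ID
ORDER BY t1.ID
""").fetchall()

print(not_exists, anti_join)
```

ID 2 produces no row here because its only Status = 3 record is followed by a record with a different status, which is why the outer query joins this derived table with a left outer join.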

Try this:
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA a1
LEFT JOIN (
SELECT ID, MAX(TableA_ID) AS MaxAID
FROM TableA
WHERE Status <> 3
GROUP BY ID
) a2 ON a2.ID = a1.ID
WHERE a1.ID = r.ID AND a1.Status = 3 AND a1.TableA_ID > COALESCE(a2.MaxAID, 0)
ORDER BY T_Date ASC
The use of TOP 1 in combination with the unexplained r alias concerns me. There's almost certainly a MUCH better way to get this data into your results that doesn't involve doing this in a subquery (unless this is for an APPLY operation).

How do I get the max from a partitioned query using Rank ()

I want the result to return the max Rank when partitioned using the rank function.
I am using the following query.
SELECT DISTINCT dbo.pomst.co_num
,dbo.pomst.wh_num
,dbo.pomst.po_number
,dbo.pomst.po_suffix
,dbo.pomst.vendor_id
,dbo.item.uom
,dbo.item.upc_num
,dbo.item.item_desc
,RIGHT(dbo.auditlog.pallet_id, 8) AS pallet_id
,dbo.auditlog.abs_num
,dbo.auditlog.item_qty
,dbo.auditlog.lot
,dbo.auditlog.packer
,auditlog.comments
,auditlog.date_time
,rank() OVER (
PARTITION BY auditlog.comments ORDER BY auditlog.date_time ASC
) AS CorrectTrans
FROM dbo.auditlog
INNER JOIN dbo.pomst ON dbo.auditlog.co_num = dbo.pomst.co_num
AND dbo.auditlog.wh_num = dbo.pomst.wh_num
AND dbo.auditlog.po_number = dbo.pomst.po_number
AND dbo.auditlog.po_suffix = dbo.pomst.po_suffix
INNER JOIN dbo.item ON dbo.auditlog.co_num = dbo.item.co_num
AND dbo.auditlog.wh_num = dbo.item.wh_num
AND dbo.auditlog.abs_num = dbo.item.abs_num
WHERE (dbo.pomst.co_num = 'AC01')
AND (dbo.pomst.wh_num = 'KU22')
AND (dbo.pomst.row_status = 'C')
AND (dbo.auditlog.trans_type = 're')
AND item_qty NOT LIKE '-%'

I figured it out! I was trying to get the max result of a rank, but if I flip the rank order from ASC to DESC and use a CTE, I can select the rows whose rank is always 1 instead of trying to get the max. I would still like to know how to get the max rank, but this solution suits my needs.
;with cte as
(SELECT DISTINCT dbo.pomst.co_num
,dbo.pomst.wh_num
,dbo.pomst.po_number
,dbo.pomst.po_suffix
,dbo.pomst.vendor_id
,dbo.item.uom
,dbo.item.upc_num
,dbo.item.item_desc
,RIGHT(dbo.auditlog.pallet_id, 8) AS pallet_id
,dbo.auditlog.abs_num
,dbo.auditlog.item_qty
,dbo.auditlog.lot
,dbo.auditlog.packer
,auditlog.comments
,auditlog.date_time
,rank() OVER (
PARTITION BY auditlog.comments ORDER BY auditlog.date_time desc
) AS CorrectTrans
FROM dbo.auditlog
INNER JOIN dbo.pomst ON dbo.auditlog.co_num = dbo.pomst.co_num
AND dbo.auditlog.wh_num = dbo.pomst.wh_num
AND dbo.auditlog.po_number = dbo.pomst.po_number
AND dbo.auditlog.po_suffix = dbo.pomst.po_suffix
INNER JOIN dbo.item ON dbo.auditlog.co_num = dbo.item.co_num
AND dbo.auditlog.wh_num = dbo.item.wh_num
AND dbo.auditlog.abs_num = dbo.item.abs_num
WHERE (dbo.pomst.co_num = 'AC01')
AND (dbo.pomst.wh_num = 'KU22')
AND (dbo.pomst.row_status = 'C')
AND (dbo.auditlog.trans_type = 're')
AND item_qty NOT LIKE '-%'
)
Select * from cte
where CorrectTrans = 1
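On the open question of getting the max rank directly: one option is a second window function, MAX(rank) OVER (PARTITION BY ...), computed over the ranked rows. The sketch below compares that with the flipped-order approach on an in-memory SQLite table. The data is invented, only three of the columns are kept, and window functions need SQLite 3.25 or later:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical audit rows with distinct timestamps per comments value.
conn.executescript("""
CREATE TABLE auditlog (comments TEXT, date_time TEXT, item_qty INTEGER);
INSERT INTO auditlog VALUES
    ('PO-1', '2021-03-01 08:00', 10),
    ('PO-1', '2021-03-01 09:30', 12),
    ('PO-2', '2021-03-02 07:15', 4),
    ('PO-2', '2021-03-02 07:45', 5),
    ('PO-2', '2021-03-02 11:00', 6);
""")

# Flipped order: the latest row in each partition gets rank 1.
desc_rank_one = conn.execute("""
WITH cte AS (
    SELECT comments, date_time, item_qty,
           RANK() OVER (PARTITION BY comments
                        ORDER BY date_time DESC) AS rnk
    FROM auditlog)
SELECT comments, date_time, item_qty
FROM cte
WHERE rnk = 1
ORDER BY comments
""").fetchall()

# Ascending order: keep rows whose rank equals the partition's max rank.
asc_max_rank = conn.execute("""
WITH ranked AS (
    SELECT comments, date_time, item_qty,
           RANK() OVER (PARTITION BY comments
                        ORDER BY date_time ASC) AS rnk
    FROM auditlog),
with_max AS (
    SELECT comments, date_time, item_qty, rnk,
           MAX(rnk) OVER (PARTITION BY comments) AS max_rnk
    FROM ranked)
SELECT comments, date_time, item_qty
FROM with_max
WHERE rnk = max_rnk
ORDER BY comments
""").fetchall()

print(desc_rank_one == asc_max_rank)
```

Both queries pick the latest row per comments value when timestamps are distinct. With tied timestamps, RANK() assigns the same rank to the tied rows, so either query can return more than one row per partition.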

Add a SELECT and GROUP BY, and use your existing query as a subquery.
Try:
select max([CorrectTrans]), Vendor_Id, Item_qty, Lot, Pallet_id
from (
-- Your existing query --
SELECT DISTINCT dbo.pomst.co_num
,dbo.pomst.wh_num
,dbo.pomst.po_number
,dbo.pomst.po_suffix
,dbo.pomst.vendor_id
,dbo.item.uom
,dbo.item.upc_num
,dbo.item.item_desc
,RIGHT(dbo.auditlog.pallet_id, 8) AS pallet_id
,dbo.auditlog.abs_num
,dbo.auditlog.item_qty
,dbo.auditlog.lot
,dbo.auditlog.packer
,auditlog.comments
,auditlog.date_time
,rank() OVER (
PARTITION BY auditlog.comments ORDER BY auditlog.date_time ASC
) AS CorrectTrans
FROM dbo.auditlog
INNER JOIN dbo.pomst ON dbo.auditlog.co_num = dbo.pomst.co_num
AND dbo.auditlog.wh_num = dbo.pomst.wh_num
AND dbo.auditlog.po_number = dbo.pomst.po_number
AND dbo.auditlog.po_suffix = dbo.pomst.po_suffix
INNER JOIN dbo.item ON dbo.auditlog.co_num = dbo.item.co_num
AND dbo.auditlog.wh_num = dbo.item.wh_num
AND dbo.auditlog.abs_num = dbo.item.abs_num
WHERE (dbo.pomst.co_num = 'AC01')
AND (dbo.pomst.wh_num = 'KU22')
AND (dbo.pomst.row_status = 'C')
AND (dbo.auditlog.trans_type = 're')
AND item_qty NOT LIKE '-%'
-- =======================================
) x
group by Vendor_id, Item_qty, Lot, Pallet_id

The multi-part identifier "..." could not be bound

I get an error (The multi-part identifier "f.FormID" could not be bound.) when running this query:
select f.FormID, f.Title, fv.UserName
from Forms f join (
SELECT FormID
FROM Reports
WHERE (ReportID = @ReportID)
UNION
SELECT FormRelations.ForigenFormID
FROM FormRelations INNER JOIN
Forms ON FormRelations.ForigenFormID = Forms.FormID
WHERE (FormRelations.PrimaryFormID =
(SELECT FormID
FROM Reports
WHERE (ReportID = @ReportID)))
) ids
on f.FormID = ids.FormID
LEFT OUTER JOIN (select top 1 UserName, FormID from FormValues where FormID = f.FormID and UserName = @UserName) fv
ON f.FormID = fv.FormID
Please, could someone help me? :(
@bluefeet:
I want a result like this:
01304636-FABE-4A3E-9487-A14B012F9A61 item_1 1234567890
C0455E97-788A-4305-876A-A15000CFE928 item_2 1234567890
7719F37E-7021-4ABD-91ED-A15301830324 item_3 1234567890

If you need to use your alias inside of your subquery like that, you might want to look at using the APPLY operator:
select f.FormID, f.Title, fv.UserName
from Forms f
join
(
SELECT FormID
FROM Reports
WHERE (ReportID = @ReportID)
UNION
SELECT FormRelations.ForigenFormID
FROM FormRelations
INNER JOIN Forms
ON FormRelations.ForigenFormID = Forms.FormID
WHERE (FormRelations.PrimaryFormID = (SELECT FormID
FROM Reports
WHERE (ReportID = @ReportID)))
) ids
on f.FormID = ids.FormID
CROSS APPLY
(
select top 1 UserName, FormID
from FormValues
where FormID = f.FormID
and UserName = @UserName
) fv
Or you can use row_number():
select f.FormID, f.Title, fv.UserName
from Forms f
join
(
SELECT FormID
FROM Reports
WHERE (ReportID = @ReportID)
UNION
SELECT FormRelations.ForigenFormID
FROM FormRelations
INNER JOIN Forms
ON FormRelations.ForigenFormID = Forms.FormID
WHERE (FormRelations.PrimaryFormID = (SELECT FormID
FROM Reports
WHERE (ReportID = @ReportID)))
) ids
on f.FormID = ids.FormID
LEFT JOIN
(
select UserName, FormID,
ROW_NUMBER() over(PARTITION by FormID, UserName order by FormID) rn
from FormValues
where UserName = @UserName
) fv
on f.FormID = fv.FormID
and fv.rn = 1
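The row_number() form works because the derived table no longer refers to the outer alias f; the per-form restriction moves into the ON clause instead. A small sketch of that shape on an in-memory SQLite database follows. The tables are cut down to the columns involved, the data is invented, the ids filter from Reports is left out, and window functions need SQLite 3.25 or later:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced versions of Forms and FormValues.
conn.executescript("""
CREATE TABLE Forms (FormID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE FormValues (FormID INTEGER, UserName TEXT, Value TEXT);
INSERT INTO Forms VALUES (1, 'item_1'), (2, 'item_2'), (3, 'item_3');
INSERT INTO FormValues VALUES
    (1, '1234567890', 'a'),
    (1, '1234567890', 'b'),
    (2, '1234567890', 'c'),
    (2, 'someone-else', 'd');
""")

# The derived table is self-contained: it numbers the user's rows per form,
# and the join keeps only the first one (rn = 1) for each FormID.
rows = conn.execute("""
SELECT f.FormID, f.Title, fv.UserName
FROM Forms f
LEFT JOIN (
    SELECT FormID, UserName,
           ROW_NUMBER() OVER (PARTITION BY FormID, UserName
                              ORDER BY FormID) AS rn
    FROM FormValues
    WHERE UserName = :user_name
) fv ON fv.FormID = f.FormID AND fv.rn = 1
ORDER BY f.FormID
""", {"user_name": "1234567890"}).fetchall()

for row in rows:
    print(row)
```

Form 1 has two matching values but appears once, and form 3 has none but is still returned with a NULL user name because of the LEFT JOIN. In SQL Server, OUTER APPLY gives the same keep-unmatched behaviour, whereas CROSS APPLY drops forms without a matching value.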