PostgreSQL inner join duplicating some records

I have a large query written as a chain of CTEs. In some parts I have to total secondary tables, and I use inner joins to keep the number of rows processed small. Two of the subqueries are almost identical, yet one works and the other counts some of the totalled records up to 8 times.
I need to use inner joins, otherwise the response time goes up by 15x or more.
with
p0 as (select distinct on (pventa) pventa, p.tipo tpva from lecturas l
left join puntoventa p on l.pventa=p.numero where dia between '2017-10-01' and '2017-10-31' and p.tipo in ('A','E')),
r1 as (select p.tpva, l.pventa, dia, turno from lecturas l
inner join p0 p on p.pventa=l.pventa
where dia between '2017-10-01' and '2017-10-31'),
p1 as (select pva, remision, sum(abono), count(abono) from pagosremisiones p
inner join movsgas m on p.pva=m.pventa and p.remision=m.folio
inner join r1 r on r.pventa=m.pventa and r.dia=m.dia and r.turno=m.turno group by 1,2 order by 1,2 ),
f1 as (select c.serie, c.factura, sum(abono), count(abono) from chequefactura c
inner join movsgas m on c.serie=m.serie and c.factura=m.factura
inner join r1 r on r.pventa=m.pventa and r.dia=m.dia and r.turno=m.turno group by 1,2 order by 1,2 )
select * from p1
nprem and ncheck are the count(abono) columns of p1 and f1 respectively; they are only there for debugging.
Both p1 and f1 depend on r1. p1 works (as far as I have tested) without duplicate records: nprem matches the existing records. In f1, however, ncheck is inflated on some records, up to 8 times the actual value.
I'm not sure whether p1's correct results are purely coincidental, and I don't know how to get rid of the duplicates in f1.
I do have the alternative of using direct subqueries, but I have a didactic interest in using joins.
By the way, so far the direct subqueries are much more efficient than the joins, possibly because the joins are poorly structured.
What am I doing wrong?
What would you do to optimize the code?
Thanks in advance
Jose

The trick was the new subquery r2, which uses distinct on (serie, factura); if I omit it, the error persists. The number of duplicates in r2 does not correspond to the number of duplicates in f1, so I had no idea where so many came from. Thank you all, and again my apologies for the terrible description of my problem.
with
p0 as (select distinct on (pventa) pventa, p.tipo tpva from lecturas l
left join puntoventa p on l.pventa=p.numero where dia between '2017-10-01' and '2017-10-31' and p.tipo in ('A','E')),
r1 as (select p.tpva, l.pventa, dia, turno from lecturas l
inner join p0 p on p.pventa=l.pventa
where dia between '2017-10-01' and '2017-10-31'),
r2 as (select distinct on (serie, factura) m.serie,m.factura from movsgas m
inner join chequefactura c on c.serie=m.serie and c.factura=m.factura
inner join r1 r on r.pventa=m.pventa and r.dia=m.dia and r.turno=m.turno),
p1 as (select pva, remision, sum(abono) payp from pagosremisiones p
inner join movsgas m on p.pva=m.pventa and p.remision=m.folio
inner join r1 r on r.pventa=m.pventa and r.dia=m.dia and r.turno=m.turno group by 1,2 order by 1,2 ),
f1 as (select c.serie, c.factura, sum(abono) payfr2, count(*) from chequefactura c
inner join r2 r on r.serie=c.serie and r.factura=c.factura group by 1,2 order by 1,2 )
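
One plausible reading of why the distinct step helps (the thread does not confirm the actual data): if several movsgas rows share the same (serie, factura), joining chequefactura to movsgas repeats each payment once per matching row, so sum and count are inflated. The sketch below reproduces that on SQLite with heavily simplified tables and invented sample rows; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chequefactura (serie TEXT, factura INTEGER, abono REAL);
    CREATE TABLE movsgas (serie TEXT, factura INTEGER, folio INTEGER);
    -- Invented data: one payment for invoice A-1 ...
    INSERT INTO chequefactura VALUES ('A', 1, 100.0);
    -- ... and three movements billed on that same invoice.
    INSERT INTO movsgas VALUES ('A', 1, 10), ('A', 1, 11), ('A', 1, 12);
""")

# Direct join: the single payment row is repeated once per movement.
naive = conn.execute("""
    SELECT c.serie, c.factura, sum(c.abono), count(c.abono)
    FROM chequefactura c
    JOIN movsgas m ON c.serie = m.serie AND c.factura = m.factura
    GROUP BY c.serie, c.factura
""").fetchall()

# Reduce movsgas to one row per invoice first, then join (the r2 idea).
dedup = conn.execute("""
    WITH r2 AS (SELECT DISTINCT serie, factura FROM movsgas)
    SELECT c.serie, c.factura, sum(c.abono), count(c.abono)
    FROM chequefactura c
    JOIN r2 r ON r.serie = c.serie AND r.factura = c.factura
    GROUP BY c.serie, c.factura
""").fetchall()

print(naive)  # payment counted three times
print(dedup)  # payment counted once
```

If this is what happens in the real data, p1 may be correct only because (pventa, folio) happens to identify a single movsgas row, which would be worth checking.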

Related

groupingBy a join in KTORM

I want to join an N x M relation, returning all rows of N together with a count of how many matching rows of M there are:
SELECT N.*, COUNT(M.id) FROM N LEFT JOIN M ON N.id = M.n_id GROUP BY N.id
but I don't see how. Joining and aggregation seem to exclude each other, or are not implemented, or not documented (or I didn't find it).

POSTGRESQL : Combining two query result with different columns but same number of rows

I'm currently trying to combine the results of 2 queries into one.
These 2 queries return the same number of rows and are grouped by the same field, but have different columns.
This works:
SELECT Distinct
MAX(d.libelle) AS libelle_dpt,
MAX(d.code_dpt) AS code_dep,
MAX(r.libelle) AS libelle_region,
MAX(pd.pop_dep) AS nb_habitants,MAX(ls.nb_canton) AS nb_canton
FROM election_2015.commune c
LEFT JOIN (
SELECT MAX(c.code_canton) AS code_canton,Count(distinct c.code_canton) AS nb_canton
FROM election_2015.commune co
JOIN election_2015.departement d
ON co.code_dpt = d.code_dpt
JOIN election_2015.canton c
ON c.code_canton = co.code_canton
GROUP BY d.code_dpt
ORDER BY d.code_dpt ASC
) ls
ON ls.code_canton = c.code_canton
JOIN election_2015.departement d
ON d.code_dpt = c.code_dpt
JOIN election_2015.region r
ON d.code_region = r.code_region
JOIN election.popgent_all pd
ON pd.dep = d.code_dpt
GROUP BY d.code_dpt
But I was wondering if there is another way to do this, maybe something like a union, but matching row by row?
Something like this (not working because the queries don't have the same number of columns):
SELECT Distinct
MAX(d.libelle) AS libelle_dpt,
MAX(d.id) AS id_dep,MAX(d.code_dpt) AS code_dep,
MAX(r.libelle) AS libelle_region,
MAX(pd.pop_dep) AS nb_habitants
FROM election_2015.commune c
LEFT JOIN election_2015.departement d
ON d.code_dpt = c.code_dpt
LEFT JOIN election_2015.region r
ON d.code_region = r.code_region
LEFT JOIN election.popgent_all pd
ON pd.dep = d.code_dpt
GROUP BY d.code_dpt
UNION
SELECT Count(distinct c.code_canton) AS nb_canton FROM election_2015.commune co
JOIN election_2015.departement d
ON co.code_dpt = d.code_dpt
JOIN election_2015.canton c
ON c.code_canton = co.code_canton
GROUP BY d.code_dpt
ORDER BY election_2015.departement.code_dpt ASC
Thanks for any help.
Alexandre
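
For what it's worth, UNION stacks rows and therefore needs the same column list on both sides; to put the columns of two grouped results side by side, the usual tool is a join on the grouping key. A minimal sketch on SQLite, assuming cut-down versions of the departement and commune tables with invented sample rows (the real schema has more columns):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of the tables in the question.
    CREATE TABLE departement (code_dpt TEXT, libelle TEXT);
    CREATE TABLE commune (code_dpt TEXT, code_canton TEXT);
    INSERT INTO departement VALUES ('01', 'Ain'), ('02', 'Aisne');
    INSERT INTO commune VALUES
        ('01', 'c1'), ('01', 'c1'), ('01', 'c2'),
        ('02', 'c3');
""")

# Aggregate once per department in a derived table, then join it
# to the department row on the grouping key.
rows = conn.execute("""
    SELECT d.code_dpt, d.libelle, k.nb_canton
    FROM departement d
    JOIN (
        SELECT code_dpt, count(DISTINCT code_canton) AS nb_canton
        FROM commune
        GROUP BY code_dpt
    ) k ON k.code_dpt = d.code_dpt
    ORDER BY d.code_dpt
""").fetchall()

print(rows)
```

Because the derived table already has one row per code_dpt, the outer query needs neither GROUP BY nor the MAX() wrappers.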

Rewrite Query with Cartesian product

I am not a developer, so I need help rewriting this query without an inner join, as I think that is the core issue. It runs for over 20 seconds, while smaller chunks run in under a second. Please help.
select a.compID, b.InitialRT, b.VwRgts, b.UpdRgts, b.InsRgts, b.delRgts, b.Sscrnum , c.UserID
from tablecmpy a, tbldetrght b (nolock)
inner join tableuser c (nolock) on c.GroupID = b.UserId
where b.RecType='G'
and b.compID='[ALL]'
and b.InitialRT+b.VwRgts+b.UpdRgts+b.InsRgts+b.delRgts > 0
You already have a Cartesian product here, from the old-style comma join in from tablecmpy a, tbldetrght b (nolock). Try this instead:
SELECT a.compID,
b.InitialRT,
b.VwRgts,
b.UpdRgts,
b.InsRgts,
b.delRgts,
b.Sscrnum,
c.UserID
FROM tablecmpy a (nolock)
CROSS JOIN tbldetrght b (nolock)
INNER JOIN tableuser c (nolock)
ON c.GroupID = b.UserId
WHERE b.RecType='G'
AND b.compID='[ALL]'
AND COALESCE(b.InitialRT,0) +
COALESCE(b.VwRgts,0) +
COALESCE(b.UpdRgts,0) +
COALESCE(b.InsRgts,0) +
COALESCE(b.delRgts,0) > 0
I don't know whether the b columns can contain NULL, so I added COALESCE to handle that case.
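
To illustrate the two points in this answer, here is a small sketch. It runs on SQLite rather than SQL Server (so no nolock hints), and the tables rights, a and b are hypothetical stand-ins with invented rows. It shows that a comma join and CROSS JOIN return the same Cartesian product, and that adding NULL to a number yields NULL, which silently drops the row from a "> 0" filter unless COALESCE is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for tbldetrght with two of the rights columns.
    CREATE TABLE rights (id INTEGER, VwRgts INTEGER, UpdRgts INTEGER);
    INSERT INTO rights VALUES (1, 1, 0), (2, 1, NULL), (3, 0, 0);
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (y INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1), (2), (3);
""")

# 1 + NULL is NULL, and NULL > 0 is not true, so row 2 is filtered out.
plain = [r[0] for r in conn.execute(
    "SELECT id FROM rights WHERE VwRgts + UpdRgts > 0 ORDER BY id")]

# COALESCE turns NULL into 0 before the addition, so row 2 survives.
safe = [r[0] for r in conn.execute(
    "SELECT id FROM rights "
    "WHERE COALESCE(VwRgts, 0) + COALESCE(UpdRgts, 0) > 0 ORDER BY id")]

# Old-style comma join and explicit CROSS JOIN give the same row count.
comma = conn.execute("SELECT count(*) FROM a, b").fetchone()[0]
cross = conn.execute("SELECT count(*) FROM a CROSS JOIN b").fetchone()[0]

print(plain, safe, comma, cross)
```

Since both join spellings return the same rows, the rewrite mainly makes the Cartesian product explicit; the row count itself is unchanged.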

how to solve this complicated sql query

These are the five given tables:
http://i58.tinypic.com/53wcxe.jpg
This is the expected result:
http://i58.tinypic.com/2vsrts7.jpg
Please help: how can I write a query that produces this result?
I have no idea how.
SELECT K.* , COUNT (A.Au_ID) AS AnzahlAuftr
FROM Kunde K
LEFT JOIN Auftrag A ON K.Kd_ID = A.Au_Kd_ID
GROUP BY K.Kd_ID,K.Kd_Firma,K.Kd_Strasse,K.Kd_PLZ,K.Kd_Ort
ORDER BY K.Kd_PLZ DESC;
SELECT COUNT (F.F_ID) AS AnzahlFahrt
FROM Fahrten F
RIGHT JOIN Auftrag A ON A.Au_ID = F.F_Au_ID
SELECT SUM (T.Ts_Strecke) AS SumStrecke
FROM Teilstrecke T
LEFT JOIN Fahrten F ON F.F_ID = T.Ts_F_ID
How can I join these 3 into one?
Grouping on Strasse etc. is not necessary and can be quite expensive. What about this approach:
SELECT K.*, ISNULL(Au.AnzahlAuftr,0) AS AnzahlAuftr, ISNULL(Au.AnzahlFahrt,0) AS AnzahlFahrt, ISNULL(Au.SumStrecke,0) AS SumStrecke
FROM Kunde K
LEFT OUTER JOIN
(SELECT A.Au_Kd_ID, COUNT(*) AS AnzahlAuftr, SUM(Fa.AnzahlFahrt1) AS AnzahlFahrt, SUM(Fa.SumStrecke2) AS SumStrecke
FROM Auftrag A LEFT OUTER JOIN
(SELECT F.F_Au_ID, COUNT(*) AS AnzahlFahrt1, SUM(Ts.SumStrecke1) AS SumStrecke2
FROM Fahrten F LEFT OUTER JOIN
(SELECT T.Ts_F_ID, SUM(T.Ts_Strecke) AS SumStrecke1
FROM Teilstrecke T
GROUP BY T.Ts_F_ID) AS Ts
ON Ts.Ts_F_ID = F.F_ID
GROUP BY F.F_Au_ID) AS Fa
ON Fa.F_Au_ID = A.Au_ID
GROUP BY A.Au_Kd_ID) AS Au
ON Au.Au_Kd_ID = K.Kd_ID

avoid multiple union all in cte

Hi, I have a CTE with 5 inner joins and a where clause whose offset decreases by one in each branch.
The sample code looks like the one below, but the actual code has more complex logic.
;With CTE_EG AS
(
select *,
-1 as offset from a
inner join a1 on a1.id=a.id
inner join a2 on a1.id=a2.id
inner join a3 on a1.id=a3.id
where a1.offset = a2.quarter-1
union all
select *,
-2 as offset from a
inner join a1 on a1.id=a.id
inner join a2 on a1.id=a2.id
inner join a3 on a1.id=a3.id
where a1.offset = a2.quarter-2
union all
...
)
This repeats down to offset -4 and a1.offset = a2.quarter-4.
How can I avoid repeating the same code so many times when only one value in the where clause changes? The actual query has 5 inner joins and 5 union all in total.
I cannot remove the union all, because that would cause discrepancies in some calculations.
I want something where we pass an integer value n and the selects between the union alls are repeated with a changing where clause, for example from a1.offset = a2.quarter-2 to a1.offset = a2.quarter-n.
Please suggest.
This should just be:
;With Numbers(n) as (
select 1 union all select 2 union all
select 3 union all select 4
), CTE_EG AS
(
select *,
-n as offset from a
inner join a1 on a1.id=a.id
inner join a2 on a1.id=a2.id
inner join a3 on a1.id=a3.id
inner join numbers n on a1.offset = a2.quarter-n
)
I don't understand your point about not being able to remove the UNION ALL.
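
A quick way to convince yourself that the numbers join returns the same rows as the chain of UNION ALL branches is to run both on a toy dataset. The sketch below does that on SQLite with only a1 and a2, invented rows, and the offset column renamed to off to sidestep the OFFSET keyword; it is an illustration under those assumptions, not the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data; 'off' stands in for the 'offset' column.
    CREATE TABLE a1 (id INTEGER, off INTEGER);
    CREATE TABLE a2 (id INTEGER, quarter INTEGER);
    INSERT INTO a1 VALUES (1, 3), (2, 1), (3, 4);
    INSERT INTO a2 VALUES (1, 4), (2, 4), (3, 4);
""")

# Original shape: one branch per offset, glued together with UNION ALL.
branch = """
    SELECT a1.id, -{n} AS shift
    FROM a1 JOIN a2 ON a1.id = a2.id
    WHERE a1.off = a2.quarter - {n}
"""
union_sql = " UNION ALL ".join(branch.format(n=n) for n in range(1, 5))
union_rows = conn.execute(union_sql).fetchall()

# Rewritten shape: a single select joined to a small numbers table.
numbers_rows = conn.execute("""
    WITH numbers(n) AS (
        SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
    )
    SELECT a1.id, -numbers.n AS shift
    FROM a1
    JOIN a2 ON a1.id = a2.id
    JOIN numbers ON a1.off = a2.quarter - numbers.n
""").fetchall()

print(sorted(union_rows))
print(sorted(numbers_rows))
```

A row that satisfies the condition for several values of n appears once per value in both versions, so the join keeps the UNION ALL semantics, duplicates included.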