Why using COUNT with SELF JOIN gives different result value - tsql

Can somebody explain why using a self join with COUNT gives me a different result than using COUNT alone?
Both queries use the same table, with a ControlNo column. The values in that column are NOT unique.
This query gives me a total count of 15586.
select (Select COUNT(ControlNo)
from tblQuotes Q1
where Q1.ControlNo = a.ControlNo
) QuotedTotal
FROM tblQuotes a
inner join lstlines l on a.LineGUID = l.LineGUID
where l.LineName = 'EARTHQUAKE' AND YEAR(EffectiveDate) = 2016
But if I run this query, it gives me a total count of 15095.
select COUNT(ControlNo) as QuotedTotal
from tblQuotes a
inner join lstlines l on a.LineGUID = l.LineGUID
where l.LineName = 'EARTHQUAKE' AND YEAR(EffectiveDate) = 2016
What exactly is changing the total, and why?
And why would I use the first approach?
And is there any way to modify the first query to get the sum of 15586 without breaking it down by row?
Thank you

It seems to be because the ControlNo field is not unique: some records share the same value, but not all of them join against the lstlines table under that condition. So basically your last query does:
SELECT COUNT(a.ControlNo)
FROM lstlines l
INNER JOIN tblQuotes a ON a.LineGUID = l.LineGUID
WHERE l.LineName = 'EARTHQUAKE' AND YEAR(EffectiveDate) = 2016
While the first one basically does:
SELECT COUNT(b.ControlNo)
FROM lstlines l
INNER JOIN tblQuotes a ON a.LineGUID = l.LineGUID
INNER JOIN tblQuotes b ON a.ControlNo = b.ControlNo
WHERE l.LineName = 'EARTHQUAKE' AND YEAR(EffectiveDate) = 2016
As you can see, this second form counts not only the rows that match your lstlines table, but also every row in tblQuotes that has the same ControlNo as a matching row, whether or not that row itself matches against lstlines.
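To see the effect on a small scale, here is a minimal sketch in Python using the standard sqlite3 module. The rows are made up for illustration; only the table and column names come from the question. SQLite has no YEAR() function, so strftime is used in its place. The last query shows one possible way to get the larger total as a single number: wrap the per-row counts in a derived table and SUM them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lstlines (LineGUID INTEGER, LineName TEXT);
    CREATE TABLE tblQuotes (ControlNo INTEGER, LineGUID INTEGER, EffectiveDate TEXT);
    INSERT INTO lstlines VALUES (1, 'EARTHQUAKE'), (2, 'FLOOD');
    -- Made-up data: ControlNo 100 appears twice, but only one of
    -- those rows belongs to the EARTHQUAKE line.
    INSERT INTO tblQuotes VALUES
        (100, 1, '2016-03-01'),
        (100, 2, '2016-03-01'),
        (200, 1, '2016-05-10');
""")

# Shared FROM/WHERE part (strftime stands in for T-SQL's YEAR()).
filter_sql = """
    FROM tblQuotes a
    INNER JOIN lstlines l ON a.LineGUID = l.LineGUID
    WHERE l.LineName = 'EARTHQUAKE'
      AND strftime('%Y', a.EffectiveDate) = '2016'
"""

# Counts only the rows that pass the filter.
plain_count = conn.execute(
    "SELECT COUNT(a.ControlNo) " + filter_sql
).fetchone()[0]

# Per filtered row, counts every tblQuotes row sharing its ControlNo,
# then adds those per-row counts together.
summed = conn.execute("""
    SELECT SUM(QuotedTotal)
    FROM (
        SELECT (SELECT COUNT(q1.ControlNo)
                FROM tblQuotes q1
                WHERE q1.ControlNo = a.ControlNo) AS QuotedTotal
""" + filter_sql + ") AS t").fetchone()[0]

print(plain_count, summed)
```

With this data the plain count is 2, while the summed correlated count is 3, because the FLOOD row with ControlNo 100 is counted too.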

Related

How can I list other matching values even if there is an unmatched value in the query?

In my query there is a value that does not match any row in the demand category table. Because that one value does not match, the other matching values do not appear in the output either.
How can I list the other matching values even if one value in the query has no match?
process Table
fk_unit_id fk_unit_position fk_demand_category
1 2 1
unit table
unit_id
1
unit_position table
unit_position
2
demand_category table
demand_category
1
Query:
SELECT unit_name,unit_position_name,demand_category_name From process
INNER JOIN unit ON process.fk_unit_id = unit_id and unit_id =1
INNER JOIN unit_position ON process.fk_unit_position_id = unit_position_id and unit_position_id = 2
INNER JOIN demand_category ON process.fk_demand_category_id = demand_category_id and demand_category_id =0 ;
Switch the INNER JOIN on demand_category to a LEFT JOIN.
A LEFT JOIN returns all records from the left table together with the related records from the right table. If there is no related record, any columns you selected from the right table will contain NULL.
SELECT unit_name,unit_position_name,demand_category_name From process
INNER JOIN unit ON process.fk_unit_id = unit_id and unit_id =1
INNER JOIN unit_position ON process.fk_unit_position_id = unit_position_id and unit_position_id = 2
LEFT JOIN demand_category ON process.fk_demand_category_id = demand_category_id and demand_category_id =0 ;
You can use an outer join to keep the rows that don't match; the corresponding columns from the other table will simply be padded with NULL. Another way is to use the IN operator, but that may give slower query performance.
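A small runnable sketch of the difference, using Python's standard sqlite3 module. The names (unit_name, demand_category_name) and the sample values 'Unit A' and 'Category A' are assumptions for illustration, and the unit_position table is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE unit (unit_id INTEGER, unit_name TEXT);
    CREATE TABLE demand_category (demand_category_id INTEGER, demand_category_name TEXT);
    CREATE TABLE process (fk_unit_id INTEGER, fk_demand_category_id INTEGER);
    INSERT INTO unit VALUES (1, 'Unit A');
    INSERT INTO demand_category VALUES (1, 'Category A');
    INSERT INTO process VALUES (1, 1);
""")

# Same query twice; only the join type on demand_category changes.
# demand_category_id = 0 matches nothing, as in the question.
query = """
    SELECT unit_name, demand_category_name
    FROM process
    INNER JOIN unit ON process.fk_unit_id = unit_id AND unit_id = 1
    {join} JOIN demand_category
        ON process.fk_demand_category_id = demand_category_id
       AND demand_category_id = 0
"""

inner_rows = conn.execute(query.format(join="INNER")).fetchall()
left_rows = conn.execute(query.format(join="LEFT")).fetchall()

print(inner_rows)  # no rows: the unmatched join removes everything
print(left_rows)   # the unit is kept, the category column is NULL (None)
```

The INNER JOIN version returns nothing, while the LEFT JOIN version still returns the unit name with None in the category column.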

How to count total number of records after join the three tables in postgresql?

I have a query that returns 12408 records in total, but I want it to give me the total number of records as a count column.
select
c.complaint_id,c.server_time,c.completion_date,c.road_id,c.photo,c.dept_code,c.dist_code,c.eng_userid,c.feedback_type,c.status,p.dist_name,p.road_name,p.road_dept,e.display_name,e.mobile
from complaints as c INNER JOIN pwd_roads as p ON p.road_id=c.road_id
INNER JOIN enc_details as e ON CAST(e.enc_code as INTEGER) = p.enccode
where c.complaint_id=c.parent_complaint_id and c.dept_code='PWDBnR'
and c.server_time between '2018-09-03' and '2018-12-19'
You can solve this using window functions. For example, if you want your first column to be a count of the total rows returned by the SELECT statement:
select count(1) over(range between unbounded preceding and unbounded following) as total_row_count
, c.complaint_id,c.server_time,c.completion_date,c.road_id,c.photo,c.dept_code,c.dist_code,c.eng_userid,c.feedback_type,c.status,p.dist_name,p.road_name,p.road_dept,e.display_name,e.mobile
from complaints as c INNER JOIN pwd_roads as p ON p.road_id=c.road_id
INNER JOIN enc_details as e ON CAST(e.enc_code as INTEGER) = p.enccode
where c.complaint_id=c.parent_complaint_id and c.dept_code='PWDBnR'
and c.server_time between '2018-09-03' and '2018-12-19'
Note that the window function is evaluated before the LIMIT clause if one is used, so if you were to add LIMIT 100 to the query it might give a row count greater than 100 even though a max of 100 rows would be returned.
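The LIMIT behaviour can be checked with a tiny sketch. It uses Python's standard sqlite3 module rather than PostgreSQL, assumes an SQLite build with window function support (3.25 or later), and uses a made-up single-column complaints table. COUNT(*) OVER () is used here as a shorter spelling of a count over the whole result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE complaints (complaint_id INTEGER)")
conn.executemany(
    "INSERT INTO complaints VALUES (?)", [(i,) for i in range(1, 6)]
)

# The window count is computed over all 5 rows before LIMIT trims the output.
rows = conn.execute("""
    SELECT COUNT(*) OVER () AS total_row_count, complaint_id
    FROM complaints
    ORDER BY complaint_id
    LIMIT 2
""").fetchall()

print(rows)
```

Only two rows come back, yet each of them reports a total_row_count of 5.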
The easiest, though not very elegant, way to do this is to wrap the query in a subquery. PostgreSQL requires an alias on a subquery in FROM, at least in versions before 16:
select count(*)
from
(
select c.complaint_id,c.server_time,c.completion_date,c.road_id,c.photo,c.dept_code,c.dist_code,c.eng_userid,c.feedback_type,c.status,p.dist_name,p.road_name,p.road_dept,e.display_name,e.mobile
from complaints as c INNER JOIN pwd_roads as p ON p.road_id=c.road_id
INNER JOIN enc_details as e ON CAST(e.enc_code as INTEGER) = p.enccode
where c.complaint_id=c.parent_complaint_id and c.dept_code='PWDBnR'
and c.server_time between '2018-09-03' and '2018-12-19'
) as sub

Avoid duplication in SQL Server

I get the result below when I run this query.
SELECT DISTINCT PT.F_PRO AS F_PRODUCT, PT.F_TEXT_CODE AS F_TEXT_CODE, PHT.F_PHRASE AS F_PHRASE FROM T_PROD_TEXT PT
LEFT JOIN T_P_LINKAGE PHL
ON PT.F_TEXT_CODE = PHL.F_TEXT_CODE
INNER JOIN T_P_TRANSLATIONS PHT
ON PHL.F_PHRASE_ID = PHT.F_PHRASE_ID
WHERE PT.F_DATA_CODE = 'MANU' AND PHT.F_LANGUAGE = 'EN'
OUTPUT
F_PRODUCT F_TEXT_CODE F_PHRASE
294264_B MANU0008 Alcoa, Inc
294264_B MANU0012 BioSensory
00091A MANU0006 3M Company
00094A MANU0006 4M Company
00094A MANU0006 5M Company
The above query returns duplicates in the F_PRODUCT column. I want to display F_PRODUCT without duplication: only one record (the first one) should be displayed for each F_PRODUCT, without using the TOP command.
Required Output
F_PRODUCT F_TEXT_CODE F_PHRASE
294264_B MANU0008 Alcoa, Inc.
00091A MANU0006 3M Company
You can use row_number() to assign a number to each row within a group of f_pro. Then retrieve only rows that are number 1. You can change the order by if something else determines the order.
SELECT *
FROM
(SELECT PT.F_PRO AS F_PRODUCT, PT.F_TEXT_CODE AS F_TEXT_CODE, PHT.F_PHRASE AS F_PHRASE, ROW_NUMBER() OVER (PARTITION BY PT.F_PRO ORDER BY PHT.F_PHRASE ASC) AS RowNum
FROM T_PROD_TEXT PT
LEFT JOIN T_P_LINKAGE PHL
ON PT.F_TEXT_CODE = PHL.F_TEXT_CODE
INNER JOIN T_P_TRANSLATIONS PHT
ON PHL.F_PHRASE_ID = PHT.F_PHRASE_ID
WHERE PT.F_DATA_CODE = 'MANU' AND PHT.F_LANGUAGE = 'EN') dt
WHERE RowNum = 1
SELECT PT.F_PRO AS F_PRODUCT,
MIN(PT.F_TEXT_CODE) AS F_TEXT_CODE,
MIN(PHT.F_PHRASE) AS F_PHRASE FROM T_PROD_TEXT PT
LEFT JOIN T_P_LINKAGE PHL
ON PT.F_TEXT_CODE = PHL.F_TEXT_CODE
INNER JOIN T_P_TRANSLATIONS PHT
ON PHL.F_PHRASE_ID = PHT.F_PHRASE_ID
WHERE PT.F_DATA_CODE = 'MANU' AND PHT.F_LANGUAGE = 'EN'
group By PT.F_PRO;
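is one way to do that. It does not return the "first" record, since it is unclear how you would define "first". Note that each MIN is computed independently, so the values in one result row can come from different source rows.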
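The difference between the two answers can be sketched with Python's standard sqlite3 module (assuming SQLite 3.25 or later for ROW_NUMBER). The single table named joined is a hypothetical stand-in for the result of the three-table join, and its rows are made up so that the two approaches disagree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the joined result set.
    CREATE TABLE joined (F_PRODUCT TEXT, F_TEXT_CODE TEXT, F_PHRASE TEXT);
    INSERT INTO joined VALUES
        ('P1', 'MANU0008', 'Zeta Corp'),
        ('P1', 'MANU0012', 'Alpha Corp'),
        ('P2', 'MANU0006', '3M Company');
""")

# ROW_NUMBER keeps one whole row per product (here: first by F_PHRASE).
row_number_rows = conn.execute("""
    SELECT F_PRODUCT, F_TEXT_CODE, F_PHRASE
    FROM (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY F_PRODUCT ORDER BY F_PHRASE
        ) AS RowNum
        FROM joined
    ) AS dt
    WHERE RowNum = 1
    ORDER BY F_PRODUCT
""").fetchall()

# MIN per column can combine values from different rows.
min_rows = conn.execute("""
    SELECT F_PRODUCT, MIN(F_TEXT_CODE), MIN(F_PHRASE)
    FROM joined
    GROUP BY F_PRODUCT
    ORDER BY F_PRODUCT
""").fetchall()

print(row_number_rows)
print(min_rows)
```

For product P1 the ROW_NUMBER version returns the actual row (MANU0012, Alpha Corp), while the MIN version pairs MANU0008 with Alpha Corp, a combination that does not exist in the data.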

TSQL efficiency - INNER JOIN replaced by EXISTS

Can the following be rewritten to be more efficient?
I would use EXISTS if I didn't need fields from country but I do need those fields, and am not sure how to write this to make it more efficient.
SELECT distinct
p.ProvinceID,
p.Abbv as RegionCode,
p.name as RegionName,
cn.Code as CountryCode,
cn.Name as CountryName
FROM dbo.provinces AS p
INNER JOIN dbo.Countries AS cn ON p.CountryID = cn.CountryID
INNER JOIN dbo.Cities c on c.ProvinceID = p.ProvinceID
INNER JOIN dbo.Listings AS l ON l.CityID = c.CityID
WHERE l.IsActive = 1 AND l.IsApproved = 1
There are two things to note:
You're joining to dbo.Listings which results in many records, so you need to use DISTINCT (usually an expensive operator)
For any tables with columns not in the select you can move into an EXISTS (but the query planner effectively does this for you anyway)
So try this:
SELECT
p.ProvinceID,
p.Abbv as RegionCode,
p.name as RegionName,
cn.Code as CountryCode,
cn.Name as CountryName
FROM dbo.provinces AS p
INNER JOIN
dbo.Countries AS cn
ON p.CountryID = cn.CountryID
WHERE EXISTS (SELECT 1 FROM
dbo.Listings l
INNER JOIN dbo.Cities c
on l.CityID = c.CityID
WHERE c.ProvinceID = p.ProvinceID
AND l.IsActive = 1 AND l.IsApproved = 1
)
Check the query plans before and after. The query planner might be smart enough to do this anyway, but at least you have removed your DISTINCT.
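Why the EXISTS form no longer needs DISTINCT can be seen in a reduced sketch, run here with Python's standard sqlite3 module instead of SQL Server. The Countries table is omitted and the sample rows are made up; this only illustrates the row counts, not performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provinces (ProvinceID INTEGER, name TEXT);
    CREATE TABLE Cities (CityID INTEGER, ProvinceID INTEGER);
    CREATE TABLE Listings (CityID INTEGER, IsActive INTEGER, IsApproved INTEGER);
    INSERT INTO provinces VALUES (1, 'Ontario'), (2, 'Quebec');
    INSERT INTO Cities VALUES (10, 1), (20, 2);
    -- Province 1 has two qualifying listings; province 2 has none.
    INSERT INTO Listings VALUES (10, 1, 1), (10, 1, 1), (20, 0, 1);
""")

# Joining to Listings yields one output row per qualifying listing.
join_rows = conn.execute("""
    SELECT p.ProvinceID, p.name
    FROM provinces p
    INNER JOIN Cities c ON c.ProvinceID = p.ProvinceID
    INNER JOIN Listings l ON l.CityID = c.CityID
    WHERE l.IsActive = 1 AND l.IsApproved = 1
""").fetchall()

# EXISTS only tests for a match, so each province appears at most once.
exists_rows = conn.execute("""
    SELECT p.ProvinceID, p.name
    FROM provinces p
    WHERE EXISTS (
        SELECT 1
        FROM Listings l
        INNER JOIN Cities c ON l.CityID = c.CityID
        WHERE c.ProvinceID = p.ProvinceID
          AND l.IsActive = 1 AND l.IsApproved = 1
    )
""").fetchall()

print(join_rows)
print(exists_rows)
```

The join returns Ontario twice (once per listing), whereas the EXISTS query returns it once without any DISTINCT.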
The following may perform even better, because it gives the optimizer more useful information:
SELECT
p.ProvinceID,
p.Abbv as RegionCode,
p.name as RegionName,
cn.Code as CountryCode,
cn.Name as CountryName
FROM dbo.provinces AS p
INNER JOIN
dbo.Countries AS cn
ON p.CountryID = cn.CountryID
INNER JOIN (
SELECT
c.ProvinceID
FROM
dbo.Listings l
INNER JOIN dbo.Cities c
on l.CityID = c.CityID
WHERE l.IsActive = 1 AND l.IsApproved = 1
GROUP BY
c.ProvinceID
) list
on list.ProvinceID = p.ProvinceID

TSQL Msg 1013 "Use correlation names to distinguish them."

I have looked through many suggestions over the last two hours and still can't figure out how to solve this one.
SET DATEFORMAT DMY
DECLARE @Source DATETIME = '01/01/2001'
DECLARE @Destenaition DATETIME = '01/01/2020'
SELECT ST.[Group],
ST.Shop,
SUM(ST.Purchased) AS Total,
CHG.Charged
FROM (SELECT Personals.Groups.[Name] AS 'Group',
Cards.vPurchases.PersonalID,
Personals.Registry.[Name],
SUM(Cards.vPurchases.Ammont) AS Purchased,
Cards.vPurchases.ShopName AS Shop
FROM Cards.vPurchases
INNER JOIN Personals.Registry
ON Personals.Registry.Id = Cards.vPurchases.PersonalID
INNER JOIN Personals.Groups
ON Personals.Registry.[Group] = Personals.Groups.Id
INNER JOIN Personals.Groups
ON Personals.Groups.Id = CHG.GroupID
WHERE Cards.vPurchases.[TimeStamp] >= @Source
AND Cards.vPurchases.[TimeStamp] <= @Destenaition
GROUP BY Cards.vPurchases.PersonalID,
Personals.Registry.[Name],
Personals.Groups.[Name],
Cards.vPurchases.ShopName) ST,
(SELECT PG.Id AS GroupID,
SUM(Cards.vCharges.Amount) AS Charged
FROM Cards.vCharges
INNER JOIN Personals.Registry
ON Personals.Registry.Id = Cards.vCharges.PersonalID
INNER JOIN Personals.Groups AS PG
ON Personals.Registry.[Group] = PG.Id
WHERE Cards.vCharges.[TimeStamp] >= @Source
AND Cards.vCharges.[TimeStamp] <= @Destenaition
GROUP BY Personals.Groups.[Name]) AS CHG
GROUP BY ST.Shop,
ST.[Group]
And then I get this error:
Msg 1013, Level 16, State 1, Line 6 The objects "Personals.Groups" and
"Personals.Groups" in the FROM clause have the same exposed names. Use
correlation names to distinguish them.
Thanks.
You are using the table Personals.Groups twice in the first subquery.
If you really mean to join Personals.Groups twice, you need to give each occurrence its own alias and then use those aliases instead of the table name in the rest of the query.
INNER JOIN Personals.Groups as PG1
and
INNER JOIN Personals.Groups as PG2
If you only need to join it once, you can combine the ON clauses into a single join instead.
INNER JOIN Personals.Groups
ON Personals.Registry.[Group] = Personals.Groups.Id and
Personals.Groups.Id = CHG.GroupID
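As a rough illustration of the aliasing approach, here is a sketch using Python's standard sqlite3 module rather than SQL Server. The tables PersonGroups and Registry, the ParentGroupId column, and the sample values are all hypothetical; the point is only that each occurrence of the same table gets its own correlation name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: a registry row references the groups table twice.
    CREATE TABLE PersonGroups (Id INTEGER, Name TEXT);
    CREATE TABLE Registry (Id INTEGER, GroupId INTEGER, ParentGroupId INTEGER);
    INSERT INTO PersonGroups VALUES (1, 'Staff'), (2, 'Company');
    INSERT INTO Registry VALUES (7, 1, 2);
""")

# The same table is joined twice; the aliases g1 and g2 keep the
# two occurrences apart in the ON clauses and the select list.
rows = conn.execute("""
    SELECT r.Id, g1.Name, g2.Name
    FROM Registry r
    INNER JOIN PersonGroups AS g1 ON r.GroupId = g1.Id
    INNER JOIN PersonGroups AS g2 ON r.ParentGroupId = g2.Id
""").fetchall()

print(rows)
```

Each alias behaves like a separate copy of the table, so g1.Name and g2.Name can resolve to different rows.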