Postgres join not respecting outer where clause - postgresql

In SQL Server, I know for sure that the following query;
SELECT things.*
FROM things
LEFT OUTER JOIN (
SELECT thingreadings.thingid, reading
FROM thingreadings
INNER JOIN things on thingreadings.thingid = things.id
ORDER BY reading DESC LIMIT 1) AS readings
ON things.id = readings.thingid
WHERE things.id = '1'
Would join against thingreadings only once the WHERE id = 1 had restricted the record set down. It left joins against just one row. However in order for performance to be acceptable in postgres, I have to add the WHERE id= 1 to the INNER JOIN things on thingreadings.thingid = things.id line too.
This isn't ideal; is it possible to force postgres to know that what I am joining against is only one row without explicitly adding the WHERE clauses everywhere?
An example of this problem can be seen here;
I am trying to recreate the following query in a more efficient way;
SELECT things.id, things.name,
(SELECT thingreadings.id FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1),
(SELECT thingreadings.reading FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1)
FROM things
WHERE id IN (1,2)
http://sqlfiddle.com/#!15/a172c/2

Not really sure why you did all that work. Isn't the inner query enough?
SELECT t.*
FROM thingreadings tr
INNER JOIN things t on tr.thingid = t.id AND t.id = '1'
ORDER BY tr.reading DESC
LIMIT 1;
sqlfiddle demo
When you want to select the latest value for each thingID, you can do:
SELECT t.*,a.reading
FROM things t
INNER JOIN (
SELECT t1.*
FROM thingreadings t1
LEFT JOIN thingreadings t2
ON (t1.thingid = t2.thingid AND t1.reading < t2.reading)
WHERE t2.thingid IS NULL
) a ON a.thingid = t.id
sqlfiddle demo
The derived table gets you the record with the most recent reading, then the JOIN gets you the information from things table for that record.

The where clause in SQL applies to the result set you're requesting, NOT to the join.
What your code is NOT saying: "do this join only for the ID of 1"...
What your code IS saying: "do this join, then pull records out of it where the ID is 1"...
This is why you need the inner where clause. Incidentally, I also think Filipe is right about the unnecessary code.

Related

How to get unique rows by one column but sort by the second

There is an example request in which there are several joins.
SELECT DISTINCT ON(a.id_1) 1, a.name, b.task, c.created_at
FROM a
INNER JOIN b ON a.id_2 = b.id
INNER JOIN c ON a.ID_2 = c.id
WHERE a.deleted_at IS NULL
ORDER BY a.id_1 desc
In this case, the query will work, sorting by unique values ​​of id_1 will take place. But I need to sort by the column a.name. In this case, postresql will swear with the words ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions.
The following query can serve as a solution to the problem:
SELECT *
FROM(
SELECT DISTINCT ON(a.id_1) a.name, b.task, c.created_at
FROM a
INNER JOIN b ON a.id_2 = b.id
INNER JOIN c ON a.ID_2 = c.id
WHERE a.deleted_at IS NULL
)
ORDER_BY a.name desc
But in reality the database is very large and such a query is not optimal. Are there other ways to sort by the selected column while keeping one uniqueness?

MariaDB - order by with more selects

I have this SQL:
select * from `posts`
where `posts`.`deleted_at` is null
and `expire_at` >= '2017-03-26 21:23:42.000000'
and (
select count(distinct tags.id) from `tags`
inner join `post_tag` on `tags`.`id` = `post_tag`.`tag_id`
where `post_tag`.`post_id` = `posts`.`id`
and (`tags`.`tag` like 'PHP' or `tags`.`tag` like 'pop' or `tags`.`tag` like 'UI')
) >= 1
Is it possible order the results by number of tags in posts?
Maybe add there alias?
Any information can help me.
Convert your correlated subquery into a join:
select p.*
from posts p
join (
select pt.post_id,
count(distinct t.id) as tag_count
from tags t
inner join post_tag pt on t.id = pt.tag_id
where t.tag in ('PHP', 'pop', 'UI')
group by pt.post_id
) pt on p.id = pt.post_id
where p.deleted_at is null
and p.expire_at >= '2017-03-26 21:23:42.000000'
order by pt.tag_count desc;
Also, note that I changed the bunch of like and or to single IN because you are not matching any pattern i.e. there is no % in the string. So, better using single IN instead.
Also, if you have defined your table names, column names etc keeping keywords etc in mind, you shouldn't have the need to use the backticks. They make reading a query difficult.

Get maximum value of an aggregate function

I want to only return the row where the count(object) is the highest, so I have written this query
select klantnr, count(objectnaam)
from klanten inner join deelnames using(klantnr)
inner join reizen using(reisnr)
inner join bezoeken using(reisnr)
where objectnaam = 'Maan'
group by klantnr
Now, I can't do
select max(count(objectnaam))
How would I go about solving this problem?
I have tried by using a subquery which is equally invalid
select max(select count(objectnaam) from ....)
I think I need a subquery in the from, so I have rewritten the query like this which I think is closer to the actual answer but still not right, as now it returns the maximum value of all rows.
select klantnr, max(c)
FROM(
select klantnr, count(objectnaam) as c
from klanten inner join deelnames using(klantnr)
inner join reizen using(reisnr)
inner join bezoeken using(reisnr)
where objectnaam = 'Maan'
group by klantnr) as F
group by klantnr
thanks for any help you can give me!
You do not provide the structure of tables, so probably you have to modify the following query. However it works just for PostgreSQL 9.x+
WITH t AS (
SELECT klantnr, COUNT(objectnaam) AS c
FROM klanten
WHERE objectnaam = 'Maan'
GROUP BY klantnr
ORDER BY c DESC
LIMIT 1
)
SELECT * FROM t
INNER JOIN deelnames USING(klantnr)
INNER JOIN reizen USING(reisnr)
INNER JOIN bezoeken USING(reisnr);
see http://www.postgresql.org/docs/9.3/static/queries-with.html how to use WITH QUERIES.
I have found a simpeler solution:
select klantnr,count (klantnr)
from bezoeken natural join deelnames
where objectnaam ='Maan'
group by klantnr
order by count desc
limit 1

How to design a SQL recursive query?

How would I redesign the below query so that it will recursively loop through entire tree to return all descendants from root to leaves? (I'm using SSMS 2008). We have a President at the root. under him are the VPs, then upper management, etc., on down the line. I need to return the names and titles of each. But this query shouldn't be hard-coded; I need to be able to run this for any selected employee, not just the president. This query below is the hard-coded approach.
select P.staff_name [Level1],
P.job_title [Level1 Title],
Q.license_number [License 1],
E.staff_name [Level2],
E.job_title [Level2 Title],
G.staff_name [Level3],
G.job_title [Level3 Title]
from staff_view A
left join staff_site_link_expanded_view P on P.people_id = A.people_id
left join staff_site_link_expanded_view E on E.people_id = C.people_id
left join staff_site_link_expanded_view G on G.people_id = F.people_id
left join facility_view Q on Q.group_profile_id = P.group_profile_id
Thank you, this was most closely matching what I needed. Here is my CTE query below:
with Employee_Hierarchy (staff_name, job_title, id_number, billing_staff_credentials_code, site_name, group_profile_id, license_number, region_description, people_id)
as
(
select C.staff_name, C.job_title, C.id_number, C.billing_staff_credentials_code, C.site_name, C.group_profile_id, Q.license_number, R.region_description, A.people_id
from staff_view A
left join staff_site_link_expanded_view C on C.people_id = A.people_id
left join facility_view Q on Q.group_profile_id = C.group_profile_id
left join regions R on R.regions_id = Q.regions_id
where A.last_name = 'kromer'
)
select C.staff_name, C.job_title, C.id_number, C.billing_staff_credentials_code, C.site_name, C.group_profile_id, Q.license_number, R.region_description, A.people_id
from staff_view A
left join staff_site_link_expanded_view C on C.people_id = A.people_id
left join facility_view Q on Q.group_profile_id = C.group_profile_id
left join regions R on R.regions_id = Q.regions_id
WHERE C.STAFF_NAME IS NOT NULL
GROUP BY C.STAFF_NAME, C.job_title, C.id_number, C.billing_staff_credentials_code, C.site_name, C.group_profile_id, Q.license_number, R.region_description, A.people_id
ORDER BY C.STAFF_NAME
But I am wondering what is the purpose of the "Employee_Hierarchy"? When I replaced "staff_view" in the outer query with "Employee_Hierarchy", it only returned one record = "Kromer". So when/where can we use "Employee_Hierarchy"?
See:
SQL Server - Simple example of a recursive CTE
MSDN: Recursive Queries using Common Table Expression
SQL Server recursive CTE (this seems pretty much like exactly what you are working on!)
Update:
A proper recursive CTE consist of basically three things:
an anchor SELECT to begin with; that can select e.g. the root level employees (where the Reports_To is NULL), or it can select any arbitrary employee that you define, e.g. by a parameter
a UNION ALL
a recursive SELECT statement that selects from the same, typically self-referencing table and joins with the recursive CTE being currently built up
This gives you the ability to recursively build up a result set that you can then select from.
If you look at the Northwind sample database, it has a table called Employees which is self-referencing: Employees.ReportsTo --> Employees.EmployeeID defines who reports to whom.
Your CTE would look something like this:
;WITH RecursiveCTE AS
(
-- anchor query; get the CEO
SELECT EmployeeID, FirstName, LastName, Title, 1 AS 'Level', ReportsTo
FROM dbo.Employees
WHERE ReportsTo IS NULL
UNION ALL
-- recursive part; select next Employees that have ReportsTo -> cte.EmployeeID
SELECT
e.EmployeeID, e.FirstName, e.LastName, e.Title,
cte.Level + 1 AS 'Level', e.ReportsTo
FROM
dbo.Employees e
INNER JOIN
RecursiveCTE cte ON e.ReportsTo = cte.EmployeeID
)
SELECT *
FROM RecursiveCTE
ORDER BY Level, LastName
I don't know if you can translate your sample to a proper recursive CTE - but that's basically the gist of it: anchor query, UNION ALL, recursive query

T-SQL Subquery on Latest Date with InnerJoin

I want to do a similiar thing like this guy:
T-SQL Subquery Max(Date) and Joins
I have to do this with an n:m relation.
So the layout is:
tbl_Opportunity
tbl_Opportunity_tbl_OpportunityData
tbl_OpportunityData
So as you see there is an intersection table which connects opportunity with opportunitydata.
For every opportunity there are multiple opportunity datas. In my view i only want a list with all opportunites and the data from the latest opportunity datas.
I tried something like this:
SELECT
dbo.tbl_Opportunity.Id, dbo.tbl_Opportunity.Subject,
dbo.tbl_User.UserName AS Responsible, dbo.tbl_Contact.Name AS Customer,
dbo.tbl_Opportunity.CreationDate, dbo.tbl_Opportunity.ActionDate AS [Planned Closure],
dbo.tbl_OpportunityData.Volume,
dbo.tbl_OpportunityData.ChangeDate, dbo.tbl_OpportunityData.Chance
FROM
dbo.tbl_Opportunity
INNER JOIN
dbo.tbl_User ON dbo.tbl_Opportunity.Creator = dbo.tbl_User.Id
INNER JOIN
dbo.tbl_Contact ON dbo.tbl_Opportunity.Customer = dbo.tbl_Contact.Id
INNER JOIN
dbo.tbl_Opprtnty_tbl_OpprtnityData ON dbo.tbl_Opportunity.Id = dbo.tbl_Opprtnty_tbl_OpprtnityData.Id
INNER JOIN
dbo.tbl_OpportunityData ON dbo.tbl_Opprtnty_tbl_OpprtnityData.Id2 = dbo.tbl_OpportunityData.Id
The problem is my view now includes a row for every opportunity data, since I don't know how to filter that I only want the latest data.
Can you help me? is my problem description clear enough?
thank you in advance :-)
best wishes,
laurin
; WITH Base AS (
SELECT dbo.tbl_Opportunity.Id, dbo.tbl_Opportunity.Subject, dbo.tbl_User.UserName AS Responsible, dbo.tbl_Contact.Name AS Customer,
dbo.tbl_Opportunity.CreationDate, dbo.tbl_Opportunity.ActionDate AS [Planned Closure], dbo.tbl_OpportunityData.Volume,
dbo.tbl_OpportunityData.ChangeDate, dbo.tbl_OpportunityData.Chance
FROM dbo.tbl_Opportunity INNER JOIN
dbo.tbl_User ON dbo.tbl_Opportunity.Creator = dbo.tbl_User.Id INNER JOIN
dbo.tbl_Contact ON dbo.tbl_Opportunity.Customer = dbo.tbl_Contact.Id INNER JOIN
dbo.tbl_Opprtnty_tbl_OpprtnityData ON dbo.tbl_Opportunity.Id = dbo.tbl_Opprtnty_tbl_OpprtnityData.Id INNER JOIN
dbo.tbl_OpportunityData ON dbo.tbl_Opprtnty_tbl_OpprtnityData.Id2 = dbo.tbl_OpportunityData.Id
)
, OrderedByDate AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Id ORDER BY ChangeDate DESC) RN FROM Base
)
SELECT * FROM OrderedByDate WHERE RN = 1
To make it more readable I'm using CTE (the WITH part). In the end the real "trick" is doing a ROW_NUMBER() partitioning the data by tbl_Opportunity.Id and ordering the partitions by ChangeDate DESC (and I call it RN). Clearly the maximum date in each partition will be RN = 1 and then we filter it by RN.
Without using CTE it will be something like this:
SELECT * FROM (
SELECT dbo.tbl_Opportunity.Id, dbo.tbl_Opportunity.Subject, dbo.tbl_User.UserName AS Responsible, dbo.tbl_Contact.Name AS Customer,
dbo.tbl_Opportunity.CreationDate, dbo.tbl_Opportunity.ActionDate AS [Planned Closure], dbo.tbl_OpportunityData.Volume,
dbo.tbl_OpportunityData.ChangeDate, dbo.tbl_OpportunityData.Chance,
ROW_NUMBER() OVER (PARTITION BY dbo.tbl_Opportunity.Id ORDER BY dbo.tbl_OpportunityData.ChangeDate DESC) RN
FROM dbo.tbl_Opportunity INNER JOIN
dbo.tbl_User ON dbo.tbl_Opportunity.Creator = dbo.tbl_User.Id INNER JOIN
dbo.tbl_Contact ON dbo.tbl_Opportunity.Customer = dbo.tbl_Contact.Id INNER JOIN
dbo.tbl_Opprtnty_tbl_OpprtnityData ON dbo.tbl_Opportunity.Id = dbo.tbl_Opprtnty_tbl_OpprtnityData.Id INNER JOIN
dbo.tbl_OpportunityData ON dbo.tbl_Opprtnty_tbl_OpprtnityData.Id2 = dbo.tbl_OpportunityData.Id
) AS Base WHERE RN = 1
The statement can be simplified for one more step further:
SELECT TOP 1 WITH TIES
dbo.tbl_Opportunity.Id, dbo.tbl_Opportunity.Subject, dbo.tbl_User.UserName AS Responsible,
dbo.tbl_Contact.Name AS Customer, dbo.tbl_Opportunity.CreationDate,
dbo.tbl_Opportunity.ActionDate AS [Planned Closure], dbo.tbl_OpportunityData.Volume,
dbo.tbl_OpportunityData.ChangeDate, dbo.tbl_OpportunityData.Chance
FROM
dbo.tbl_Opportunity INNER JOIN
dbo.tbl_User ON dbo.tbl_Opportunity.Creator = dbo.tbl_User.Id INNER JOIN
dbo.tbl_Contact ON dbo.tbl_Opportunity.Customer = dbo.tbl_Contact.Id INNER JOIN
dbo.tbl_Opprtnty_tbl_OpprtnityData ON dbo.tbl_Opportunity.Id = dbo.tbl_Opprtnty_tbl_OpprtnityData.Id INNER JOIN
dbo.tbl_OpportunityData ON dbo.tbl_Opprtnty_tbl_OpprtnityData.Id2 = dbo.tbl_OpportunityData.Id
ORDER BY
ROW_NUMBER() OVER (PARTITION BY dbo.tbl_Opportunity.Id ORDER BY dbo.tbl_OpportunityData.ChangeDate DESC);