T-SQL derived tables

I'm relatively new to using derived tables in the FROM/JOIN clause, as I always thought that joins would eliminate the need for these subqueries. My question is: when I write a subquery inside an inner join, do I need to include the joining column in the subquery's select list for the join to work? I know you don't usually have to do this in a normal join, but I wrote some SQL code that won't execute unless I include the joining column in the subquery's select list (pcc.category_id, highlighted in the code below). I've pasted the code below.
select pc.category_name
,product_name
,pp.list_price
,avg(quantity * (oi.list_price * (1-discount))) as Average_Revenue
,sum(quantity) as [products sold]
,sum(quantity * (oi.list_price * (1-discount))) as Revenue
,dt.Average_Category_Revenue
from production.categories as pc
inner join production.products as pp
on pc.category_id = pp.category_id
inner join sales.order_items as oi
on pp.product_id = oi.product_id
inner join (
select
category_name
,pcc.category_id -- the joining column in question
,avg(quantity * (oii.list_price * (1-discount))) as Average_Category_Revenue
from production.categories as pcc
inner join production.products as ppp
on pcc.category_id = ppp.category_id
inner join sales.order_items as oii
on ppp.product_id = oii.product_id
group by category_name, pcc.category_id
) as dt
on pp.category_id = dt.category_id
group by pc.category_name, product_name, pp.list_price, dt.Average_Category_Revenue
order by sum(quantity * (oi.list_price * (1-discount))) DESC
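
A minimal sketch of the behaviour described above, run against an in-memory SQLite database from Python rather than SQL Server (an assumption made only so the example is self-contained). The tables categories and products and their columns are hypothetical, simplified stand-ins for the schema in the question. It suggests that the outer query can only see the columns that the derived table's select list exposes, which would explain why the join column has to be selected inside the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, category_id INTEGER, price REAL);
INSERT INTO categories VALUES (1, 'Bikes'), (2, 'Helmets');
INSERT INTO products VALUES (10, 1, 100.0), (11, 1, 300.0), (12, 2, 50.0);
""")

# The derived table exposes category_id, so the outer ON clause can use it.
with_key = conn.execute("""
    SELECT c.category_name, dt.avg_price
    FROM categories AS c
    INNER JOIN (
        SELECT category_id, AVG(price) AS avg_price
        FROM products
        GROUP BY category_id
    ) AS dt ON c.category_id = dt.category_id
    ORDER BY c.category_name
""").fetchall()
print(with_key)  # [('Bikes', 200.0), ('Helmets', 50.0)]

# Without category_id in the select list, dt has no such column to join on.
try:
    conn.execute("""
        SELECT c.category_name, dt.avg_price
        FROM categories AS c
        INNER JOIN (
            SELECT AVG(price) AS avg_price
            FROM products
            GROUP BY category_id
        ) AS dt ON c.category_id = dt.category_id
    """)
    missing_key_error = None
except sqlite3.OperationalError as exc:
    missing_key_error = str(exc)
print(missing_key_error)
```

Grouping by a column inside the subquery does not expose it to the outer query; only the select list does.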

Related

SQL left join on maximum date

I have two tables: contracts and contract_descriptions.
On contract_descriptions there is a column named contract_id that matches contract_id on the contracts table.
I am trying to join the latest record on contract_descriptions:
SELECT *
FROM contracts c
LEFT JOIN contract_descriptions d ON d.contract_id = c.contract_id
AND d.date_description =
(SELECT MAX(date_description)
FROM contract_descriptions t
WHERE t.contract_id = c.contract_id)
It works, but is it the most performant way to do it? Is there a way to avoid the second SELECT?
Alternatively, you could use DISTINCT ON:
SELECT * FROM contracts c LEFT JOIN (
SELECT DISTINCT ON (cd.contract_id) cd.* FROM contract_descriptions cd
ORDER BY cd.contract_id, cd.date_description DESC
) d ON d.contract_id = c.contract_id
DISTINCT ON keeps only one row per contract_id, and the sort on cd.date_description DESC ensures that it is always the latest description.
Performance depends on many factors (for example, table size). In any case, you should compare both approaches with EXPLAIN.
Your query looks okay to me. One typical way to join only n rows by some order from the other table is a lateral join:
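SELECT *
FROM contracts c
-- LEFT JOIN ... ON true keeps contracts that have no description,
-- matching the LEFT JOIN in the question (CROSS JOIN LATERAL would drop them)
LEFT JOIN LATERAL
(
SELECT *
FROM contract_descriptions cd
WHERE cd.contract_id = c.contract_id
ORDER BY cd.date_description DESC
FETCH FIRST 1 ROW ONLY
) cdlast ON true;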
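
As a rough illustration of the "latest row per contract" pattern from the question, here is a sketch that runs the correlated MAX join against an in-memory SQLite database from Python. SQLite is an assumption made for portability (it supports neither DISTINCT ON nor LATERAL), and the sample rows and the name and description columns are invented. It also shows why the outer join matters: a contract without any description survives a LEFT JOIN but not an inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contracts (contract_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE contract_descriptions (
    contract_id INTEGER, date_description TEXT, description TEXT);
INSERT INTO contracts VALUES (1, 'alpha'), (2, 'beta');
-- contract 2 deliberately has no description rows
INSERT INTO contract_descriptions VALUES
    (1, '2020-01-01', 'old'),
    (1, '2021-06-01', 'new');
""")

QUERY = """
    SELECT c.contract_id, d.description
    FROM contracts c
    {join} contract_descriptions d
           ON d.contract_id = c.contract_id
          AND d.date_description = (
                SELECT MAX(t.date_description)
                FROM contract_descriptions t
                WHERE t.contract_id = c.contract_id)
    ORDER BY c.contract_id
"""

# LEFT JOIN keeps every contract and fills missing descriptions with NULL.
latest_left = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
print(latest_left)   # [(1, 'new'), (2, None)]

# An inner join silently drops contracts that have no description.
latest_inner = conn.execute(QUERY.format(join="INNER JOIN")).fetchall()
print(latest_inner)  # [(1, 'new')]
```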

Lateral query syntax

I'm trying to get lateral to work in a Postgres 9.5.3 query.
select b_ci."IdOwner",
ci."MinimumPlaces",
ci."MaximumPlaces",
(select count(*) from "LNK_Stu_CI" lnk
where lnk."FK_CourseInstanceId" = b_ci."Id") as "EnrolledStudents"
from "Course" c
join "DBObjectBases" b_c on c."Id" = b_c."Id"
join "DBObjectBases" b_ci on b_ci."IdOwner" = b_c."Id"
join "CourseInstance" ci on ci."Id" = b_ci."Id",
lateral (select ci."MaximumPlaces" - "EnrolledStudents") x
I want the right-most column to be the result of "MaximumPlaces" - "EnrolledStudents" for that row but am struggling to get it to work. At the moment PG is complaining that "EnrolledStudents" does not exist - which is exactly the point of "lateral", isn't it?
select b_ci."IdOwner",
ci."MinimumPlaces",
ci."MaximumPlaces",
(select count(*) from "LNK_Stu_CI" lnk
where lnk."FK_CourseInstanceId" = b_ci."Id") as "EnrolledStudents",
lateral (select "MaximumPlaces" - "EnrolledStudents") as "x"
from "Course" c
join "DBObjectBases" b_c on c."Id" = b_c."Id"
join "DBObjectBases" b_ci on b_ci."IdOwner" = b_c."Id"
join "CourseInstance" ci on ci."Id" = b_ci."Id"
If I try inlining the lateral clause (shown above) in the select it gets upset too and gives me a syntax error - so where does it go?
Thanks,
Adam.
You are missing the point of LATERAL. It can access columns of tables that appear earlier in the FROM clause, but not aliases defined in the SELECT clause.
If you want to access an alias defined in the SELECT clause, you need to add another query level, either with a subquery in the FROM clause (a derived table) or with a CTE (Common Table Expression). Because a CTE in PostgreSQL 9.5 acts as an optimization fence (this changed only in PostgreSQL 12), I strongly recommend going with the subquery in this case, like:
select
-- get all columns on the inner query
t.*,
-- get your new expression based on the ones defined in the inner query
t."MaximumPlaces" - t."EnrolledStudents" AS new_alias
from (
select b_ci."IdOwner",
ci."MinimumPlaces",
ci."MaximumPlaces",
(select count(*) from "LNK_Stu_CI" lnk
where lnk."FK_CourseInstanceId" = b_ci."Id") as "EnrolledStudents"
from "Course" c
join "DBObjectBases" b_c on c."Id" = b_c."Id"
join "DBObjectBases" b_ci on b_ci."IdOwner" = b_c."Id"
join "CourseInstance" ci on ci."Id" = b_ci."Id"
) t
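
The same idea in a form that can be run anywhere: a sketch using an in-memory SQLite database from Python (an assumption, since the question targets PostgreSQL 9.5). The tables course_instance and enrolment and all column names are hypothetical simplifications of the schema above. The inner query defines the alias; the outer query, one level up, can then use it in an expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course_instance (id INTEGER PRIMARY KEY, maximum_places INTEGER);
CREATE TABLE enrolment (course_instance_id INTEGER, student_id INTEGER);
INSERT INTO course_instance VALUES (1, 10), (2, 5);
INSERT INTO enrolment VALUES (1, 100), (1, 101), (1, 102), (2, 100);
""")

rows = conn.execute("""
    SELECT t.*,
           -- enrolled_students is visible here because it is a column of t
           t.maximum_places - t.enrolled_students AS remaining_places
    FROM (
        SELECT ci.id,
               ci.maximum_places,
               (SELECT COUNT(*) FROM enrolment e
                 WHERE e.course_instance_id = ci.id) AS enrolled_students
        FROM course_instance ci
    ) AS t
    ORDER BY t.id
""").fetchall()
print(rows)  # [(1, 10, 3, 7), (2, 5, 1, 4)]
```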

Get maximum value of an aggregate function

I want to return only the row where count(objectnaam) is the highest, so I have written this query:
select klantnr, count(objectnaam)
from klanten inner join deelnames using(klantnr)
inner join reizen using(reisnr)
inner join bezoeken using(reisnr)
where objectnaam = 'Maan'
group by klantnr
Now, I can't do
select max(count(objectnaam))
How would I go about solving this problem?
I have tried by using a subquery which is equally invalid
select max(select count(objectnaam) from ....)
I think I need a subquery in the FROM clause, so I have rewritten the query like this. I think it is closer to the actual answer, but it is still not right, because it still returns one row per klantnr rather than only the row with the highest count.
select klantnr, max(c)
FROM(
select klantnr, count(objectnaam) as c
from klanten inner join deelnames using(klantnr)
inner join reizen using(reisnr)
inner join bezoeken using(reisnr)
where objectnaam = 'Maan'
group by klantnr) as F
group by klantnr
thanks for any help you can give me!
You did not provide the table structure, so you will probably have to modify the following query. Note that it works only on PostgreSQL 9.x and later.
WITH t AS (
SELECT klantnr, COUNT(objectnaam) AS c
FROM klanten
WHERE objectnaam = 'Maan'
GROUP BY klantnr
ORDER BY c DESC
LIMIT 1
)
SELECT * FROM t
INNER JOIN deelnames USING(klantnr)
INNER JOIN reizen USING(reisnr)
INNER JOIN bezoeken USING(reisnr);
See http://www.postgresql.org/docs/9.3/static/queries-with.html for how to use WITH queries.
I have found a simpler solution:
select klantnr,count (klantnr)
from bezoeken natural join deelnames
where objectnaam ='Maan'
group by klantnr
order by count desc
limit 1
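
A small sketch of the ORDER BY ... LIMIT 1 approach, run on an in-memory SQLite database from Python (an assumption; the thread is about PostgreSQL). The single bezoeken table with klantnr and objectnaam columns is a hypothetical flattening of the joined tables. One caveat worth seeing in action: when several customers tie for the highest count, LIMIT 1 returns only one of them, so the second query compares each count against the maximum instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bezoeken (klantnr INTEGER, objectnaam TEXT);
INSERT INTO bezoeken VALUES
    (1, 'Maan'), (1, 'Maan'),
    (2, 'Maan'), (2, 'Mars'),
    (3, 'Maan'), (3, 'Maan');
""")

# Sort by the aggregate and keep the first row (one arbitrary winner on ties).
top = conn.execute("""
    SELECT klantnr, COUNT(objectnaam) AS c
    FROM bezoeken
    WHERE objectnaam = 'Maan'
    GROUP BY klantnr
    ORDER BY c DESC
    LIMIT 1
""").fetchone()
print(top)

# Tie-safe variant: keep every customer whose count equals the maximum count.
all_top = conn.execute("""
    WITH counts AS (
        SELECT klantnr, COUNT(objectnaam) AS c
        FROM bezoeken
        WHERE objectnaam = 'Maan'
        GROUP BY klantnr
    )
    SELECT klantnr, c
    FROM counts
    WHERE c = (SELECT MAX(c) FROM counts)
    ORDER BY klantnr
""").fetchall()
print(all_top)  # [(1, 2), (3, 2)]
```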

Postgres join not respecting outer where clause

In SQL Server, I know for sure that the following query:
SELECT things.*
FROM things
LEFT OUTER JOIN (
SELECT thingreadings.thingid, reading
FROM thingreadings
INNER JOIN things on thingreadings.thingid = things.id
ORDER BY reading DESC LIMIT 1) AS readings
ON things.id = readings.thingid
WHERE things.id = '1'
would join against thingreadings only after the WHERE id = 1 had restricted the record set, so it left joins against just one row. However, for performance to be acceptable in Postgres, I have to add the WHERE id = 1 to the INNER JOIN things on thingreadings.thingid = things.id line too.
This isn't ideal. Is it possible to make Postgres aware that what I am joining against is only one row, without explicitly adding the WHERE clause everywhere?
An example of this problem can be seen in the fiddle linked below.
I am trying to recreate the following query in a more efficient way:
SELECT things.id, things.name,
(SELECT thingreadings.id FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1),
(SELECT thingreadings.reading FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1)
FROM things
WHERE id IN (1,2)
http://sqlfiddle.com/#!15/a172c/2
Not really sure why you did all that work. Isn't the inner query enough?
SELECT t.*
FROM thingreadings tr
INNER JOIN things t on tr.thingid = t.id AND t.id = '1'
ORDER BY tr.reading DESC
LIMIT 1;
When you want to select the latest value for each thingID, you can do:
SELECT t.*,a.reading
FROM things t
INNER JOIN (
SELECT t1.*
FROM thingreadings t1
LEFT JOIN thingreadings t2
ON (t1.thingid = t2.thingid AND t1.reading < t2.reading)
WHERE t2.thingid IS NULL
) a ON a.thingid = t.id
The derived table gets you, for each thingid, the row with the highest reading (the one for which no other row of that thingid has a larger reading); the JOIN then adds the information from the things table for that row.
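
To make the self-join trick concrete, here is a sketch on an in-memory SQLite database driven from Python (an assumption; the thread targets PostgreSQL). The sample rows are invented. The LEFT JOIN pairs each reading with any larger reading for the same thing, and the IS NULL filter keeps only the readings that found no larger partner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE things (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE thingreadings (id INTEGER PRIMARY KEY, thingid INTEGER, reading INTEGER);
INSERT INTO things VALUES (1, 'a'), (2, 'b');
INSERT INTO thingreadings VALUES (1, 1, 5), (2, 1, 9), (3, 2, 4), (4, 2, 2);
""")

top_readings = conn.execute("""
    SELECT t.id, t.name, a.reading
    FROM things t
    INNER JOIN (
        SELECT t1.*
        FROM thingreadings t1
        LEFT JOIN thingreadings t2
               ON t1.thingid = t2.thingid AND t1.reading < t2.reading
        WHERE t2.thingid IS NULL   -- no larger reading exists for this thing
    ) a ON a.thingid = t.id
    ORDER BY t.id
""").fetchall()
print(top_readings)  # [(1, 'a', 9), (2, 'b', 4)]
```

If two rows of the same thing share the highest reading, both survive the filter, so the result can contain more than one row per thing.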
The where clause in SQL applies to the result set you're requesting, NOT to the join.
What your code is NOT saying: "do this join only for the ID of 1"...
What your code IS saying: "do this join, then pull records out of it where the ID is 1"...
This is why you need the inner where clause. Incidentally, I also think Filipe is right about the unnecessary code.

How to design a SQL recursive query?

How would I redesign the query below so that it recursively walks the entire tree and returns all descendants from root to leaves? (I'm using SSMS 2008.) We have a President at the root. Under him are the VPs, then upper management, and so on down the line. I need to return the names and titles of each. But this query shouldn't be hard-coded; I need to be able to run it for any selected employee, not just the President. The query below is the hard-coded approach.
select P.staff_name [Level1],
P.job_title [Level1 Title],
Q.license_number [License 1],
E.staff_name [Level2],
E.job_title [Level2 Title],
G.staff_name [Level3],
G.job_title [Level3 Title]
from staff_view A
left join staff_site_link_expanded_view P on P.people_id = A.people_id
left join staff_site_link_expanded_view E on E.people_id = C.people_id
left join staff_site_link_expanded_view G on G.people_id = F.people_id
left join facility_view Q on Q.group_profile_id = P.group_profile_id
Thank you, this was most closely matching what I needed. Here is my CTE query below:
with Employee_Hierarchy (staff_name, job_title, id_number, billing_staff_credentials_code, site_name, group_profile_id, license_number, region_description, people_id)
as
(
select C.staff_name, C.job_title, C.id_number, C.billing_staff_credentials_code, C.site_name, C.group_profile_id, Q.license_number, R.region_description, A.people_id
from staff_view A
left join staff_site_link_expanded_view C on C.people_id = A.people_id
left join facility_view Q on Q.group_profile_id = C.group_profile_id
left join regions R on R.regions_id = Q.regions_id
where A.last_name = 'kromer'
)
select C.staff_name, C.job_title, C.id_number, C.billing_staff_credentials_code, C.site_name, C.group_profile_id, Q.license_number, R.region_description, A.people_id
from staff_view A
left join staff_site_link_expanded_view C on C.people_id = A.people_id
left join facility_view Q on Q.group_profile_id = C.group_profile_id
left join regions R on R.regions_id = Q.regions_id
WHERE C.STAFF_NAME IS NOT NULL
GROUP BY C.STAFF_NAME, C.job_title, C.id_number, C.billing_staff_credentials_code, C.site_name, C.group_profile_id, Q.license_number, R.region_description, A.people_id
ORDER BY C.STAFF_NAME
But I am wondering what is the purpose of the "Employee_Hierarchy"? When I replaced "staff_view" in the outer query with "Employee_Hierarchy", it only returned one record = "Kromer". So when/where can we use "Employee_Hierarchy"?
See:
SQL Server - Simple example of a recursive CTE
MSDN: Recursive Queries using Common Table Expression
SQL Server recursive CTE (this seems pretty much like exactly what you are working on!)
Update:
A proper recursive CTE consists of basically three things:
1. an anchor SELECT to begin with; it can select e.g. the root-level employees (where ReportsTo is NULL), or it can select any arbitrary employee that you define, e.g. by a parameter
2. a UNION ALL
3. a recursive SELECT statement that selects from the same, typically self-referencing table and joins with the recursive CTE currently being built up
This gives you the ability to recursively build up a result set that you can then select from.
If you look at the Northwind sample database, it has a table called Employees which is self-referencing: Employees.ReportsTo --> Employees.EmployeeID defines who reports to whom.
Your CTE would look something like this:
;WITH RecursiveCTE AS
(
-- anchor query; get the CEO
SELECT EmployeeID, FirstName, LastName, Title, 1 AS 'Level', ReportsTo
FROM dbo.Employees
WHERE ReportsTo IS NULL
UNION ALL
-- recursive part; select next Employees that have ReportsTo -> cte.EmployeeID
SELECT
e.EmployeeID, e.FirstName, e.LastName, e.Title,
cte.Level + 1 AS 'Level', e.ReportsTo
FROM
dbo.Employees e
INNER JOIN
RecursiveCTE cte ON e.ReportsTo = cte.EmployeeID
)
SELECT *
FROM RecursiveCTE
ORDER BY Level, LastName
I don't know if you can translate your sample to a proper recursive CTE - but that's basically the gist of it: anchor query, UNION ALL, recursive query
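
The three parts can also be tried out without a SQL Server instance. The sketch below uses an in-memory SQLite database from Python, which is an assumption for portability: SQLite spells the construct WITH RECURSIVE, whereas T-SQL uses plain WITH. The employees table, its columns, and the helper name descendants are all hypothetical. The anchor takes the starting employee as a parameter, so the same query works for any subtree rather than only for the root.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, reports_to INTEGER);
INSERT INTO employees VALUES
    (1, 'Ann', NULL),
    (2, 'Bob', 1),
    (3, 'Cid', 1),
    (4, 'Dee', 2);
""")

def descendants(root_id):
    """Return (employee_id, name, level) for root_id and everyone below it."""
    return conn.execute("""
        WITH RECURSIVE tree AS (
            -- anchor: the selected employee
            SELECT employee_id, name, 1 AS level
            FROM employees
            WHERE employee_id = ?
            UNION ALL
            -- recursive part: employees reporting to someone already in tree
            SELECT e.employee_id, e.name, tree.level + 1
            FROM employees e
            INNER JOIN tree ON e.reports_to = tree.employee_id
        )
        SELECT employee_id, name, level
        FROM tree
        ORDER BY level, name
    """, (root_id,)).fetchall()

print(descendants(1))  # [(1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cid', 2), (4, 'Dee', 3)]
print(descendants(2))  # [(2, 'Bob', 1), (4, 'Dee', 2)]
```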