Can someone explain how multiple LEFT JOINs work?
For example, I have 3 tables: company, company_text, company_rank.
company has 4 records (ids: 1, 2, 3, 4), company_text has 4 records with company names (1-a, 2-b, 3-c, 4-d), and company_rank has the following (1-1st, 2-2nd, 3-3rd). Note that the company_rank table has no 4th record. Now I want the records of all companies using LEFT JOIN. If there is no rank, display it as 'zzzz'; sort in descending order of rank and, when rank is null, by descending order of id.
select * from company
LEFT JOIN company_text ON company.id = company_text.id
LEFT JOIN company_rank ON company_text.id = company_rank.id
order by isnull(company_rank.rank,'zzzz'), rank desc
Will this work?
Basically, I am trying to understand how LEFT JOIN works when there are several of them. This post has good information on joins: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html but it doesn't explain how multiple LEFT JOINs work. When multiple joins are present, how are the records pulled?
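To make the question concrete, here is a minimal sketch that loads the sample data into an in-memory SQLite database through Python's sqlite3 module (my choice of engine for illustration, not the asker's). Joins are applied left to right: the result of the first LEFT JOIN becomes the left input of the second, so no company row is ever dropped. The sketch assumes COALESCE in place of the SQL Server-specific isnull, and the rank_label alias is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company      (id INTEGER PRIMARY KEY);
    CREATE TABLE company_text (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE company_rank (id INTEGER PRIMARY KEY, "rank" TEXT);
    INSERT INTO company      VALUES (1), (2), (3), (4);
    INSERT INTO company_text VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    -- No rank row for company 4, as in the question.
    INSERT INTO company_rank VALUES (1, '1st'), (2, '2nd'), (3, '3rd');
""")

# company LEFT JOIN company_text is evaluated first; its result is then
# LEFT JOINed to company_rank. Company 4 survives both joins and gets a
# NULL rank, which COALESCE turns into 'zzzz'.
rows = conn.execute("""
    SELECT c.id, t.name, COALESCE(r."rank", 'zzzz') AS rank_label
    FROM company AS c
    LEFT JOIN company_text AS t ON c.id = t.id
    LEFT JOIN company_rank AS r ON t.id = r.id
    ORDER BY rank_label DESC, c.id DESC
""").fetchall()

for row in rows:
    print(row)
```

With string comparison, 'zzzz' sorts above '3rd', '2nd' and '1st', so the unranked company comes first in descending order.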
Related
Is there a way to average a column only on a distinct of another column when the query is already grouped for another purpose without using a subquery? I know it can be done through subqueries, but trying to avoid restructuring an old query unless it is absolutely necessary.
The existing query, while complex, has more or less the same structure as the example below. As you can see, a library has any number of books, a book has any number of chapters, and a chapter has any number of paragraphs while the query returns the total numbers of books and paragraphs for each library.
SELECT libraries.name,
COUNT(DISTINCT books.id) AS num_books,
COUNT(paragraphs.id) AS num_paragraphs
FROM libraries
LEFT JOIN books ON books.library_id = libraries.id
LEFT JOIN chapters ON chapters.book_id = books.id
LEFT JOIN paragraphs ON paragraphs.chapter_id = chapters.id
GROUP BY libraries.name
Now suppose the table books has a column publish_year and I want the average year in which the library's books were published. Obviously I can't simply add AVG(books.publish_year), since books with more chapters and paragraphs would skew the average.
Is there a good way of averaging books.publish_year based upon distinct books.id again without restructuring the query or is restructuring the query inevitable?
Compute the average with a window function before joining:
select
l.name,
count(distinct b.id) as num_books,
count(p.id) as num_paragraphs,
min(year_avg) as year_avg
from
libraries l
left join (
select *, avg(publish_year) over(partition by library_id) as year_avg
from books
) b on b.library_id = l.id
left join chapters c on c.book_id = b.id
left join paragraphs p on p.chapter_id = c.id
group by l.name
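A small check of this approach, sketched with Python's sqlite3 module (window functions need SQLite 3.25 or later). The sample rows and the library name 'central' are made up for illustration: one book from 2000 with a single paragraph, and one from 2010 with three paragraphs across two chapters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE libraries  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books      (id INTEGER PRIMARY KEY, library_id INTEGER, publish_year INTEGER);
    CREATE TABLE chapters   (id INTEGER PRIMARY KEY, book_id INTEGER);
    CREATE TABLE paragraphs (id INTEGER PRIMARY KEY, chapter_id INTEGER);
    INSERT INTO libraries  VALUES (1, 'central');
    INSERT INTO books      VALUES (1, 1, 2000), (2, 1, 2010);
    INSERT INTO chapters   VALUES (1, 1), (2, 2), (3, 2);
    INSERT INTO paragraphs VALUES (1, 1), (2, 2), (3, 2), (4, 3);
""")

joins = """
    LEFT JOIN chapters c ON c.book_id = b.id
    LEFT JOIN paragraphs p ON p.chapter_id = c.id
    GROUP BY l.name
"""

# Naive version: the 2010 book appears on three joined rows, so it is
# counted three times in the average.
naive = conn.execute("""
    SELECT l.name, AVG(b.publish_year)
    FROM libraries l
    LEFT JOIN books b ON b.library_id = l.id
""" + joins).fetchall()

# Window version: the per-library average is computed on books alone,
# before the row-multiplying joins, then picked up with MIN().
windowed = conn.execute("""
    SELECT l.name,
           COUNT(DISTINCT b.id) AS num_books,
           COUNT(p.id) AS num_paragraphs,
           MIN(b.year_avg) AS year_avg
    FROM libraries l
    LEFT JOIN (
        SELECT *, AVG(publish_year) OVER (PARTITION BY library_id) AS year_avg
        FROM books
    ) b ON b.library_id = l.id
""" + joins).fetchall()

print(naive)
print(windowed)
```

Since year_avg is the same on every row of a given library, MIN (or MAX) just picks that single value back out of the group.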
I have two Postgres tables that share a column holding a city name. I'm trying to create a view of some records which I'm displaying on a map via WMS on my GeoServer.
I need to select only the records from table1 (100k records) whose city name matches one of the cities listed in table2 (20 records).
To list everything I've tried would be a waste of your time. I've tried every join tutorial and example but am perplexed why I can't get any success. I would really appreciate some direction.
Here's my latest query, but if this is the wrong approach just ignore it, since I have about 50 similar attempts.
SELECT t1.id,
t1.dba,
t1.prem_city,
t1.geom
t2.city_label
FROM schema1.table1 AS t1
LEFT JOIN schema2.table2 AS t2
ON t2.city_label = t1.prem_city;
Thanks for any help!
Your query seems correct; it needs just a minor change. LEFT JOIN keeps all the records from the left table and only the matching records from the right one. If you want only those that appear in both, an INNER JOIN is required.
SELECT t1.id,
t1.dba,
t1.prem_city,
t1.geom,
t2.city_label
FROM schema1.table1 t1
JOIN schema2.table2 t2
ON t2.city_label = t1.prem_city;
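The difference can be seen on a tiny made-up data set. This is a sketch using Python's sqlite3 module rather than Postgres, so the schema prefixes and the geom column are left out, and the business and city names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, dba TEXT, prem_city TEXT);
    CREATE TABLE table2 (city_label TEXT);
    INSERT INTO table1 VALUES
        (1, 'Cafe One', 'Austin'),
        (2, 'Deli Two', 'Dallas'),
        (3, 'Bar Three', 'Austin'),
        (4, 'Shop Four', 'Waco');
    INSERT INTO table2 VALUES ('Austin'), ('Waco');
""")

# LEFT JOIN: every table1 row is kept; city_label is NULL when unmatched.
left_rows = conn.execute("""
    SELECT t1.id, t1.prem_city, t2.city_label
    FROM table1 AS t1
    LEFT JOIN table2 AS t2 ON t2.city_label = t1.prem_city
    ORDER BY t1.id
""").fetchall()

# INNER JOIN: only table1 rows whose city appears in table2 are kept.
inner_rows = conn.execute("""
    SELECT t1.id, t1.prem_city, t2.city_label
    FROM table1 AS t1
    JOIN table2 AS t2 ON t2.city_label = t1.prem_city
    ORDER BY t1.id
""").fetchall()

print(left_rows)
print(inner_rows)
```

The Dallas row shows up in the LEFT JOIN result with a NULL city_label and is absent from the INNER JOIN result.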
Apologies in advance for the long question; I'm asking for the sake of learning.
I'm new to SQL and currently studying JOINs. I'm getting two different behaviors when using INNER and OUTER JOIN. What I know is that INNER JOIN gives an intersection-like result, returning only the rows common to both tables, while a (LEFT/RIGHT) OUTER JOIN returns the common rows plus the remaining rows of the LEFT or RIGHT table, depending on the LEFT/RIGHT keyword.
While working through the MS Training Kit, I tried to solve this exercise: "Practice 2: In this practice, you identify rows that appear in one table but have no matches in another. You are given a task to return the IDs of employees from the HR.Employees table who did not handle orders (in the Sales.Orders table) on February 12, 2008. Write three different solutions using the following: joins, subqueries, and set operators. To verify the validity of your solution, you are supposed to return employeeIDs: 1, 2, 3, 5, 7, and 9."
I succeeded with subqueries and set operators, but my JOIN version returns something unexpected. I've written the following query:
USE TSQL2012;
SELECT
E.empid
FROM
HR.Employees AS H
JOIN Sales.Orders AS O
ON H.empid = O.empid
AND O.orderdate = '20080212'
JOIN HR.Employees AS E
ON E.empid <> H.empid
ORDER BY
E.empid
;
I'm expecting these results: 1, 2, 3, 5, 7, and 9 (6 rows)
But what I'm getting is: 1,1,1,2,2,2,3,3,3,4,4,5,5,5,6,6,7,7,7,8,8,9,9,9 (24 rows)
I watched some videos but could not understand this side of INNER/OUTER JOIN. I'd be grateful if someone could explain why this happens and what I should keep in mind when working with JOINs.
You can also use a LEFT OUTER JOIN to find the non-matching rows.
The LEFT JOIN keyword returns all rows from the left table (table1), with the matching rows in the right table (table2). The result is NULL on the right side when there is no match.
SELECT
H.empid
FROM
HR.Employees AS H
LEFT OUTER JOIN Sales.Orders AS O
ON H.empid = O.empid
AND O.orderdate = '20080212'
WHERE O.empid IS NULL
The query above returns the IDs of employees who did not handle orders on the specified date.
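A runnable sketch of this anti-join, using Python's sqlite3 module in place of SQL Server. The tables are simplified stand-ins for HR.Employees and Sales.Orders, and the order rows are assumed (employees 4, 6 and 8 handle orders on the date; employee 1 has an order on a different day).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (empid INTEGER PRIMARY KEY);
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, empid INTEGER, orderdate TEXT);
    INSERT INTO employees VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9);
    INSERT INTO orders VALUES
        (1, 4, '20080212'), (2, 6, '20080212'), (3, 8, '20080212'),
        (4, 1, '20080213');
""")

# Date filter in ON: unmatched employees keep a NULL-extended row,
# and WHERE O.empid IS NULL selects exactly those.
anti_join = [r[0] for r in conn.execute("""
    SELECT H.empid
    FROM employees AS H
    LEFT OUTER JOIN orders AS O
        ON H.empid = O.empid
        AND O.orderdate = '20080212'
    WHERE O.empid IS NULL
    ORDER BY H.empid
""")]

# Date filter in WHERE instead: NULL-extended rows fail the date test,
# so nothing is left.
filter_in_where = [r[0] for r in conn.execute("""
    SELECT H.empid
    FROM employees AS H
    LEFT OUTER JOIN orders AS O ON H.empid = O.empid
    WHERE O.orderdate = '20080212' AND O.empid IS NULL
""")]

print(anti_join)
print(filter_in_where)
```

The second query is there to show why the date condition belongs in the ON clause: a WHERE filter on a column of the right-hand table removes the very rows the IS NULL test is looking for.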
[Diagram showing all the kinds of join, taken from: http://dsin.wordpress.com/2013/03/16/sql-join-cheat-sheet/]
Adjust your query to be like this:
USE TSQL2012;
SELECT
H.empid
FROM
HR.Employees AS H
LEFT JOIN Sales.Orders AS O
ON H.empid = O.empid
AND O.orderdate = '20080212'
WHERE O.empid IS NULL
ORDER BY
H.empid
;
USE TSQL2012;
SELECT
distinct E.empid
FROM
HR.Employees AS H
JOIN Sales.Orders AS O
ON H.empid = O.empid
AND O.orderdate = '20080212'
JOIN HR.Employees AS E
ON E.empid <> H.empid
ORDER BY
E.empid
;
Key things to keep in mind when working with SQL JOINs:
INNER JOINs require a match in order for rows produced prior to the INNER JOIN to remain in the result set. When no match is found for a row, the row is discarded from the result set.
A row fed to an INNER JOIN that matches exactly one row is delivered to the result set once.
A row fed to an INNER JOIN that matches multiple rows is delivered multiple times, once for each matching row in the INNER JOIN table.
OUTER JOINs do not discard rows fed to them, whether or not the OUTER JOIN finds a match.
Just like an INNER JOIN, an OUTER JOIN that matches more than one row increases the number of rows in the result set, delivering the input row once for each matching row in the OUTER JOIN table.
Ask yourself "if I get NO match on the JOIN, do I want the row discarded or not?" If the answer is NO, use an OUTER JOIN. If the answer is YES, use an INNER JOIN.
If you don't need to reference any of the columns from a JOIN table, don't perform a JOIN at all. Instead, use a WHERE EXISTS, WHERE NOT EXISTS, WHERE IN, WHERE NOT IN, etc. or similar, depending on your database engine in use. Don't rely on the database engine to be smart enough to discard unreferenced columns resulting from JOINs from the result set. Some databases may be smart enough to do that, some not. There's no reason to pull columns into a result set only to not reference them. Doing so increases chance of reduced performance.
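The rules above can be checked on a toy schema. This sketch uses Python's sqlite3 module; the parent and child tables are hypothetical, with parent 1 having two children, parent 2 one, and parent 3 none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child  (id INTEGER PRIMARY KEY, parent_id INTEGER);
    INSERT INTO parent VALUES (1), (2), (3);
    INSERT INTO child  VALUES (10, 1), (11, 1), (12, 2);
""")

# INNER JOIN: parent 1 is delivered twice, parent 3 is discarded.
inner = [r[0] for r in conn.execute("""
    SELECT p.id FROM parent p
    JOIN child c ON c.parent_id = p.id
    ORDER BY p.id
""")]

# LEFT OUTER JOIN: same multiplication, but parent 3 is kept with NULLs.
outer = conn.execute("""
    SELECT p.id, c.id FROM parent p
    LEFT JOIN child c ON c.parent_id = p.id
    ORDER BY p.id, c.id
""").fetchall()

# No child columns needed: NOT EXISTS answers "which parents have no
# children?" without joining at all.
childless = [r[0] for r in conn.execute("""
    SELECT p.id FROM parent p
    WHERE NOT EXISTS (SELECT 1 FROM child c WHERE c.parent_id = p.id)
    ORDER BY p.id
""")]

print(inner, outer, childless)
```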
Your JOIN of:
JOIN HR.Employees AS E
ON E.empid <> H.empid
...matches every row fed to that join to all Employees rows with a DIFFERENT empid. Using NOT EQUAL in an INNER JOIN is very rarely needed, especially if the JOIN predicate tests only ONE condition. That is why you're getting duplicate rows in the result set.
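The 24 rows can be reproduced with a sketch in Python's sqlite3 module. The data is an assumption chosen to be consistent with the reported output: nine employees and exactly three orders on the date, handled by employees 4, 6 and 8. The table names are simplified stand-ins for HR.Employees and Sales.Orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (empid INTEGER PRIMARY KEY);
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, empid INTEGER, orderdate TEXT);
    INSERT INTO employees VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9);
    INSERT INTO orders VALUES
        (1, 4, '20080212'), (2, 6, '20080212'), (3, 8, '20080212');
""")

# The first join leaves three rows (H = 4, 6, 8). Each of them is then
# paired with the eight employees whose empid differs: 3 * 8 = 24 rows.
result = [r[0] for r in conn.execute("""
    SELECT E.empid
    FROM employees AS H
    JOIN orders AS O
        ON H.empid = O.empid
        AND O.orderdate = '20080212'
    JOIN employees AS E
        ON E.empid <> H.empid
    ORDER BY E.empid
""")]

print(len(result))
print(result)
```

Employees 4, 6 and 8 each appear twice (they are excluded only when paired with themselves), and every other employee appears three times.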
On DB2, we could perform an EXCEPTION JOIN to accomplish that using a JOIN alone. Normally, on DB2, I would use a WHERE NOT EXISTS for that. On SQL Server you could do a JOIN to a query where the query set is all employees without orders in SALES.ORDERS on the specified date, but I don't know if that violates the rules of your tutorial.
Naveen posted the solution it appears your tutorial is looking for!
I've been using SQLPLUS lately and one of my tasks was to display a set of values from two tables (stocks, orderitems). I have done this part, but I am stuck on the last part of the question which states: "including the stocks that no order has been placed on them so far".
Here is the statement:
`select Stocks.StockNo, Stocks.Description, OrderItems.QtyOrd
from Stocks INNER JOIN OrderItems
ON Stocks.StockNo = OrderItems.StockNo;`
and I have gotten the correct results for this part, but the second part is eluding me, as the current statement doesn't display the 0 values for QtyOrd.
Any help would be appreciated.
You likely want to use a LEFT OUTER JOIN otherwise the INNER JOIN will exclude Stocks which don't have any Orders. You might also consider grouping by Stock, in order to SUM the overall quantities for each stock?
SELECT Stocks.StockNo, Stocks.Description, SUM(OrderItems.QtyOrd) AS QtyOrd
FROM Stocks
LEFT OUTER JOIN OrderItems
ON Stocks.StockNo = OrderItems.StockNo
GROUP BY Stocks.StockNo, Stocks.Description;
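Note that SUM over a stock with no order rows yields NULL rather than 0. If a literal 0 is wanted, wrapping the sum in COALESCE is one option. Here is a sketch with Python's sqlite3 module standing in for Oracle; the stock descriptions and quantities are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stocks     (StockNo INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE OrderItems (StockNo INTEGER, QtyOrd INTEGER);
    INSERT INTO Stocks     VALUES (1, 'bolt'), (2, 'nut'), (3, 'washer');
    -- Stock 3 has never been ordered.
    INSERT INTO OrderItems VALUES (1, 5), (1, 7), (2, 3);
""")

# LEFT OUTER JOIN keeps stock 3; COALESCE turns its NULL sum into 0.
totals = conn.execute("""
    SELECT Stocks.StockNo, Stocks.Description,
           COALESCE(SUM(OrderItems.QtyOrd), 0) AS QtyOrd
    FROM Stocks
    LEFT OUTER JOIN OrderItems
        ON Stocks.StockNo = OrderItems.StockNo
    GROUP BY Stocks.StockNo, Stocks.Description
    ORDER BY Stocks.StockNo
""").fetchall()

print(totals)
```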
I have the following SQL query, but I have a problem:
When I execute it, I get the same serial number twice from the "sn" column in the "products" table.
SELECT specifications.productname,
products.sn, specifications.year,
lendings.lending_date
FROM products
INNER JOIN lendings ON products.id = lendings.product_id
INNER JOIN specifications ON products.sn LIKE CONCAT('%', specifications.sn, '%') OR products.type LIKE CONCAT('%', specifications.type, '%')
WHERE lendings.user_id = ?
EDIT:
lendings table:
user_id product_id
1 1
1 2
2 3
Specifications table:
productname year type sn
name1 2012 1 1234
name2 2011 2 4321
name3 2010 3 3241
products table:
id sn
1 AAAAAAAA1234
2 BBBBBBBB4321
3 CCCCCCCC3241
EDIT2:
SELECT products.id,
specifications.productname,
products.sn,
specifications.year,
lendings.lending_date
FROM products
INNER JOIN lendings ON products.id = lendings.product_id
INNER JOIN specifications ON products2.sn LIKE CONCAT(specifications.sn, '%') OR products.type = specifications.type
WHERE lendings.user_id = ?
One of your JOIN ... ON conditions is too loose, then;
for instance, two lendings records may point to the same product.
Usually, that means you don't have all the necessary join columns present in one of your joins and you are getting a partial Cartesian product. In database terms, this means you are joining to a table and expecting to match a single row, but multiple rows match the criteria, so you are actually joining to more than one row. When this happens, you will get the same row multiple times (the product row in your example) in your result.
It would have been better if you posted some test data so this scenario could be confirmed, but since you didn't, I would recommend checking each of your joins to make sure you are not getting multiple rows back for the given products row.
One part of your query I find particularly suspect is this join:
INNER JOIN specifications ON products.sn LIKE CONCAT('%', specifications.sn, '%') OR products.type LIKE CONCAT('%', specifications.type, '%')
You're joining using a LIKE operator, which seems to have a high chance of getting multiple rows.
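As an illustration of how that can happen, here is a sketch in Python's sqlite3 module using the specifications rows from the question. Two things are assumptions: the products.type value '12' is invented (the posted products table shows no type column), and SQLite's || operator stands in for CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, sn TEXT, type TEXT);
    CREATE TABLE lendings (user_id INTEGER, product_id INTEGER);
    CREATE TABLE specifications (productname TEXT, year INTEGER, type TEXT, sn TEXT);
    -- type '12' is a made-up value for illustration.
    INSERT INTO products VALUES (1, 'AAAAAAAA1234', '12');
    INSERT INTO lendings VALUES (1, 1);
    INSERT INTO specifications VALUES
        ('name1', 2012, '1', '1234'),
        ('name2', 2011, '2', '4321'),
        ('name3', 2010, '3', '3241');
""")

# Loose condition: the OR branch lets type '12' match both '%1%' and
# '%2%', so one product row comes back twice.
loose = [r[0] for r in conn.execute("""
    SELECT specifications.productname
    FROM products
    INNER JOIN lendings ON products.id = lendings.product_id
    INNER JOIN specifications
        ON products.sn LIKE '%' || specifications.sn || '%'
        OR products.type LIKE '%' || specifications.type || '%'
    WHERE lendings.user_id = ?
    ORDER BY specifications.productname
""", (1,))]

# Tighter condition: match only on the serial-number suffix.
tight = [r[0] for r in conn.execute("""
    SELECT specifications.productname
    FROM products
    INNER JOIN lendings ON products.id = lendings.product_id
    INNER JOIN specifications
        ON products.sn LIKE '%' || specifications.sn
    WHERE lendings.user_id = ?
    ORDER BY specifications.productname
""", (1,))]

print(loose)
print(tight)
```

Whether the suffix match is the right rule depends on how serial numbers are actually built; the point is only that each product should match exactly one specifications row.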