How to get the intermediate data for a complex SQL query in PostgreSQL

I have some complex PostgreSQL queries that take data from several tables joined to each other with LEFT OUTER JOIN.
I need to test these queries, so I need fixtures that contain only the data the queries touch, not the whole tables.
How can I see the intermediate results of these joins so that I can use them as fixtures?
For example, I have tables A, B and C and the query
SELECT A.column
FROM A
LEFT JOIN B ON A.b_id = B.id
LEFT JOIN C ON A.c_id = C.a_id
How could I get a result like "From table A: {the rows of A that take part in the query}, from table B: {the rows of B that take part in the query}" and so on, where each part shows only the data the query needs? Is there an existing tool or method for this?
Unfortunately, EXPLAIN and EXPLAIN ANALYZE show only the plan and timing statistics, not the data.

Maybe you mean
SELECT A.*
FROM A
LEFT JOIN B ON A.b_id = B.id
LEFT JOIN C ON A.c_id = C.a_id
limit 10
to see what's happening in A from the join?
Or perhaps
select concat('from table a', a.col1, a.col2...) ,
concat('from table b', b.col1, b.col2...)
from ...
String functions such as concat are documented at http://www.postgresql.org/docs/9.1/static/functions-string.html
It is also worth looking at array_append() in http://www.postgresql.org/docs/9.1/static/functions-array.html
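One way to get per-table fixtures is to run one query per table, each restricted to the rows that can take part in the join. The following is only a sketch under stated assumptions: it uses Python's sqlite3 module with an in-memory database as a stand-in for PostgreSQL, and the column `col`, the sample rows and the restriction `A_FILTER` are all hypothetical. The EXISTS subqueries simply mirror the ON conditions of the question's query.

```python
import sqlite3

# Hypothetical schema and rows; SQLite stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER, c_id INTEGER, col TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, col TEXT);
    CREATE TABLE c (a_id INTEGER, col TEXT);
    INSERT INTO a VALUES (1, 10, 100, 'a1'), (2, 11, NULL, 'a2'), (3, 12, 300, 'a3');
    INSERT INTO b VALUES (10, 'b10'), (11, 'b11'), (12, 'b12'), (13, 'unused');
    INSERT INTO c VALUES (100, 'c100'), (300, 'c300'), (400, 'unused');
""")

# Assumed restriction on the driving table; replace it with the real
# query's WHERE clause (or with "1 = 1" if there is none).
A_FILTER = "a.id <= 2"


def rows(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# One query per table: A itself, then the B and C rows that some
# selected A row joins to (same conditions as the ON clauses).
fixtures = {
    "a": rows(f"SELECT * FROM a WHERE {A_FILTER} ORDER BY id"),
    "b": rows(
        "SELECT * FROM b WHERE EXISTS "
        f"(SELECT 1 FROM a WHERE a.b_id = b.id AND {A_FILTER}) ORDER BY id"
    ),
    "c": rows(
        "SELECT * FROM c WHERE EXISTS "
        f"(SELECT 1 FROM a WHERE a.c_id = c.a_id AND {A_FILTER}) ORDER BY a_id"
    ),
}

for table, data in fixtures.items():
    print(f"From table {table}: {data}")
```

The rows marked 'unused', and the B and C rows referenced only by the excluded A row, do not end up in the fixtures.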

Related

Paging in Postgres on a Left Join

Summary:
I have data in a db that needs to be displayed client side. Up until this point it wasn't paged but now the data has grown to a point that it's noticeably slowing the connection down. So I want to page it.
Setup:
Client side I'm using DataTables
Server side I'm using F#
The DB is postgres
Problem:
I have 3 tables, Tables [A , B, C]. Table A has a one to many relationship with tables B and C. So when I do a query like
select * from A left join B on a.id = b.tableidb left join C on a.id = c.tableidc
I would get 7 rows, which is fine. This is all the data I actually want. The problem really comes when we try and page
select * from A left join B on a.id = b.tableidb left join C on a.id = c.tableidc limit 5 offset 0
As you can see, it does in fact bring back only 5 rows. However, because of the left joins, we don't get the full set of data.
Expected Solution
What I'd like to say is something to the effect of "Give me 5 rows from table A at offset 0, then left join on tables B and C"
Is there a way to do this in postgres?
You can use subselects in the FROM clause.
All you have to do is limit the number of rows there:
SELECT *
FROM (SELECT * FROM A
ORDER BY a.id
LIMIT 5) AS al
LEFT JOIN b ON al.id = b.tableidb
LEFT JOIN c on al.id = c.tableidc;
Notes:
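Using LIMIT without ORDER BY does not make much sense, because the row order is otherwise not guaranteed.
If you are considering paging, avoid LIMIT with OFFSET: the skipped rows still have to be read, so later pages get slower.
Rather, remember the last a.id of the previous page and query WHERE a.id > previous_a_id ORDER BY a.id LIMIT 5.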
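The two notes can be combined: limit table A inside the subselect and page by the last id seen. A minimal sketch, assuming Python's sqlite3 as a stand-in for Postgres; the `val` columns, the sample rows and the helper `fetch_page` are hypothetical.

```python
import sqlite3

# Hypothetical data: 7 rows in A, one-to-many children in B and C.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (tableidb INTEGER, val TEXT);
    CREATE TABLE c (tableidc INTEGER, val TEXT);
    INSERT INTO a VALUES (1), (2), (3), (4), (5), (6), (7);
    INSERT INTO b VALUES (1, 'b1a'), (1, 'b1b'), (6, 'b6');
    INSERT INTO c VALUES (2, 'c2'), (7, 'c7');
""")

# The page size applies to A only; the LEFT JOINs run afterwards,
# so every selected A row keeps all of its B and C rows.
PAGE_SQL = """
    SELECT al.id, b.val, c.val
    FROM (SELECT * FROM a WHERE id > ? ORDER BY id LIMIT ?) AS al
    LEFT JOIN b ON al.id = b.tableidb
    LEFT JOIN c ON al.id = c.tableidc
    ORDER BY al.id, b.val, c.val
"""


def fetch_page(last_seen_id, page_size):
    """Return the joined rows for the next page_size rows of A."""
    return conn.execute(PAGE_SQL, (last_seen_id, page_size)).fetchall()


page1 = fetch_page(0, 5)        # ids start at 1, so 0 means "from the start"
last_id = page1[-1][0]          # remember the last a.id of this page
page2 = fetch_page(last_id, 5)  # continue after it

print(page1)
print(page2)
```

Note that a page can hold more than 5 result rows (here the first page has 6, because A row 1 has two B rows), but it always covers exactly 5 rows of A until the data runs out.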

INNER JOIN, LEFT/RIGHT OUTER JOIN

Apologies in advance for the long question, but I'm asking for the sake of learning:
I'm new to SQL and am studying JOINs at the moment. I'm getting two different behaviors when using INNER and OUTER JOIN. What I know is that an INNER JOIN gives an intersection-like result, returning only the rows that match in both tables, while a (LEFT/RIGHT) OUTER JOIN returns the matching rows plus the remaining rows of the LEFT or RIGHT table, depending on the keyword used.
While working with the MS Training Kit I tried to solve this practice: "Practice 2: In this practice, you identify rows that appear in one table but have no matches in another. You are given a task to return the IDs of employees from the HR.Employees table who did not handle orders (in the Sales.Orders table) on February 12, 2008. Write three different solutions using the following: joins, subqueries, and set operators. To verify the validity of your solution, you are supposed to return employeeIDs: 1, 2, 3, 5, 7, and 9."
I succeeded with subqueries and set operators, but my JOIN returns something unexpected. I've written the following query:
USE TSQL2012;
SELECT
E.empid
FROM
HR.Employees AS H
JOIN Sales.Orders AS O
ON H.empid = O.empid
AND O.orderdate = '20080212'
JOIN HR.Employees AS E
ON E.empid <> H.empid
ORDER BY
E.empid
;
I'm expecting the result 1, 2, 3, 5, 7, and 9 (6 rows)
But what I'm getting is: 1,1,1,2,2,2,3,3,3,4,4,5,5,5,6,6,7,7,7,8,8,9,9,9 (24 rows)
I tried some videos but could not understand this aspect of INNER/OUTER JOIN. I'd be grateful if someone could explain why this happens and what I should keep in mind while working with JOINs.
You can also use a LEFT OUTER JOIN to find the non-matching rows.
The LEFT JOIN keyword returns all rows from the left table (table1), with the matching rows in the right table (table2). The result is NULL in the right side when there is no match.
SELECT
H.empid
FROM
HR.Employees AS H
LEFT OUTER JOIN Sales.Orders AS O
ON H.empid = O.empid
AND O.orderdate = '20080212'
WHERE O.empid IS NULL
The script above returns the IDs of employees who did not handle orders on the specified date.
The different kinds of join are summarized in the diagram at http://dsin.wordpress.com/2013/03/16/sql-join-cheat-sheet/
Adjust your query to be like this:
USE TSQL2012;
SELECT
H.empid
FROM
HR.Employees AS H
LEFT JOIN Sales.Orders AS O
ON H.empid = O.empid
-- keep the date filter in ON so employees without a match survive the join
AND O.orderdate = '20080212'
WHERE O.empid IS NULL
ORDER BY
H.empid
;
USE TSQL2012;
SELECT
distinct E.empid
FROM
HR.Employees AS H
JOIN Sales.Orders AS O
ON H.empid = O.empid
AND O.orderdate = '20080212'
JOIN HR.Employees AS E
ON E.empid <> H.empid
ORDER BY
E.empid
;
The main things to keep in mind when working with SQL JOINs:
INNER JOINs require a match in order for rows produced prior to the INNER JOIN to remain in the result set. When no match is found for a row, the row is discarded from the result set.
For a row fed to an INNER JOIN that matches exactly one row, exactly one copy of that row is delivered to the result set.
For a row fed to an INNER JOIN that matches multiple rows, the row is delivered multiple times, once for each matching row from the INNER JOIN table.
OUTER JOINs do not discard rows fed to them, whether or not the OUTER JOIN finds a match.
Just like INNER JOINs, if an OUTER JOIN matches more than one row, it increases the number of rows in the result set, repeating the input row once for each matching row from the OUTER JOIN table.
Ask yourself "if I get NO match on the JOIN, do I want the row discarded or not?" If the answer is NO, use an OUTER JOIN. If the answer is YES, use an INNER JOIN.
If you don't need to reference any of the columns from a JOIN table, don't perform a JOIN at all. Instead, use a WHERE EXISTS, WHERE NOT EXISTS, WHERE IN, WHERE NOT IN, etc. or similar, depending on your database engine in use. Don't rely on the database engine to be smart enough to discard unreferenced columns resulting from JOINs from the result set. Some databases may be smart enough to do that, some not. There's no reason to pull columns into a result set only to not reference them. Doing so increases chance of reduced performance.
Your JOIN of:
JOIN HR.Employees AS E
ON E.empid <> H.empid
...is matching every Employees row with a DIFFERENT empid to each row fed to that join. Using NOT EQUAL in an INNER JOIN is very rarely needed, especially if the JOIN predicate tests only ONE condition. That is why you're getting duplicate rows in the result set.
On DB2, we could perform an EXCEPTION JOIN to accomplish that using a JOIN alone. Normally, on DB2, I would use a WHERE NOT EXISTS for that. On SQL Server you could do a JOIN to a query where the query set is all employees without orders in SALES.ORDERS on the specified date, but I don't know if that violates the rules of your tutorial.
Naveen posted the solution it appears your tutorial is looking for!
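To see the difference on concrete data, here is a small sketch using Python's sqlite3 in place of SQL Server. The tables are simplified stand-ins for HR.Employees and Sales.Orders, and the rows are made up (they are not the TSQL2012 sample data): nine employees, of whom 4, 6 and 8 each handled one order on the date in question.

```python
import sqlite3

# Hypothetical stand-ins for HR.Employees and Sales.Orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (empid INTEGER PRIMARY KEY);
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, empid INTEGER, orderdate TEXT);
    INSERT INTO employees VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9);
    INSERT INTO orders VALUES
        (1, 4, '20080212'), (2, 6, '20080212'), (3, 8, '20080212'),
        (4, 1, '20080213');
""")


def empids(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# The question's query: each of the 3 matching (h, o) rows is paired
# with all 8 employees that have a different empid, giving 24 rows.
not_equal_join = empids("""
    SELECT e.empid
    FROM employees AS h
    JOIN orders AS o ON h.empid = o.empid AND o.orderdate = '20080212'
    JOIN employees AS e ON e.empid <> h.empid
    ORDER BY e.empid
""")

# Anti-join with LEFT JOIN: keep only employees with no matching order.
left_join_is_null = empids("""
    SELECT h.empid
    FROM employees AS h
    LEFT JOIN orders AS o ON h.empid = o.empid AND o.orderdate = '20080212'
    WHERE o.empid IS NULL
    ORDER BY h.empid
""")

# The same result without a join, using NOT EXISTS.
not_exists = empids("""
    SELECT e.empid
    FROM employees AS e
    WHERE NOT EXISTS (SELECT 1 FROM orders AS o
                      WHERE o.empid = e.empid AND o.orderdate = '20080212')
    ORDER BY e.empid
""")

print(len(not_equal_join), not_equal_join)
print(left_join_is_null)
print(not_exists)
```

With this data the not-equal join reproduces the 24-row pattern from the question, while both anti-join forms return 1, 2, 3, 5, 7 and 9.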

Can I apply predicates to the same columns against multiple tables in a JOIN only once?

I want to join two tables together and add additional information from two other tables to the same columns in both queried tables. I've come up with the code below, which works, but I don't feel comfortable about having to add another JOIN clause for each table, as it would make the query substantially longer if I wanted to join/add more things.
Is there a way to combine it, so that I can join additional tables only once (just use S and E aliases every time)?
SELECT
J.JobId,
J.StandardJobId,
S.JobName,
J.EngineerId,
E.EngineerName,
JF.JobId AS FollowUpJobId,
JF.StandardJobId AS FollowUpStandardJobId,
SF.JobName AS FollowUpJobName,
JF.EngineerId AS FollowUpEngineerId,
EF.EngineerName AS FollowUpEngineerName
FROM
Jobs J
INNER JOIN
Jobs JF
ON
J.FollowUpJobId = JF.JobId
INNER JOIN
StandardJobs S
ON
J.StandardJobId = S.StandardJobId
INNER JOIN
Engineers E
ON
E.EngineerId = J.EngineerId
INNER JOIN
StandardJobs SF
ON
SF.StandardJobId = JF.StandardJobId
INNER JOIN
Engineers EF
ON
EF.EngineerId = JF.EngineerId
One approach would be to use a Common Table Expression (CTE) - something like:
with cte as
(SELECT J.JobId,
J.StandardJobId,
S.JobName,
J.EngineerId,
E.EngineerName,
J.FollowUpJobId
FROM Jobs J
INNER JOIN StandardJobs S ON J.StandardJobId = S.StandardJobId
INNER JOIN Engineers E ON E.EngineerId = J.EngineerId)
SELECT O.*,
F.StandardJobId AS FollowUpStandardJobId,
F.JobName AS FollowUpJobName,
F.EngineerId AS FollowUpEngineerId,
F.EngineerName AS FollowUpEngineerName
FROM CTE AS O
JOIN CTE AS F ON O.FollowUpJobId = F.JobId
You can sort of do this with either a CTE (Common Table Expressions, the WITH clause) or a View:
;WITH Jobs_Extended As
(
SELECT j.*,
s.JobName,
E.EngineerName
FROM Jobs As j
JOIN StandardJobs As s ON s.StandardJobId = j.StandardJobId
JOIN Engineers As e ON e.EngineerId = j.EngineerId
)
SELECT
J.JobId,
J.StandardJobId,
J.JobName,
J.EngineerId,
J.EngineerName,
JF.JobId AS FollowUpJobId,
JF.StandardJobId AS FollowUpStandardJobId,
JF.JobName AS FollowUpJobName,
JF.EngineerId AS FollowUpEngineerId,
JF.EngineerName AS FollowUpEngineerName
FROM Jobs_Extended J
JOIN Jobs_Extended JF ON J.FollowUpJobId = JF.JobId
In this example the CTE Jobs_Extended becomes a defined alias for the relationship between the Jobs, Engineers and StandardJobs tables. Then once defined, you can use it multiple times in the query without having to redefine those interior relations.
You can do the same thing by changing the WITH to a View, which will make the defined alias permanent in your database.
No, you cannot avoid JOINing related tables each time a separate reference is needed. The issue is that you are not working with the tables in a general sense but instead working with the specific rows of each table, even more specifically, just those rows that match the JOIN and WHERE conditions.
There is no way to specify the references to either StandardJobs or Engineers only once because you are needing to work with two rows from each table at the same time, at least in the given example.
However, depending on which direction you want to go with "additional tables" (more references to Jobs, or more lookups like StandardJobs and Engineers for the given 2 references of Jobs), the CTE construct shown by Mark is probably the easiest / best way to abstract it. I posted this answer mainly to explain the issue at hand.
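For anyone who wants to try the CTE approach on sample data, here is a minimal sketch. It assumes Python's sqlite3 as a stand-in database (SQLite also supports WITH), and the rows are invented for illustration; the table and column names follow the question.

```python
import sqlite3

# Hypothetical rows: job 100 has follow-up job 101, job 101 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StandardJobs (StandardJobId INTEGER PRIMARY KEY, JobName TEXT);
    CREATE TABLE Engineers (EngineerId INTEGER PRIMARY KEY, EngineerName TEXT);
    CREATE TABLE Jobs (JobId INTEGER PRIMARY KEY, StandardJobId INTEGER,
                       EngineerId INTEGER, FollowUpJobId INTEGER);
    INSERT INTO StandardJobs VALUES (1, 'Install'), (2, 'Inspect');
    INSERT INTO Engineers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Jobs VALUES (100, 1, 1, 101), (101, 2, 2, NULL);
""")

# The lookups to StandardJobs and Engineers are written once in the CTE;
# the CTE is then referenced twice, for the job and for its follow-up.
rows = conn.execute("""
    WITH Jobs_Extended AS (
        SELECT j.JobId, j.FollowUpJobId, s.JobName, e.EngineerName
        FROM Jobs AS j
        JOIN StandardJobs AS s ON s.StandardJobId = j.StandardJobId
        JOIN Engineers AS e ON e.EngineerId = j.EngineerId
    )
    SELECT J.JobId, J.JobName, J.EngineerName,
           JF.JobId AS FollowUpJobId,
           JF.JobName AS FollowUpJobName,
           JF.EngineerName AS FollowUpEngineerName
    FROM Jobs_Extended AS J
    JOIN Jobs_Extended AS JF ON J.FollowUpJobId = JF.JobId
""").fetchall()

print(rows)
```

The lookups appear only once in the query text, but the engine still resolves them for both references, which is the point made in the last answer.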

Intersect of Select Statements based on a particular column

I have a question about the INTERSECT clause between two SELECT statements in SQL Server 2008.
Select 1 a,b,c ..... INTERSECT Select 2 a,b,c....
Here, the rows of the two queries must match exactly to be returned as common elements.
But I want only column a of both SELECT statements to match.
If column a has the same value in both queries, the entire row should appear in the result set.
Can I do that, and how?
Thanks,
Marcus..
The best thing to do is to look at the queries themselves. Do they need an INTERSECT, or is it possible to use a join instead?
For example:
An INTERSECT looks like this
select columnA
from tableA
INTERSECT
select columnAreference
from tableB
Your result would contain only the values of columnA that appear in BOTH tables, without any other columns, so a join would be more useful:
select columnA
from tableA a
inner join tableB b
on b.columnAReference = a.columnA
If you look at the execution plan you'll see that the INTERSECT is executed as a left semi join and the inner join, as expected, as an inner join. There is no SEMI JOIN keyword you can write directly, but a semi join can be faster: it returns each row from the left table at most once, no matter how many rows match on the right, whereas a normal join returns one row per match. In this particular case it will be faster.
So an INTERSECT isn't a bad thing that should always be replaced with an INNER JOIN construction; sometimes it will perform even better.
However, to give you the best answer, I will need some more details about your query :)
select * from table1 t1 inner join Table2 t2
on t1.col1=t2.col1
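The difference between the three approaches shows up as soon as the second table has more than one row with the same value in column a. A small sketch, assuming Python's sqlite3 as a stand-in for SQL Server; the tables `table1` and `table2` with columns a, b, c and their rows are hypothetical.

```python
import sqlite3

# Hypothetical data: a = 1 occurs twice in table2, a = 3 matches on all columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b TEXT, c TEXT);
    CREATE TABLE table2 (a INTEGER, b TEXT, c TEXT);
    INSERT INTO table1 VALUES (1, 'x', 'y'), (2, 'p', 'q'), (3, 'm', 'n');
    INSERT INTO table2 VALUES (1, 'different', 'row'), (1, 'another', 'match'),
                              (3, 'm', 'n');
""")

# INTERSECT compares whole rows, so only the fully identical row survives.
full_intersect = conn.execute("""
    SELECT a, b, c FROM table1
    INTERSECT
    SELECT a, b, c FROM table2
""").fetchall()

# EXISTS matches on column a only and returns each table1 row at most once.
semi_join = conn.execute("""
    SELECT t1.a, t1.b, t1.c
    FROM table1 AS t1
    WHERE EXISTS (SELECT 1 FROM table2 AS t2 WHERE t2.a = t1.a)
    ORDER BY t1.a
""").fetchall()

# INNER JOIN also matches on column a only, but repeats a table1 row
# once per matching table2 row.
inner_join = conn.execute("""
    SELECT t1.a, t1.b, t1.c
    FROM table1 AS t1
    INNER JOIN table2 AS t2 ON t2.a = t1.a
    ORDER BY t1.a
""").fetchall()

print(full_intersect)
print(semi_join)
print(inner_join)
```

If the entire row from the first query should appear exactly once whenever column a matches, the EXISTS form is the closest fit; the INNER JOIN needs DISTINCT or unique values of a in the second table to behave the same way.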

JOIN that doesn't exclude all records if one side is null

I have a fairly conventional set of order entry tables divided by:
Orders
OrdersRows
OrdersRowsOptions
The record in OrdersRowsOptions is not created unless needed. When I create a set of joins like
select * from orders o
inner join OrdersRows r on r.idOrder = o.idOrder
inner join ordersrowsoptions ro on ro.idOrderRow = r.idOrderRow
where r.idProduct = [foo]
My full result set is empty if no ordersrowsoptions records exist for the given product.
What's the correct syntax to return records even if no records exist on one side of a join?
thx
select * from orders o
inner join OrdersRows r on r.idOrder = o.idOrder
left join ordersrowsoptions ro on ro.idOrderRow = r.idOrderRow
where r.idProduct = [foo]
Of course you should not use select * in any query but especially never when doing a join. The repeated fields are just wasting server and network resources.
Since you seem unfamiliar with left joins, you probably also need to understand the concepts in this:
http://wiki.lessthandot.com/index.php/WHERE_conditions_on_a_LEFT_JOIN
LEFT JOIN / RIGHT JOIN.
Edit: yes, the following answer, given earlier, is correct:
select * from orders o
inner join OrdersRows r on r.idOrder = o.idOrder
left join ordersrowsoptions ro on ro.idOrderRow = r.idOrderRow
where r.idProduct = [foo]
LEFT JOIN (or RIGHT JOIN) is probably what you are looking for, depending on which side of the join may have no rows.
Interesting, do you want to get all orders that have that product in them? The other post is correct that you have to use LEFT or RIGHT OUTER JOINS. But if you want to get entire orders that have that product then you'd need a more complex where clause.
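The effect of switching the last join from INNER to LEFT can be checked on a few sample rows. This is only a sketch: it assumes Python's sqlite3 as a stand-in database, and the column `optionName`, the product ids and the rows are hypothetical.

```python
import sqlite3

# Hypothetical data: two order rows for product 500, only one has an option.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (idOrder INTEGER PRIMARY KEY);
    CREATE TABLE OrdersRows (idOrderRow INTEGER PRIMARY KEY,
                             idOrder INTEGER, idProduct INTEGER);
    CREATE TABLE OrdersRowsOptions (idOrderRow INTEGER, optionName TEXT);
    INSERT INTO Orders VALUES (1), (2);
    INSERT INTO OrdersRows VALUES (10, 1, 500), (11, 2, 500), (12, 2, 600);
    INSERT INTO OrdersRowsOptions VALUES (11, 'gift wrap');
""")

# Same query twice; only the join type of the options table differs.
QUERY = """
    SELECT o.idOrder, r.idOrderRow, ro.optionName
    FROM Orders AS o
    INNER JOIN OrdersRows AS r ON r.idOrder = o.idOrder
    {join} JOIN OrdersRowsOptions AS ro ON ro.idOrderRow = r.idOrderRow
    WHERE r.idProduct = ?
    ORDER BY r.idOrderRow
"""

inner_rows = conn.execute(QUERY.format(join="INNER"), (500,)).fetchall()
left_rows = conn.execute(QUERY.format(join="LEFT"), (500,)).fetchall()

print(inner_rows)  # order rows without options are dropped
print(left_rows)   # order rows without options are kept, options are NULL
```

The WHERE clause here filters on a column of OrdersRows, which is safe. A WHERE condition on a column of OrdersRowsOptions would discard the NULL-extended rows again, which is the pitfall the linked article describes.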