I'm using PostgreSQL 9.5. I'd like to have the column-merging functionality of USING in a query where not all of the columns that I'm using for the join are named the same. For example:
SELECT
*
FROM table_a a
INNER JOIN table_b b USING(shared_id) AND a.foo = b.bar
The above code doesn't work. Is there something I can write to get this effect? Or do I need to do ON a.shared_id = b.shared_id AND a.foo = b.bar?
You can't use both in the same join.
http://www.postgresql.org/docs/9.5/static/queries-table-expressions.html
The join condition is specified in the ON or USING clause, or implicitly by the word NATURAL. The join condition determines which rows from the two source tables are considered to "match", as explained in detail below.
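Focus on the "or": a join takes ON or USING, not both. So to keep everything in the join condition, you need to write ON a.shared_id = b.shared_id AND a.foo = b.bar.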
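For an INNER JOIN specifically, one way to keep the column merging of USING is to leave the identically named column in USING and move the differently named comparison into WHERE. The sketch below is only an illustration: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL (an assumption made so it runs without a server), and the sample rows are invented.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (shared_id INTEGER, foo TEXT);
    CREATE TABLE table_b (shared_id INTEGER, bar TEXT);
    INSERT INTO table_a VALUES (1, 'x'), (2, 'y');
    INSERT INTO table_b VALUES (1, 'x'), (2, 'z');
""")

# USING merges shared_id into one output column; the differently named
# pair (foo / bar) is compared in WHERE instead of in the join clause.
cur = conn.execute("""
    SELECT *
    FROM table_a a
    INNER JOIN table_b b USING (shared_id)
    WHERE a.foo = b.bar
""")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns, rows)
```

This equivalence holds only for inner joins. With a LEFT or FULL join, a condition in WHERE is applied after the join and can discard the null-extended rows, so there the ON form is the safer choice.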
Related
I want to do a join with another table. I followed the tutorial on the site and my code compiles, but it's not performing the join and instead just selects from the first table.
SELECT
"table1"."col1",
"table1"."col2",
"table1"."col3"
FROM
"table1"
JOIN "table2" ON "table1"."col1" = "table2"."col1"
LIMIT
1
It is only returning the data from table1 and not concatenating the columns where the condition for table1 and table2 is met.
I execute the query using the following code:
Entity::find()
.from_raw_sql(Statement::from_string(DatabaseBackend::Postgres, query.to_owned()))
.all(&self.connection)
.await?
That returns a Vec<Model>. Is this the correct way? Also, how can I build a SQL statement using an Entity as the base which looks like SELECT * from "table1"?
After 'SELECT' (and before 'FROM') you specify which columns to include in the output, and your code selects only three columns from table1. Add the columns you want from table2 there, and you may get the results you want.
Is it possible in Postgres to have an optional join?
My use case is something like
select ...
from a
inner join b using (b_id)
where b.type in (...)
a is a very large reporting table. b is used to filter a, BUT the most common use case is that we will want all b.types, and therefore all the b records in the join. In other words, in most cases we don't want to filter by b at all, and would not need the join in that case, but the filtering optionality still needs to be there in cases when the user wants to filter by type.
So is it possible to invoke the join optionally, and save the join effort in cases when we just want all of a?
If not, what's my next best option? IF ... THEN or CTE with a union of separate queries?
If you don't need any of b's columns, there is no need to JOIN table b. You can filter by using EXISTS(SELECT .. FROM b WHERE ...).
If you want to conditionally exclude a part of the WHERE clause, you could use the following construct: (the ignore_b boolean will function as an on/off switch)
-- $ignore_b is a Boolean flag
-- when True, the optimiser will ignore the exists(...)
SELECT ...
FROM a
WHERE ( $ignore_b OR EXISTS (
SELECT *
FROM b
WHERE b.b_id = a.some_id
AND b.type in (1,2,3,4,5)
)
);
Note that in your example the inner join still filters on b even when every type is wanted: a row of a is kept only if a row with that b_id exists in b in the first place.
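The on/off switch above can be tried out with a bound parameter. This is a minimal sketch, assuming SQLite (via Python's sqlite3) as a stand-in for PostgreSQL; the column names a_id and a.b_id, the sample rows, and the helper fetch_ids are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; schema and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (b_id INTEGER PRIMARY KEY, type INTEGER);
    CREATE TABLE a (a_id INTEGER PRIMARY KEY, b_id INTEGER);
    INSERT INTO b VALUES (10, 1), (20, 2);
    INSERT INTO a VALUES (1, 10), (2, 20), (3, 10);
""")

# :ignore_b plays the role of $ignore_b: when it is true the EXISTS
# test cannot change the result, so every row of a is returned.
QUERY = """
    SELECT a.a_id
    FROM a
    WHERE (:ignore_b OR EXISTS (
        SELECT 1
        FROM b
        WHERE b.b_id = a.b_id
          AND b.type IN (1)
    ))
    ORDER BY a.a_id
"""

def fetch_ids(ignore_b):
    """Hypothetical helper: run QUERY with the switch on or off."""
    params = {"ignore_b": int(ignore_b)}
    return [row[0] for row in conn.execute(QUERY, params)]

all_rows = fetch_ids(True)    # filter switched off
filtered = fetch_ids(False)   # only rows whose b.type is 1
print(all_rows, filtered)
```

Whether the planner actually skips the subquery when the flag is true depends on the database and on how the parameter is bound, so it is worth checking the plan with EXPLAIN on the real data.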
PostgreSQL will remove unneeded joins under very specific circumstances. The join must be written as a LEFT JOIN, so that no rows of A can be removed due to the absence of corresponding rows in B. The column B.b_id must be a declared unique or primary key, so that no rows of A can be duplicated due to duplicate matches in B. And of course, no column of B can be referenced in the query (except the reference to the key column in the left join condition).
In those cases, you can just always write the LEFT JOIN, and PostgreSQL will figure out that it can skip it.
You can argue that if you have a declared foreign key constraint on the join condition, then you shouldn't need the JOIN to be a LEFT JOIN in order to implement this optimization. I think that that argument is correct, but PostgreSQL does not implement it that way.
I would just do it programmatically. If you are already programmatically adding references to B in the WHERE clause, you should be able to do it for the join as well.
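As a rough illustration of the programmatic approach, the query text can be assembled so that the join on b is added only when a type filter is requested. Everything here is an assumption made for the sketch: the helper build_query, the column a_id, the sample rows, and the use of SQLite in place of PostgreSQL.

```python
import sqlite3

def build_query(types=None):
    """Hypothetical helper: return (sql, params), joining b only if needed."""
    sql = "SELECT a.a_id FROM a"
    params = []
    if types:
        # One placeholder per requested type; the values are bound, not inlined.
        placeholders = ", ".join("?" for _ in types)
        sql += f" INNER JOIN b USING (b_id) WHERE b.type IN ({placeholders})"
        params = list(types)
    return sql + " ORDER BY a.a_id", params

# Assumption: SQLite stands in for PostgreSQL; schema and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (b_id INTEGER PRIMARY KEY, type INTEGER);
    CREATE TABLE a (a_id INTEGER PRIMARY KEY, b_id INTEGER);
    INSERT INTO b VALUES (10, 1), (20, 2);
    INSERT INTO a VALUES (1, 10), (2, 20), (3, 10);
""")

plain_sql, plain_params = build_query()
typed_sql, typed_params = build_query([2])

unfiltered = [r[0] for r in conn.execute(plain_sql, plain_params)]
only_type_2 = [r[0] for r in conn.execute(typed_sql, typed_params)]
print(unfiltered, only_type_2)
```

The unfiltered variant never mentions b, so no join work is done in the common case; it also returns rows of a that have no match in b, which the original inner join would have dropped.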
Perhaps I'm approaching this all wrong, in which case feel free to point out a better way to solve the overall question, which is "How do I use an intermediate table for future queries?"
Let's say I've got tables foo and bar, which join on some baz_id, and I want to combine them into an intermediate table to be fed into upcoming queries. I know of the WITH .. AS (...) statement, but am running into problems as such:
WITH foobar AS (
SELECT *
FROM foo
INNER JOIN bar ON bar.baz_id = foo.baz_id
)
SELECT
baz_id
-- some other things as well
FROM
foobar
The issue is that (Postgres 9.4) tells me baz_id is ambiguous. I understand this happens because SELECT * includes all the columns in both tables, so baz_id shows up twice; but I'm not sure how to get around it. I was hoping to avoid copying the column names out individually, like
SELECT
foo.var1, foo.var2, foo.var3, ...
bar.other1, bar.other2, bar.other3, ...
FROM foo INNER JOIN bar ...
because there are hundreds of columns in these tables.
Is there some way around this I'm missing, or some altogether different way to approach the question at hand?
WITH foobar AS (
SELECT *
FROM foo
INNER JOIN bar USING(baz_id)
)
SELECT
baz_id
-- some other things as well
FROM
foobar
USING leaves only one instance of the baz_id column in the select list.
From the documentation:
The USING clause is a shorthand that allows you to take advantage of the specific situation where both sides of the join use the same name for the joining column(s). It takes a comma-separated list of the shared column names and forms a join condition that includes an equality comparison for each one. For example, joining T1 and T2 with USING (a, b) produces the join condition ON T1.a = T2.a AND T1.b = T2.b.
Furthermore, the output of JOIN USING suppresses redundant columns: there is no need to print both of the matched columns, since they must have equal values. While JOIN ON produces all columns from T1 followed by all columns from T2, JOIN USING produces one output column for each of the listed column pairs (in the listed order), followed by any remaining columns from T1, followed by any remaining columns from T2.
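The difference described in the quoted documentation can be observed by comparing the output columns of the two join forms. The sketch below assumes SQLite (through Python's sqlite3) as a stand-in for PostgreSQL, with invented tables T1 and T2. SQLite also merges the USING columns, but the exact column order may differ from PostgreSQL, so only the column counts are compared.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; T1/T2 and their rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (a INTEGER, b INTEGER, x TEXT);
    CREATE TABLE T2 (a INTEGER, b INTEGER, y TEXT);
    INSERT INTO T1 VALUES (1, 1, 'left-1'), (1, 2, 'left-2');
    INSERT INTO T2 VALUES (1, 1, 'right-1'), (2, 2, 'right-2');
""")

def run(sql):
    """Return (column names, rows) for a query."""
    cur = conn.execute(sql)
    return [d[0] for d in cur.description], cur.fetchall()

# USING (a, b): one output column per listed pair.
using_cols, using_rows = run("SELECT * FROM T1 JOIN T2 USING (a, b)")

# Equivalent ON condition: both copies of a and b appear in the output.
on_cols, on_rows = run(
    "SELECT * FROM T1 JOIN T2 ON T1.a = T2.a AND T1.b = T2.b"
)

print(using_cols)
print(on_cols)
```

Both queries match the same rows; only the ON form carries the duplicated a and b columns that make a later unqualified reference ambiguous.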
I have this query here which returns an error because of too many rows returned:
UPDATE tmp_rsl2 SET comm_percent=( SELECT c2.comm_percent
FROM tmp_rsl2 t1
INNER JOIN gn_salesperson g1 ON t1.sales_person=g1.sales_person
INNER JOIN comm_schema c1 ON g1.comm_schema=c1.comm_schema
INNER JOIN comm_schema_dt c2 ON c1.comm_schema_id=c2.comm_schema_id AND (t1.balance_amount::numeric <= (COALESCE(c2.value_amount,0))) );
Basically for each row of the comm_percent column, I want to update all of them using the subquery SELECT statement. I imagine using a FOR loop or something but I'd like to hear ideas or to know a proper way to do this.
The TOO_MANY_ROWS error means a value is being assigned to something that can hold only one value, while the SELECT query returns more than one row.
Without a reference schema it's difficult to give SQL that would work (which is not to say that the issue lies with the schema), but you need to ensure that the SELECT that supplies comm_percent returns only one row. A very blind attempt at how it might work in your case is given below; again, without knowing the schema it's difficult to gauge whether it would work.
UPDATE tmp_rsl2
SET comm_percent = c2.comm_percent
FROM gn_salesperson g1
INNER JOIN comm_schema c1 ON g1.comm_schema = c1.comm_schema
INNER JOIN comm_schema_dt c2 ON c1.comm_schema_id = c2.comm_schema_id
AND (tmp_rsl2.balance_amount::NUMERIC <= (COALESCE(c2.value_amount, 0)))
WHERE tmp_rsl2.sales_person = g1.sales_person
Update: As per the comments below, I have added an unrelated SQLFiddle example that should give an idea of how to UPDATE all rows of a table by looking up the corresponding values in another table.
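Another way to guarantee a single value per updated row is a correlated subquery that is tied to the row being updated and limited to one result. The sketch below is a guess at the intent, not a confirmed fix: the schema is inferred from the query in the question, the sample data is invented, picking the lowest qualifying value_amount is an assumed business rule, and SQLite (via Python's sqlite3) stands in for PostgreSQL.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; schema inferred, data invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gn_salesperson (sales_person TEXT, comm_schema TEXT);
    CREATE TABLE comm_schema (comm_schema TEXT, comm_schema_id INTEGER);
    CREATE TABLE comm_schema_dt (comm_schema_id INTEGER,
                                 value_amount INTEGER,
                                 comm_percent INTEGER);
    CREATE TABLE tmp_rsl2 (sales_person TEXT,
                           balance_amount INTEGER,
                           comm_percent INTEGER);
    INSERT INTO gn_salesperson VALUES ('sp1', 'S');
    INSERT INTO comm_schema VALUES ('S', 1);
    INSERT INTO comm_schema_dt VALUES (1, 100, 5), (1, 1000, 10);
    INSERT INTO tmp_rsl2 VALUES ('sp1', 50, NULL),
                                ('sp1', 500, NULL),
                                ('sp1', 5000, NULL);
""")

# The subquery references the row being updated (tmp_rsl2.*), so it is
# evaluated once per row; ORDER BY + LIMIT 1 keeps it to a single value.
# Assumed rule: take the lowest tier whose value_amount covers the balance.
conn.execute("""
    UPDATE tmp_rsl2
    SET comm_percent = (
        SELECT c2.comm_percent
        FROM gn_salesperson g1
        JOIN comm_schema c1 ON g1.comm_schema = c1.comm_schema
        JOIN comm_schema_dt c2 ON c1.comm_schema_id = c2.comm_schema_id
        WHERE g1.sales_person = tmp_rsl2.sales_person
          AND tmp_rsl2.balance_amount <= COALESCE(c2.value_amount, 0)
        ORDER BY c2.value_amount
        LIMIT 1
    )
""")

result = conn.execute(
    "SELECT balance_amount, comm_percent FROM tmp_rsl2 ORDER BY balance_amount"
).fetchall()
print(result)
```

Rows with no qualifying tier end up with NULL in comm_percent, so a real query may want a WHERE EXISTS guard or a default value.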
I want to join two tables together and add additional information from two other tables to the same columns in both queried tables. I've come up with the below code, which works, but I don't feel comfortable about having to add another JOIN clause for each table, as it would make the query substantially long if I wanted to join/add more things.
Is there a way to combine it, so that I can join additional tables only once (just use S and E aliases every time)?
SELECT
J.JobId,
J.StandardJobId,
S.JobName,
J.EngineerId,
E.EngineerName,
JF.JobId AS FollowUpJobId,
JF.StandardJobId AS FollowUpStandardJobId,
SF.JobName AS FollowUpJobName,
JF.EngineerId AS FollowUpEngineerId,
EF.EngineerName AS FollowUpEngineerName
FROM
Jobs J
INNER JOIN
Jobs JF
ON
J.FollowUpJobId = JF.JobId
INNER JOIN
StandardJobs S
ON
J.StandardJobId = S.StandardJobId
INNER JOIN
Engineers E
ON
E.EngineerId = J.EngineerId
INNER JOIN
StandardJobs SF
ON
SF.StandardJobId = JF.StandardJobId
INNER JOIN
Engineers EF
ON
EF.EngineerId = JF.EngineerId
One approach would be to use a Common Table Expression (CTE) - something like:
with cte as
(SELECT J.JobId,
J.StandardJobId,
S.JobName,
J.EngineerId,
E.EngineerName,
J.FollowUpJobId
FROM Jobs J
INNER JOIN StandardJobs S ON J.StandardJobId = S.StandardJobId
INNER JOIN Engineers E ON E.EngineerId = J.EngineerId)
SELECT O.*,
F.StandardJobId AS FollowUpStandardJobId,
F.JobName AS FollowUpJobName,
F.EngineerId AS FollowUpEngineerId,
F.EngineerName AS FollowUpEngineerName
FROM CTE AS O
JOIN CTE AS F ON O.FollowUpJobId = F.JobId
You can sort of do this with either a CTE (Common Table Expressions, the WITH clause) or a View:
;WITH Jobs_Extended As
(
SELECT j.*,
s.JobName,
E.EngineerName
FROM Jobs As j
JOIN StandardJobs As s ON s.StandardJobId = j.StandardJobId
JOIN Engineers As e ON e.EngineerId = j.EngineerId
)
SELECT
J.JobId,
J.StandardJobId,
J.JobName,
J.EngineerId,
J.EngineerName,
JF.JobId AS FollowUpJobId,
JF.StandardJobId AS FollowUpStandardJobId,
JF.JobName AS FollowUpJobName,
JF.EngineerId AS FollowUpEngineerId,
JF.EngineerName AS FollowUpEngineerName
FROM Jobs_Extended J
JOIN Jobs_Extended JF ON J.FollowUpJobId = JF.JobId
In this example the CTE Jobs_Extended becomes a defined alias for the relationship between the Jobs, Engineers and StandardJobs tables. Then once defined, you can use it multiple times in the query without having to redefine those interior relations.
You can do the same thing by changing the WITH to a View, which will make the defined alias permanent in your database.
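A sketch of the view variant follows, under stated assumptions: the table layouts are guessed from the query in the question, the sample rows are invented, and SQLite (via Python's sqlite3) stands in for the actual database.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StandardJobs (StandardJobId INTEGER PRIMARY KEY, JobName TEXT);
    CREATE TABLE Engineers (EngineerId INTEGER PRIMARY KEY, EngineerName TEXT);
    CREATE TABLE Jobs (JobId INTEGER PRIMARY KEY,
                       StandardJobId INTEGER,
                       EngineerId INTEGER,
                       FollowUpJobId INTEGER);
    INSERT INTO StandardJobs VALUES (1, 'Install'), (2, 'Inspect');
    INSERT INTO Engineers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Jobs VALUES (100, 1, 1, 101), (101, 2, 2, NULL);

    -- The lookups are written once, inside the view definition.
    CREATE VIEW Jobs_Extended AS
    SELECT j.*, s.JobName, e.EngineerName
    FROM Jobs j
    JOIN StandardJobs s ON s.StandardJobId = j.StandardJobId
    JOIN Engineers e ON e.EngineerId = j.EngineerId;
""")

# The view is then referenced twice: once for the job, once for its follow-up.
rows = conn.execute("""
    SELECT J.JobId, J.JobName, J.EngineerName,
           JF.JobId, JF.JobName, JF.EngineerName
    FROM Jobs_Extended J
    JOIN Jobs_Extended JF ON J.FollowUpJobId = JF.JobId
""").fetchall()
print(rows)
```

The database still performs the lookups for each reference to the view; what is saved is the repetition in the query text.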
No, you cannot avoid JOINing related tables each time a separate reference is needed. The issue is that you are not working with the tables in a general sense but instead working with the specific rows of each table, even more specifically, just those rows that match the JOIN and WHERE conditions.
There is no way to specify the references to either StandardJobs or Engineers only once because you are needing to work with two rows from each table at the same time, at least in the given example.
However, depending on which direction you want to go with "additional tables" (more references to Jobs, or more lookups like StandardJobs and Engineers for the given two references to Jobs), the CTE construct shown by Mark is probably the easiest / best way to abstract it. I posted this answer mainly to explain the issue at hand.