PostgreSQL JOIN data from 3 tables - postgresql

I'm new to PostgreSQL and trying to get a query written. I'm pretty sure it's easy for someone who knows what they are doing - I just don't! :)
Basically I have three tables. In the first, I store details about patients. In the second, I store a reference to each image of them. In the third, I store the link to the file path for the image. I didn't design the database, so I'm not sure why the image files table is separated, but it is.
What I want to be able to do is select data from the first table, joining in data from a second then third table so I end up with the name & file path in the result.
So the basic structure is:
Table 1:
person_id | name
Table 2:
person_id | image_id
Table 3:
image_id | `path filename`
What I want to do is in one query, grab the person's 'name' and the image 'path filename'.
I'm happy with a "template" style answer with the join I need. I don't need it to be written in actual code. (i.e. I'm thinking you can just write me an answer that says SELECT table1.name, table3.pathfilename FROM JOIN ... etc...).

Something like:
select t1.name, t2.image_id, t3.path
from table1 t1
inner join table2 t2 on t1.person_id = t2.person_id
inner join table3 t3 on t2.image_id=t3.image_id

Maybe the following is what you are looking for:
SELECT name, pathfilename
FROM table1
NATURAL JOIN table2
NATURAL JOIN table3
WHERE name = 'John';

Related

How to insert new rows only on tables without Primary or Foreign Keys?

Scenario: I have two tables. Table A and Table B, both have the same exact columns. My task is to create a master table. I need to ensure no duplicates are in the master table unless it is a new record.
problem: Whoever built the tables did not assign a Primary Key to the table.
Attempts: I attempted running an INSERT INTO WHERE NOT EXISTS query (below as an example not the actual query I ran)
Question: the portion of the query below WHERE t2.id = t1.id confuses me, my table has a multitude of columns, there is no id column like I said it has no PRIMARY key to anchor the match, so, in a scenario where all I have are values without primary keys, how can I append only new records? Also, perhaps I am going about this the wrong way but are there any other functions or options through TSQL worth considering? Maybe not an INSERT INTO statement or perhaps something else? My SQL skills aren't yet that advance so I am not asking for a solution but perhaps ideas or other methods worth considering? Any ideas are welcome.
INSERT INTO TABLE_2
(id, name)
SELECT t1.id,
t1.name
FROM TABLE_1 t1
WHERE NOT EXISTS(SELECT id
FROM TABLE_2 t2
WHERE t2.id = t1.id)
If I understand your question correctly, you would need to amend the SQL sample you posted by changing the condition t2.id = t1.id to whatever columns you do have.
Say your 2 tables have name and brand columns and you don't want duplicates, just change the sample to:
WHERE t2.name = t1.name
AND t2.brand = t1.brand
This will ensure you don't insert and rows in table 2 from table 1 which are duplicates. You would have to make sure the where condition contains all columns (you said the table schemas are identical).
Also, the above code sample copies everything into table 2 - but you said you want a master table - so you'd have to change it to insert into the master table, not table 2.

Select multiple column with out code duplication while joining two table #active record # rails 2.3

Let us consider two tables
table1 - name,id,publisher_name,exp_date
table2-book_id,price,discount,last_date
I have to retrieve the name, id,publisher_name from table1 and price, last_date from table2
I wrote a code in active record rails 2
Table1.find(:all,:select=>"table1.name,table1.publisher_name,table1.id,table2.last_date,table2.price",:joins=>"LEFT OUTER JOIN table1s on table1s.id= table2s.book_id")
in this code by selecting multiple column name we need write that table name repeatedly,
need a simple code to avoid this problem
if the selected columns are not present in both tables you don't need to write the tablename as a prefix. You also don't need to name the table2 in front of "book_id". You only need them if the column-names are ambigious.
Table1.find( :all, :select=> "name, publisher_name, id, last_date, price", :joins => "LEFT OUTER JOIN table1s on table1s.id = book_id")

Postgres subquery has access to column in a higher level table. Is this a bug? or a feature I don't understand?

I don't understand why the following doesn't fail. How does the subquery have access to a column from a different table at the higher level?
drop table if exists temp_a;
create temp table temp_a as
(
select 1 as col_a
);
drop table if exists temp_b;
create temp table temp_b as
(
select 2 as col_b
);
select col_a from temp_a where col_a in (select col_a from temp_b);
/*why doesn't this fail?*/
The following fail, as I would expect them to.
select col_a from temp_b;
/*ERROR: column "col_a" does not exist*/
select * from temp_a cross join (select col_a from temp_b) as sq;
/*ERROR: column "col_a" does not exist
*HINT: There is a column named "col_a" in table "temp_a", but it cannot be referenced from this part of the query.*/
I know about the LATERAL keyword (link, link) but I'm not using LATERAL here. Also, this query succeeds even in pre-9.3 versions of Postgres (when the LATERAL keyword was introduced.)
Here's a sqlfiddle: http://sqlfiddle.com/#!10/09f62/5/0
Thank you for any insights.
Although this feature might be confusing, without it, several types of queries would be more difficult, slower, or impossible to write in sql. This feature is called a "correlated subquery" and the correlation can serve a similar function as a join.
For example: Consider this statement
select first_name, last_name from users u
where exists (select * from orders o where o.user_id=u.user_id)
Now this query will get the names of all the users who have ever placed an order. Now, I know, you can get that info using a join to the orders table, but you'd also have to use a "distinct", which would internally require a sort and would likely perform a tad worse than this query. You could also produce a similar query with a group by.
Here's a better example that's pretty practical, and not just for performance reasons. Suppose you want to delete all users who have no orders and no tickets.
delete from users u where
not exists (select * from orders o where o.user_d = u.user_id)
and not exists (select * from tickets t where t.user_id=u.ticket_id)
One very important thing to note is that you should fully qualify or alias your table names when doing this or you might wind up with a typo that completely messes up the query and silently "just works" while returning bad data.
The following is an example of what NOT to do.
select * from users
where exists (select * from product where last_updated_by=user_id)
This looks just fine until you look at the tables and realize that the table "product" has no "last_updated_by" field and the user table does, which returns the wrong data. Add the alias and the query will fail because no "last_updated_by" column exists in product.
I hope this has given you some examples that show you how to use this feature. I use them all the time in update and delete statements (as well as in selects-- but I find an absolute need for them in updates and deletes often)

a dual variable not in statement?

I have the need to look at two tables that share two variables and get a list of the data from one table that does not have matching data in the other table. Example:
Table A
xName
Date
Place
xAmount
Table B
yName
Date
Place
yAmount
I need to be able to write a query that will check Table A and find entries that have no corresponding entry in Table B. If it was a one variable issue I could use not in statement but I can't think of a way to do that with two variables. A left join also does not appear like you could do it. Since looking at it by a specific date or place name would not work since we are talking about thousands of dates and hundreds of place names.
Thanks in advance to anyone who can help out.
SELECT TableA.Date,
TableA.Place,
TableA.xName,
TableA.xAmount,
TableB.yName,
TableB.yAmount
FROM TableA
LEFT OUTER JOIN TableB
ON TableA.Date = TableB.Date
AND TableA.Place = TableB.Place
WHERE TableB.yName IS NULL
OR TableB.yAmount IS NULL
SELECT * FROM A WHERE NOT EXISTS
(SELECT 1 FROM B
WHERE A.xName = B.yName AND A.Date = B.Date AND A.Place = B.Place AND A.xAmount = B.yAmount)
in ORACLE:
select xName , xAmount from tableA
MINUS
select yName , yAmount from tableB

Finding duplicates between two tables

I've got two SQL2008 tables, one is a "Import" table containing new data and the other a "Destination" table with the live data. Both tables are similar but not identical (there's more columns in the Destination table updated by a CRM system), but both tables have three "phone number" fields - Tel1, Tel2 and Tel3. I need to remove all records from the Import table where any of the phone numbers already exist in the destination table.
I've tried knocking together a simple query (just a SELECT to test with just now):
select t2.account_id
from ImportData t2, Destination t1
where
(t2.Tel1!='' AND (t2.Tel1 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
or
(t2.Tel2!='' AND (t2.Tel2 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
or
(t2.Tel3!='' AND (t2.Tel3 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
... but I'm aware this is almost certainly Not The Way To Do Things, especially as it's very slow. Can anyone point me in the right direction?
this query requires a little more that this information. If You want to write it in the efficient way we need to know whether there is more duplicates each load or more new records. I assume that account_id is the primary key and has a clustered index.
I would use the temporary table approach that is create a normalized table #r with an index on phone_no and account_id like
SELECT Phone, Account into #tmp
FROM
(SELECT account_id, tel1, tel2, tel3
FROM destination) p
UNPIVOT
(Phone FOR Account IN
(Tel1, tel2, tel3)
)AS unpvt;
create unclustered index on this table with the first column on the phone number and the second part the account number. You can't escape one full table scan so I assume You can scan the import(probably smaller). then just join with this table and use the not exists qualifier as explained. Then of course drop the table after the processing
luke
I am not sure on the perforamance of this query, but since I made the effort of writing it I will post it anyway...
;with aaa(tel)
as
(
select Tel1
from Destination
union
select Tel2
from Destination
union
select Tel3
from Destination
)
,bbb(tel, id)
as
(
select Tel1, account_id
from ImportData
union
select Tel2, account_id
from ImportData
union
select Tel3, account_id
from ImportData
)
select distinct b.id
from bbb b
where b.tel in
(
select a.tel
from aaa a
intersect
select b2.tel
from bbb b2
)
Exists will short-circuit the query and not do a full traversal of the table like a join. You could refactor the where clause as well, if this still doesn't perform the way you want.
SELECT *
FROM ImportData t2
WHERE NOT EXISTS (
select 1
from Destination t1
where (t2.Tel1!='' AND (t2.Tel1 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
or
(t2.Tel2!='' AND (t2.Tel2 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
or
(t2.Tel3!='' AND (t2.Tel3 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
)