Hive query with booleans evaluating to null - left-join

I want only those cases of a.email where there is no b.email match, so that b.email will be null in this case, but I am not sure how to correctly write the Hive query for this.
select a.email, b.email from dir1 a left outer join on dir1 b where b.email is null
The output I would expect is
abc@aol.com abc@aol.com
bcd@yahoo.com NULL
I only want to keep the cases where there is a NULL on the right hand side

Use WHERE b.email IS NULL, and give the join an ON condition:
select a.email, b.email from dir1 a left outer join dir1 b on a.email = b.email where b.email is null
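To see the pattern run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Hive. The table names dir_a and dir_b and the sample rows are hypothetical (the question does not show the real tables); only the LEFT OUTER JOIN ... WHERE b.email IS NULL shape comes from the answer.

```python
import sqlite3

# Hypothetical tables and rows, used only to illustrate the anti-join pattern.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dir_a (email TEXT);
    CREATE TABLE dir_b (email TEXT);
    INSERT INTO dir_a VALUES ('abc@aol.com'), ('bcd@yahoo.com');
    INSERT INTO dir_b VALUES ('abc@aol.com');
""")


def unmatched_emails(conn):
    """Return the dir_a rows that have no matching email in dir_b."""
    return conn.execute("""
        SELECT a.email, b.email
        FROM dir_a a
        LEFT OUTER JOIN dir_b b ON a.email = b.email
        WHERE b.email IS NULL
        ORDER BY a.email
    """).fetchall()


print(unmatched_emails(conn))  # [('bcd@yahoo.com', None)]
```

The matched row (abc@aol.com) is produced by the join but removed by the WHERE clause, so only the row with NULL on the right-hand side is left.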

Related

What's the difference between these joins?

What's the difference between
SELECT COUNT(*)
FROM TOOL T
LEFT OUTER JOIN PREVENT_USE P ON T.ID = P.TOOL_ID
WHERE
P.ID IS NULL
and
SELECT COUNT(*)
FROM TOOL T
LEFT OUTER JOIN PREVENT_USE P ON T.ID = P.TOOL_ID AND P.ID IS NULL
?
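Assuming P.ID is never NULL (for example, it is the primary key), the bottom query is equivalent to
SELECT COUNT(*)
FROM TOOL T
since the extra condition in the ON clause does not limit the result set: no PREVENT_USE row can satisfy it, so every TOOL row is returned once, with NULL fields for the right part of the join.
The first query is a left anti join: it counts only the TOOL rows that have no matching PREVENT_USE row.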
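A small sketch of the difference, again using Python's sqlite3 module. The column layout and the sample rows are assumptions made for illustration (three tools, of which only tool 1 has PREVENT_USE rows); the two queries are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tool (id INTEGER PRIMARY KEY);
    CREATE TABLE prevent_use (id INTEGER PRIMARY KEY, tool_id INTEGER);
    INSERT INTO tool VALUES (1), (2), (3);
    -- Tool 1 has two PREVENT_USE rows; tools 2 and 3 have none.
    INSERT INTO prevent_use VALUES (10, 1), (11, 1);
""")


def count(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]


# Filter in WHERE: applied after the join, so matched tools are removed.
where_count = count("""
    SELECT COUNT(*)
    FROM tool t
    LEFT OUTER JOIN prevent_use p ON t.id = p.tool_id
    WHERE p.id IS NULL
""")

# Filter in ON: no prevent_use row has a NULL id, so nothing ever matches
# and every tool row comes back exactly once, padded with NULLs.
on_count = count("""
    SELECT COUNT(*)
    FROM tool t
    LEFT OUTER JOIN prevent_use p ON t.id = p.tool_id AND p.id IS NULL
""")

total = count("SELECT COUNT(*) FROM tool")

print(where_count, on_count, total)  # 2 3 3
```

With this data the WHERE version counts the two tools without a PREVENT_USE row, while the ON version returns the same number as counting TOOL alone.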

SQL left JOIN with where clause not returning all results

I have 2 tables that I join using an ID. I want all the data from my main table to show and, if that ID is also in table #2, to show a few more columns in my output. That currently works with
select table1.id, table1.name, table1.phone, table1.address,
table2.loyalcustomer, table2.loyaltynumb, table2.loyaltysince from table1
left join table2
ON table1.id = table2.table1id
What I'm trying to do is the same thing, but add a WHERE clause to table2.loyalcustomer != 'Yes'. When I do that, it doesn't return all the data from my main table (table1), but instead only shows what matches between table1 and table2. Also, table2 does not have all the info, only what was inserted into the table.
select table1.id, table1.name, table1.phone, table1.address,
table2.loyalcustomer, table2.loyaltynumb, table2.loyaltysince from table1
left join table2
ON table1.id = table2.table1id
WHERE table2.loyalcustomer != 'Yes'
I've been reading about different joins, and from what I've read my WHERE clause may be contradicting my join, but I'm not sure how to resolve that.
SQL DB: Postgres
The problem is in your WHERE clause. Be careful with LEFT JOINs!
When you do a LEFT JOIN on a table, that table won't filter the results the way it would with an INNER JOIN. This is because you accept that the left-joined table may return rows that are entirely NULL.
However, you are using a column from your left-joined table in your WHERE clause when you say "table2.loyalcustomer != 'Yes'". This condition works when table2.loyalcustomer is not null, but it DOESN'T work if table2.loyalcustomer is NULL: the comparison then evaluates to NULL rather than true, and WHERE discards the row.
Here is one way to do it, moving the condition into the ON clause so that every table1 row is kept (rows whose table2 match is 'Yes' still appear, with NULLs in the table2 columns):
select table1.id, ...
from table1
left join table2 ON table1.id = table2.table1id and table2.loyalcustomer != 'Yes'
Here is an alternative way to do it, which filters out the 'Yes' rows while keeping the unmatched ones:
select table1.id, ...
from table1
left join table2 ON table1.id = table2.table1id
WHERE COALESCE(table2.loyalcustomer, '') != 'Yes'
To summarize: NULL != 'Yes' is never true (it evaluates to NULL). You need a non-null value for the comparison to evaluate as intended.
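The three variants can be compared side by side with a short sketch. It uses Python's sqlite3 module as a stand-in for Postgres (NULL comparisons and COALESCE behave the same way for this purpose), and the sample rows are hypothetical: customer 1 is loyal, customer 2 is not, and customer 3 has no table2 row at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (table1id INTEGER, loyalcustomer TEXT);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1, 'Yes'), (2, 'No');
""")

# {tail} is appended right after the ON condition, so it can either extend
# the ON clause (starting with AND) or add a WHERE clause.
BASE = """
    SELECT table1.id, table2.loyalcustomer
    FROM table1
    LEFT JOIN table2 ON table1.id = table2.table1id {tail}
    ORDER BY table1.id
"""


def run(tail):
    """Run the base left join with the given ON extension or WHERE clause."""
    return conn.execute(BASE.format(tail=tail)).fetchall()


# The comparison itself yields NULL (None in Python), not true or false.
null_cmp = conn.execute("SELECT NULL != 'Yes'").fetchone()[0]

plain_where = run("WHERE table2.loyalcustomer != 'Yes'")
in_on = run("AND table2.loyalcustomer != 'Yes'")
with_coalesce = run("WHERE COALESCE(table2.loyalcustomer, '') != 'Yes'")

print(null_cmp)       # None
print(plain_where)    # [(2, 'No')]
print(in_on)          # [(1, None), (2, 'No'), (3, None)]
print(with_coalesce)  # [(2, 'No'), (3, None)]
```

Which variant is right depends on whether loyal customers should disappear from the output (the COALESCE form) or stay in it with empty loyalty columns (the ON form).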
Try this one:
SELECT table1.id, table1.name, table1.phone, table1.address,
table2.loyalcustomer, table2.loyaltynumb, table2.loyaltysince FROM table1
LEFT JOIN table2
ON table1.id = table2.table1id
WHERE table2.loyalcustomer IS NULL OR table2.loyalcustomer != 'Yes'

JPQL left outer join does unnecessary joins

I've got the following JPQL :
SELECT a.b.id, a.b.name, a.c.id,a.c.name
left join a.b left join a.c
group by a.b.id,a.b.name,a.c.id,a.c.name
Now b and c both reference the same table.
The generated SQL does the left joins I asked for, plus another join for a.b.name and a.c.name
(which is unnecessary, because the left join already includes the name, and it retrieves more results than expected).
How do I make the generated SQL not include the unnecessary join?
One solution that came up is to not select the names and to retrieve them separately with a different query, but I suppose that is not the most elegant way.
(By the way, I tried selecting a.b, a.c and grouping by a.b, a.c, but it throws an ORA "not a GROUP BY expression" error, because the generated SQL retrieves all rows but the GROUP BY is only by ID.)
The left join is necessary since I want to allow null values.
Thanks a lot.
SELECT a.b.id, a.b.name, a.c.id,a.c.name
The above implicitly creates an inner join between a and b, and another inner join between a and c. The query should be
select b.id, b.name, c.id, c.name
from A a
left join a.b b
left join a.c c
The group by clause doesn't make any sense, since you have no aggregate in your select clause. group by would be useful if you had, for example
select b.id, b.name, c.id, c.name, count(c.foo)
from A a
left join a.b b
left join a.c c
group by b.id, b.name, c.id, c.name

Join table variable vs join view

I have a stored procedure which is running quite slow. Therefore I want to extract some of the query in a separate view.
My code looks something like this:
DECLARE @tmpTable TABLE(..)
INSERT INTO @tmpTable (..) *query* (returns 3000 rows)
Select ... from table1
inner join table2
inner join table3
inner join @tmpTable
...
I then extract (copy-paste) the *query* and put it in a view - i.e. vView.
Doing this will then give me a different result:
Select ... from table1
inner join table2
inner join table3
inner join vView
...
Why? I can see that vView and @tmpTable both return 3000 rows, so they should match (I also did an EXCEPT query to check).
Any comments would be much appreciated, as I feel quite stuck with this.
EDITED:
This is the full query for getting the result (using @tmpTable or vView gives me different results, although they appear the same):
select dep.sid as depsid, dep.[name], COUNT(b.sid) as possiblelogins, count(ls.clientsid) as logins
from department dep
inner join relationship r on dep.sid=r.primarysid and r.relationshiptypeid=27 and r.validto is null
inner join [user] u on r.secondarysid=u.sid
inner join relationship r2 on u.sid=r2.secondarysid and r2.validto is null and r2.relationshiptypeid in (1,37)
inner join client c on r2.primarysid=c.sid
inner join ***@tmpTable or vView*** b on b.sid = c.sid
left outer join (select distinct clientsid from logonstatistics) as ls on b.sid=ls.clientsid
GROUP BY dep.sid, dep.[name],dep.isdepartment
HAVING dep.isdepartment=1
You may not need the view/table at all if you change the query as follows.
It joins on to client c and appears to be there only to JOIN onto logonstatistics.
--remove inner join ***@tmpTable or vView*** b on b.sid = c.sid
--change JOIN
left outer join (select distinct clientsid from logonstatistics) as ls on c.sid=ls.clientsid
And change COUNT(b.sid) to COUNT(c.sid) in the SELECT clause.
Otherwise, if you get different results, I can see two possibilities:
The table and the view have different data. Have you run a line-by-line comparison?
One has NULL where the other has a value (especially for the sid column, which will affect the JOIN).
Finally, when you say "different results", do you mean you get x2 or x3 rows? A different COUNT? What?

Is it possible to JOIN with a var from a LEFT JOIN without destroying rows?

My code generates a large query. A simple version is
SELECT * FROM main_table as mt
JOIN user_data AS ud ON mt.user_id=ud.id
LEFT JOIN ban_Status AS bs ON ud.status_id=bs.id
JOIN AnotherTable ON bs.data=AnotherTable.id
NOTE: This code is untested.
When I remove the last join I get results. I can also change it to a left join, but that would be wrong. If ud.status_id is not null, I would like a join like the one I always do when I select from ban_Status. How do I fix this? Must I write left join on every table if I left join the parent table? Would that not give me side effects?
I am using sqlite ATM but will switch to tsql
Use the LEFT JOIN, but in your WHERE clause require that ud.status_id and AnotherTable.id are either both null or both not null.
SELECT * FROM main_table as mt
JOIN user_data AS ud ON mt.user_id=ud.id
LEFT JOIN ban_Status AS bs ON ud.status_id=bs.id
LEFT JOIN AnotherTable ON bs.data=AnotherTable.id
WHERE (ud.status_id is null and AnotherTable.id is null)
or (ud.status_id is not null and AnotherTable.id is not null)
That will keep you from selecting any records that have a ban_Status but don't have the additional data from the other table.
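Since the question mentions SQLite, here is a minimal sketch of the both-or-neither filter using Python's sqlite3 module. The column types and the three sample users are assumptions made for illustration: user 1 has no status, user 2 has a status with matching AnotherTable data, and user 3 has a status whose AnotherTable row is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_table (user_id INTEGER);
    CREATE TABLE user_data (id INTEGER PRIMARY KEY, status_id INTEGER);
    CREATE TABLE ban_Status (id INTEGER PRIMARY KEY, data INTEGER);
    CREATE TABLE AnotherTable (id INTEGER PRIMARY KEY);
    INSERT INTO main_table VALUES (1), (2), (3);
    INSERT INTO user_data VALUES (1, NULL), (2, 10), (3, 11);
    INSERT INTO ban_Status VALUES (10, 100), (11, 999);
    INSERT INTO AnotherTable VALUES (100);  -- no row with id 999
""")

# Original shape: the trailing inner join drops every user without a status.
inner_rows = conn.execute("""
    SELECT mt.user_id, AnotherTable.id
    FROM main_table AS mt
    JOIN user_data AS ud ON mt.user_id = ud.id
    LEFT JOIN ban_Status AS bs ON ud.status_id = bs.id
    JOIN AnotherTable ON bs.data = AnotherTable.id
    ORDER BY mt.user_id
""").fetchall()

# Left join plus the both-or-neither filter from the answer.
filtered_rows = conn.execute("""
    SELECT mt.user_id, AnotherTable.id
    FROM main_table AS mt
    JOIN user_data AS ud ON mt.user_id = ud.id
    LEFT JOIN ban_Status AS bs ON ud.status_id = bs.id
    LEFT JOIN AnotherTable ON bs.data = AnotherTable.id
    WHERE (ud.status_id IS NULL AND AnotherTable.id IS NULL)
       OR (ud.status_id IS NOT NULL AND AnotherTable.id IS NOT NULL)
    ORDER BY mt.user_id
""").fetchall()

print(inner_rows)     # [(2, 100)]
print(filtered_rows)  # [(1, None), (2, 100)]
```

User 1 survives because there was nothing to join, user 2 survives because the whole chain matched, and user 3 is dropped because the status exists but the additional data does not.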