I am trying to write some SQL code that combines information from two tables and uses CASE, but it is not returning all the data.
SELECT TABLE1.PRODUCT, TABLE1.TYPE, TABLE1.AMOUNT,
(CASE
WHEN TABLE1.PRODUCT = 'RADIO'
THEN 100
ELSE 200
END) AS PRODUCT_CODE,
(CASE
WHEN TABLE1.TYPE = 'NEW'
THEN 'Y'
ELSE TABLE2.AGE
END) AS STATUS
FROM TABLE1 LEFT JOIN TABLE2 ON TABLE1.TID = TABLE2.TID
WHERE TABLE1.DATE > '01-AUG-15'
AND TABLE2.DATE = '02-AUG-15'
The problem I am having is that I need all records from table1 and only those that apply from table2, but the query is returning fewer rows than there are in table1.
Your problem is here:
AND TABLE2.DATE = '02-AUG-15'
If TABLE2.DATE is null because the left join found no match, this condition fails, which is why you are getting fewer rows than there are in TABLE1. Adding a condition on a left-joined table to the WHERE clause effectively turns the join into an inner join.
Try either
AND (TABLE2.DATE = '02-AUG-15' OR TABLE2.DATE IS NULL)
(this assumes that TABLE2.DATE is not nullable), or put the condition in the join statement instead:
FROM TABLE1 LEFT JOIN TABLE2 ON TABLE1.TID = TABLE2.TID AND TABLE2.DATE = '02-AUG-15'
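The two suggested fixes are not quite equivalent, which a small experiment can show. The following is only a sketch: it uses Python's built-in sqlite3 in place of the asker's database, and the lowercase table names, the dt column, and the sample rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (tid INTEGER, product TEXT);
    CREATE TABLE table2 (tid INTEGER, dt TEXT);
    INSERT INTO table1 VALUES (1, 'RADIO'), (2, 'TV'), (3, 'PHONE');
    -- tid 1 has the wanted date, tid 2 has another date, tid 3 has no table2 row
    INSERT INTO table2 VALUES (1, '2015-08-02'), (2, '2015-08-03');
""")

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

# Filter on the left-joined table in WHERE: behaves like an inner join.
where_filter = count("""
    SELECT COUNT(*) FROM table1 t1
    LEFT JOIN table2 t2 ON t1.tid = t2.tid
    WHERE t2.dt = '2015-08-02'
""")

# WHERE ... OR IS NULL: keeps unmatched rows, but still drops tid 2,
# whose table2 row exists with a different date.
where_or_null = count("""
    SELECT COUNT(*) FROM table1 t1
    LEFT JOIN table2 t2 ON t1.tid = t2.tid
    WHERE (t2.dt = '2015-08-02' OR t2.dt IS NULL)
""")

# Filter moved into ON: every table1 row survives.
on_filter = count("""
    SELECT COUNT(*) FROM table1 t1
    LEFT JOIN table2 t2 ON t1.tid = t2.tid AND t2.dt = '2015-08-02'
""")

print(where_filter, where_or_null, on_filter)  # 1 2 3
```

With this data, only the version with the condition in the ON clause returns every table1 row, so that is probably the one to use if "all records from table1" is a hard requirement.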
I'm writing a query that selects multiple non-aggregated columns alongside a GROUP BY clause, but Postgres throws an error saying that I have to add the non-aggregated columns to the GROUP BY or wrap them in an aggregate function. This is the query I'm trying to run:
select
tb1.pipeline as pipeline_id,
tb3.pipeline_name as pipeline_name,
tb2."name" as integration_name,
cast(tb1.integration_id as VARCHAR) as integration_id,
tb1.created_at as created_at,
cast(tb1.id as VARCHAR) as batch_id,
sum(tb1.row_select) as row_select,
sum(tb1.row_insert) as row_insert,
from
table1 tb1
join
table2 tb2 on tb1.integration_id = tb2.id
join
table3 tb3 on tb1.pipeline = tb3.id
where
tb1.pipeline is not null
and tb1.is_super_parent = false
group by
tb1.pipeline
I found one solution/hack for this error: I wrapped all the other non-aggregated columns in max(), which solves my problem.
select
tb1.pipeline as pipeline_id,
max(tb3.pipeline_name) as pipeline_name,
max(tb2."name") as integration_name,
max(cast(tb1.integration_id as VARCHAR)) as integration_id,
max(tb1.created_at) as created_at,
max(cast(tb1.id as VARCHAR)) as batch_id,
sum(tb1.row_select) as row_select,
sum(tb1.row_insert) as row_insert,
from
table1 tb1
join
table2 tb2 on tb1.integration_id = tb2.id
join
table3 tb3 on tb1.pipeline = tb3.id
where
tb1.pipeline is not null
and tb1.is_super_parent = false
group by
tb1.pipeline
But I don't want to add max() where there is no need for it, and applying max() to every other column will make the query expensive. Is there a better approach to solve the above issue? Thanks in advance.
Well, the first thing you need is to learn to format your queries so that you can get an idea of their flow at a glance. Note that the extra comma after row_insert, just before from, means your query will give a syntax error. With that said, how do you solve your issue?
You cannot avoid the additional aggregates or the expanded group by as long as they exist in the scope of the same query. You need to separate the aggregation from the selection of the additional columns. You basically have 2 choices:
Perform the aggregation in a CTE.
with sums (pipeline_id, row_select, row_insert) as
( select tb1.pipeline
, sum(tb1.row_select) as row_select
, sum(tb1.row_insert) as row_insert
from table1 tb1
where tb1.pipeline is not null
and tb1.is_super_parent = false
group by tb1.pipeline
)
select s.pipeline_id
, tb3.pipeline_name
, tb2."name" integration_name
, s.row_select
, s.row_insert
from sums s
-- the question joins table2 on tb1.integration_id; adjust this key to your schema
join table2 tb2 on (s.pipeline_id = tb2.id)
join table3 tb3 on (s.pipeline_id = tb3.id);
Perform the aggregation in a sub-query.
select s.pipeline_id
, tb3.pipeline_name
, tb2."name" integration_name
, s.row_select
, s.row_insert
from ( select tb1.pipeline as pipeline_id
, sum(tb1.row_select) as row_select
, sum(tb1.row_insert) as row_insert
from table1 tb1
where tb1.pipeline is not null
and tb1.is_super_parent = false
group by tb1.pipeline
) s
-- the question joins table2 on tb1.integration_id; adjust this key to your schema
join table2 tb2 on (s.pipeline_id = tb2.id)
join table3 tb3 on (s.pipeline_id = tb3.id);
NOTE: Not tested as no sample data supplied.
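For what it's worth, the aggregate-first pattern can be checked on toy data. This is a minimal sketch that uses Python's built-in sqlite3 rather than Postgres; the sample rows are invented, and only the join to table3 is shown, since which key joins table2 depends on the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, pipeline_name TEXT);
    CREATE TABLE table1 (
        id INTEGER PRIMARY KEY,
        pipeline INTEGER,
        row_select INTEGER,
        row_insert INTEGER
    );
    INSERT INTO table3 VALUES (1, 'daily'), (2, 'hourly');
    INSERT INTO table1 (pipeline, row_select, row_insert) VALUES
        (1, 10, 5), (1, 20, 7), (2, 3, 1), (NULL, 99, 99);
""")

# Step 1 (the CTE): aggregate table1 on its own, grouped by pipeline only.
# Step 2 (outer query): join the one-row-per-pipeline result to table3,
# so pipeline_name needs neither max() nor a place in GROUP BY.
rows = conn.execute("""
    WITH sums AS (
        SELECT pipeline AS pipeline_id,
               SUM(row_select) AS row_select,
               SUM(row_insert) AS row_insert
        FROM table1
        WHERE pipeline IS NOT NULL
        GROUP BY pipeline
    )
    SELECT s.pipeline_id, t3.pipeline_name, s.row_select, s.row_insert
    FROM sums s
    JOIN table3 t3 ON t3.id = s.pipeline_id
    ORDER BY s.pipeline_id
""").fetchall()

print(rows)  # [(1, 'daily', 30, 12), (2, 'hourly', 3, 1)]
```

Columns such as created_at or the batch id have many values per pipeline, so they cannot be carried along this way; for those, some aggregate (or a wider GROUP BY) is still needed.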
I have 2 tables that I join using an ID. I want all the rows from my main table to show and, where the ID also exists in table #2, a few more columns in my output. That currently works with
select table1.id, table1.name, table1.phone, table1.address,
table2.loyalcustomer, table2.loyaltynumb, table2.loyaltysince from table1
left join table2
ON table1.id = table2.table1id
What I'm trying to do is the same thing, but add a WHERE clause to table2.loyalcustomer != 'Yes'. When I do that, it doesn't return all the data from my main table (table1), but instead only shows what matches between table1 and table2. Also, table2 does not have all the info, only what was inserted into the table.
select table1.id, table1.name, table1.phone, table1.address,
table2.loyalcustomer, table2.loyaltynumb, table2.loyaltysince from table1
left join table2
ON table1.id = table2.table1id
WHERE table2.loyalcustomer != 'Yes'
I've been reading about different joins, and what I've read suggests that my WHERE clause may be contradicting my join, but I'm not sure how to resolve that.
SQL DB: Postgres
The problem is in your WHERE clause. Be careful with LEFT JOINs!
When you LEFT JOIN a table, that table does not filter the results the way it would with an INNER JOIN. This is because the left-joined table is allowed to contribute rows that are entirely NULL.
However, you are using a column from your left-joined table in your WHERE clause when you say "table2.loyalcustomer != 'Yes'". This condition works when table2.loyalcustomer is not null, but it DOESN'T work if table2.loyalcustomer is NULL.
So here is the right way to do it:
select table1.id, ...
from table1
left join table2 ON table1.id = table2.table1id and table2.loyalcustomer != 'Yes'
Here is an alternative way to do it:
select table1.id, ...
from table1
left join table2 ON table1.id = table2.table1id
WHERE COALESCE(table2.loyalcustomer, '') != 'Yes'
To sum up: NULL != 'Yes' doesn't work, because it evaluates to NULL rather than true, so the row is filtered out. You need a non-null value to evaluate your expression against.
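The NULL behaviour is easy to confirm directly. The snippet below is a sketch using Python's built-in sqlite3 instead of Postgres (three-valued comparison logic and COALESCE behave the same way here); the names and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (table1id INTEGER, loyalcustomer TEXT);
    INSERT INTO table1 VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- id 3 has no table2 row, so its loyalcustomer is NULL after the join
    INSERT INTO table2 VALUES (1, 'Yes'), (2, 'No');
""")

# A comparison with NULL yields NULL (Python None), not true or false.
null_cmp = conn.execute("SELECT NULL != 'Yes'").fetchone()[0]
# COALESCE swaps the NULL for '' first, so the comparison is true (1 in SQLite).
coalesced_cmp = conn.execute("SELECT COALESCE(NULL, '') != 'Yes'").fetchone()[0]

def ids(where):
    """Return the table1 ids that survive the given WHERE condition."""
    sql = f"""
        SELECT table1.id
        FROM table1
        LEFT JOIN table2 ON table1.id = table2.table1id
        WHERE {where}
        ORDER BY table1.id
    """
    return [row[0] for row in conn.execute(sql)]

plain_ids = ids("table2.loyalcustomer != 'Yes'")
coalesce_ids = ids("COALESCE(table2.loyalcustomer, '') != 'Yes'")

print(null_cmp, coalesced_cmp)   # None 1
print(plain_ids, coalesce_ids)   # [2] [2, 3]
```

Note that the COALESCE version still leaves out customer 1, who is a loyal customer, whereas putting the condition in the ON clause keeps that row with NULLs in the table2 columns. Which of the two is wanted depends on the report.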
Try this one:
SELECT table1.id, table1.name, table1.phone, table1.address,
table2.loyalcustomer, table2.loyaltynumb, table2.loyaltysince FROM table1
LEFT JOIN table2
ON table1.id = table2.table1id
HAVING table2.loyalcustomer != 'Yes'
I'm full outer joining two tables. Table 1 (LEAD) has 689,189 rows and table 2 (CONTACT) has 133,318 rows, and a full outer join on them returns 738,959 rows. So far this makes sense.
Each table has a field that indicates whether the record is logically deleted, with a Y value meaning it is. I only want to return rows in both tables that have a value not equal to Y. When I add the additional criteria to the query
select COUNT(*)
from LEAD l
full join CONTACT c on l.CONVERTEDCONTACTID = c.ID and l.DELETE_FLAG <> 'Y' and c.DELETE_FLAG <> 'Y'
I get more rows returned than without them. Shouldn't they make the query more restrictive? I know I can perform the join using subqueries but I'm just not understanding how I'm arriving at this result.
Short answer: put the new criteria in the where clause instead.
select COUNT(*)
from LEAD l
full join CONTACT c on l.CONVERTEDCONTACTID = c.ID
where l.DELETE_FLAG <> 'Y' and c.DELETE_FLAG <> 'Y'
The extra join conditions are adding more records with nulls from the table on the other side of the join, leading to extra results.
I'm not 100% sure how a full join with a condition that only affects one side is handled; l.DELETE_FLAG <> 'Y' and c.DELETE_FLAG <> 'Y' each only look at one of the two tables in the join, and I'm reasonably certain that is the cause.
Just put the additional criteria in the WHERE clause, like the following:
select COUNT(*)
from LEAD l
full join CONTACT c on l.CONVERTEDCONTACTID = c.ID
where l.DELETE_FLAG <> 'Y' and c.DELETE_FLAG <> 'Y'
If your tables contain NULL values, then tune it like the following:
select COUNT(*)
from LEAD l
full join CONTACT c on l.CONVERTEDCONTACTID = c.ID
where (l.DELETE_FLAG <> 'Y' or l.DELETE_FLAG is NULL) and
(c.DELETE_FLAG <> 'Y' or c.DELETE_FLAG is NULL)
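To see why conditions in the ON clause of a full join can add rows, here is a small sketch in Python with the built-in sqlite3. Two assumptions to flag: the data is made up, and the full join is emulated as a left join plus the unmatched right-hand rows, because older SQLite versions have no FULL JOIN. The helper full_join_rows is hypothetical, written only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lead (id INTEGER, convertedcontactid INTEGER, delete_flag TEXT);
    CREATE TABLE contact (id INTEGER, delete_flag TEXT);
    -- lead 2 is logically deleted but still points at contact 20
    INSERT INTO lead VALUES (1, 10, 'N'), (2, 20, 'Y');
    INSERT INTO contact VALUES (10, 'N'), (20, 'N');
""")

def full_join_rows(on, where="1 = 1"):
    """Emulate LEAD FULL JOIN CONTACT ON <on> WHERE <where>."""
    sql = f"""
        SELECT * FROM (
            -- every lead, with its contact when the ON condition holds
            SELECT l.id AS lead_id, l.delete_flag AS l_flag,
                   c.id AS contact_id, c.delete_flag AS c_flag
            FROM lead l
            LEFT JOIN contact c ON {on}
            UNION ALL
            -- plus every contact that no lead matched under the same condition
            SELECT NULL, NULL, c.id, c.delete_flag
            FROM contact c
            WHERE NOT EXISTS (SELECT 1 FROM lead l WHERE {on})
        ) j
        WHERE {where}
    """
    return conn.execute(sql).fetchall()

key = "l.convertedcontactid = c.id"
flags_in_on = key + " AND l.delete_flag <> 'Y' AND c.delete_flag <> 'Y'"

plain = full_join_rows(key)
in_on = full_join_rows(flags_in_on)
in_where = full_join_rows(key, "l_flag <> 'Y' AND c_flag <> 'Y'")

print(len(plain), len(in_on), len(in_where))  # 2 3 1
```

With the flags in the ON clause, lead 2 and contact 20 no longer match each other, but an outer join never discards a row for failing to match, so each comes back separately padded with NULLs: one row becomes two. Moved to the WHERE clause, the same conditions filter the joined result instead.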
In SQL Server, I know for sure that the following query:
SELECT things.*
FROM things
LEFT OUTER JOIN (
SELECT thingreadings.thingid, reading
FROM thingreadings
INNER JOIN things on thingreadings.thingid = things.id
ORDER BY reading DESC LIMIT 1) AS readings
ON things.id = readings.thingid
WHERE things.id = '1'
would join against thingreadings only once the WHERE id = 1 had restricted the record set down, so it left joins against just one row. However, for performance to be acceptable in Postgres, I have to add the WHERE id = 1 to the INNER JOIN things on thingreadings.thingid = things.id line too.
This isn't ideal; is it possible to force postgres to know that what I am joining against is only one row without explicitly adding the WHERE clauses everywhere?
An example of this problem can be seen here:
I am trying to recreate the following query in a more efficient way:
SELECT things.id, things.name,
(SELECT thingreadings.id FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1),
(SELECT thingreadings.reading FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1)
FROM things
WHERE id IN (1,2)
http://sqlfiddle.com/#!15/a172c/2
Not really sure why you did all that work. Isn't the inner query enough?
SELECT t.*
FROM thingreadings tr
INNER JOIN things t on tr.thingid = t.id AND t.id = '1'
ORDER BY tr.reading DESC
LIMIT 1;
When you want to select the latest value for each thingID, you can do:
SELECT t.*,a.reading
FROM things t
INNER JOIN (
SELECT t1.*
FROM thingreadings t1
LEFT JOIN thingreadings t2
ON (t1.thingid = t2.thingid AND t1.reading < t2.reading)
WHERE t2.thingid IS NULL
) a ON a.thingid = t.id
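The derived table gets you, for each thingid, the record with the highest reading (the one for which no other row of that thingid has a greater reading); then the JOIN gets you the information from the things table for that record. If "latest" should mean the most recently inserted row, compare on thingreadings.id instead of reading.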
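The self-join trick can be tried on a few rows. This is a sketch only: it runs on Python's built-in sqlite3 rather than Postgres, and the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE thingreadings (
        id INTEGER PRIMARY KEY,
        thingid INTEGER,
        reading INTEGER
    );
    INSERT INTO things VALUES (1, 'a'), (2, 'b');
    INSERT INTO thingreadings (thingid, reading) VALUES
        (1, 5), (1, 9), (2, 7), (2, 3);
""")

# t2 looks for a row of the same thing with a greater reading.
# Where none exists (t2.thingid IS NULL), t1 holds the top reading.
rows = conn.execute("""
    SELECT t.id, t.name, a.reading
    FROM things t
    INNER JOIN (
        SELECT t1.thingid, t1.reading
        FROM thingreadings t1
        LEFT JOIN thingreadings t2
            ON t1.thingid = t2.thingid AND t1.reading < t2.reading
        WHERE t2.thingid IS NULL
    ) a ON a.thingid = t.id
    ORDER BY t.id
""").fetchall()

print(rows)  # [(1, 'a', 9), (2, 'b', 7)]
```

If two rows of the same thing tie for the top reading, both are returned, so a tie-breaker (for example the row id) would be needed to guarantee one row per thing.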
The where clause in SQL applies to the result set you're requesting, NOT to the join.
What your code is NOT saying: "do this join only for the ID of 1"...
What your code IS saying: "do this join, then pull records out of it where the ID is 1"...
This is why you need the inner where clause. Incidentally, I also think Filipe is right about the unnecessary code.
I am trying to use a correlated subquery, but I am trying to limit it to the "best" record. When I use SQL very similar to what follows, I get two rows per BigTable.identifier, and I wish to have only one. In the 'UNION' statement, the second half is more desirable than the first half. However, sometimes the first half will be needed. Any ideas? Here's the code:
select
BigTable.identifier,
Correlated.ID,
Correlated.Effective_Date,
Correlated.Period_Number
from
BigTable
inner join
(
select
TOP 2147483647
Table3.identifier,
Table4.Effective_Date,
Table4.Period_Number
from
Table3
inner join Table4 on Table3.matching_key = Table4.matching_key
where
Table4.Period_Number = 0
order by Table4.Effective_Date desc
UNION
select
TOP 2147483647
Table3.identifier,
Table4.Effective_Date,
Table4.Period_Number
from
Table3
inner join Table5 on Table3.matching_key = Table5.matching_key
inner join Table4 on Table5.key1 = Table4.key1 and
Table5.key2 = Table4.key2
where
Table4.period_number = 1
order by Table4.Effective_Date desc
) as Correlated
on BigTable.identifier = Correlated.identifier
If each sub-query in that UNION had some condition which EXCLUDED the row if it was less-preferred, you would never see the less-preferred rows in the UNION.
So, if each were to have a NOT EXISTS (.... a better row in the other side of the union ....), you would eliminate less-preferred rows at the root.
I'm not clear on how you want to use effective date. Assuming you mean that you prefer Period=1 but if the Effective date is less you prefer Period=0, then something like this might work.
select
BigTable.identifier,
Correlated.ID,
Correlated.Effective_Date,
Correlated.Period_Number
from
BigTable
inner join
(
select
TOP 2147483647
Table3.identifier,
Table4.Effective_Date,
Table4.Period_Number
from
Table3
inner join Table4 on Table3.matching_key = Table4.matching_key
where
Table4.Period_Number = 0
AND NOT EXISTS
(select 1
from Table5 T5 inner join Table4 T4
on T5.key1 = T4.key1 and T5.key2 = T4.key2
where Table3.matching_key = T5.matching_key
and (T4.Effective_Date >= Table4.Effective_Date and T4.Period_Number = 1)
)
order by Table4.Effective_Date desc
UNION
select
TOP 2147483647
Table3.identifier,
Table4.Effective_Date,
Table4.Period_Number
from
Table3
inner join Table5 on Table3.matching_key = Table5.matching_key
inner join Table4 on Table5.key1 = Table4.key1 and
Table5.key2 = Table4.key2
where
Table4.period_number = 1
AND NOT EXISTS
(select 1
from Table4 T4
where Table3.matching_key = T4.matching_key
and (T4.Effective_Date > Table4.Effective_Date and T4.Period_Number = 0)
)
order by Table4.Effective_Date desc
) as Correlated
on BigTable.identifier = Correlated.identifier
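The NOT EXISTS idea can be sketched on a much smaller schema. Everything below is an assumption made for illustration: a single hypothetical table named periods stands in for the Table3/Table4/Table5 joins, the rows are invented, and Python's built-in sqlite3 is used instead of SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE periods (
        identifier TEXT,
        period_number INTEGER,
        effective_date TEXT  -- ISO dates, so string comparison orders them
    );
    INSERT INTO periods VALUES
        ('A', 0, '2020-01-01'), ('A', 1, '2020-06-01'),  -- period 1 is newer
        ('B', 0, '2020-03-01'),                          -- only period 0 exists
        ('C', 0, '2021-01-01'), ('C', 1, '2020-01-01');  -- period 0 is newer
""")

# Each half of the UNION excludes its own row when the other half
# holds a better row for the same identifier.
rows = conn.execute("""
    SELECT p.identifier, p.period_number, p.effective_date
    FROM periods p
    WHERE p.period_number = 0
      AND NOT EXISTS (
          SELECT 1 FROM periods q
          WHERE q.identifier = p.identifier
            AND q.period_number = 1
            AND q.effective_date >= p.effective_date
      )
    UNION
    SELECT p.identifier, p.period_number, p.effective_date
    FROM periods p
    WHERE p.period_number = 1
      AND NOT EXISTS (
          SELECT 1 FROM periods q
          WHERE q.identifier = p.identifier
            AND q.period_number = 0
            AND q.effective_date > p.effective_date
      )
    ORDER BY identifier
""").fetchall()

print(rows)
# [('A', 1, '2020-06-01'), ('B', 0, '2020-03-01'), ('C', 0, '2021-01-01')]
```

Each identifier comes back once here because the toy data has at most one row per period. With several rows per period, an extra rule (such as keeping only the latest effective date within each half) would still be needed.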