PostgreSQL Join with special condition - postgresql

Let's assume we have the following table1:
1 2 3
a x m
a y m
b z m
I want to do an inner join on the table:
INNER JOIN table2 ON table1."2" = table2."2"
Something like this, but with the additional condition that the value of table1."1" is not unique. In this example, no join should occur for the row where table1."1" = 'b'.
What is the best way to achieve this?

Using an aggregate in a subquery is how I would do it:
SELECT *
FROM table1
JOIN table2
ON table1."2" = table2."2"
JOIN (
SELECT "1"
FROM table1
GROUP BY "1"
HAVING COUNT(*) > 1
) AS sub_q
ON sub_q."1" = table1."1";
Another option might be a CTE or a temporary table to hold the rows you're joining on:
WITH _cte AS
(
SELECT "1"
FROM table1
GROUP BY "1"
HAVING COUNT(*) > 1
)
SELECT *
FROM table1
JOIN table2
ON table1."2" = table2."2"
JOIN _cte AS cte
ON cte."1" = table1."1";
With a temporary table:
CREATE TEMPORARY TABLE _tab
(
"1" varchar
);
INSERT INTO _tab
SELECT "1"
FROM table1
GROUP BY "1"
HAVING COUNT(*) > 1;
SELECT *
FROM table1
JOIN table2
ON table1."2" = table2."2"
JOIN _tab AS tab
ON tab."1" = table1."1";
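To see the subquery variant run against the sample rows from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory SQLite database standing in for PostgreSQL, and the contents of table2 (including its note column) are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 ("1" TEXT, "2" TEXT, "3" TEXT);
    CREATE TABLE table2 ("2" TEXT, note TEXT);  -- hypothetical second table
    INSERT INTO table1 VALUES ('a', 'x', 'm'), ('a', 'y', 'm'), ('b', 'z', 'm');
    INSERT INTO table2 VALUES ('x', 'first'), ('y', 'second'), ('z', 'third');
""")

rows = conn.execute("""
    SELECT table1."1", table1."2", table2.note
    FROM table1
    JOIN table2
      ON table1."2" = table2."2"
    JOIN (
        SELECT "1"
        FROM table1
        GROUP BY "1"
        HAVING COUNT(*) > 1      -- keep only values that occur more than once
    ) AS sub_q
      ON sub_q."1" = table1."1"
    ORDER BY table1."2"
""").fetchall()

print(rows)
```

The row with "1" = 'b' has a match in table2 but is dropped, because 'b' occurs only once in table1.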

Related

postgresql: how to get three rows with three different conditions at once from same table

I have a table, table1, with columns sn, rt and type.
I want to get one row for each of several rt values (i.e. rt = 1, 2, 3):
(SELECT *
FROM table1
WHERE sn = 'testing' AND rt = 1 AND type = 'pump'
ORDER BY id DESC
LIMIT 1)
UNION
(SELECT *
FROM table1
WHERE sn = 'testing' AND rt = 2 AND type = 'pump'
ORDER BY id DESC
LIMIT 1)
UNION
(SELECT *
FROM table1
WHERE sn = 'testing' AND rt = 3 AND type = 'pump'
ORDER BY id DESC
LIMIT 1)
Currently I am trying the above.
What is the most effective way to get these rows?
Use ROW_NUMBER() window function:
SELECT t.*
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY rt ORDER BY id DESC) rn
FROM table1
WHERE sn = 'testing' AND type = 'pump' AND rt IN (1, 2, 3)
) t
WHERE t.rn = 1
You can omit AND rt IN (1, 2, 3) if 1, 2 and 3 are the only possible values for rt.
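A quick way to check the ROW_NUMBER() approach is to run it on a few made-up rows. The sketch below uses Python's sqlite3 module (SQLite 3.25 or newer is assumed, since that is when window functions were added) in place of PostgreSQL, and the sample data is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, sn TEXT, rt INTEGER, type TEXT);
    -- Hypothetical sample rows; only sn = 'testing' and type = 'pump' should count.
    INSERT INTO table1 (id, sn, rt, type) VALUES
        (1, 'testing', 1, 'pump'),
        (2, 'testing', 1, 'pump'),
        (3, 'testing', 2, 'pump'),
        (4, 'testing', 3, 'pump'),
        (5, 'testing', 3, 'pump'),
        (6, 'other',   1, 'pump'),
        (7, 'testing', 2, 'valve');
""")

latest = conn.execute("""
    SELECT t.id, t.rt
    FROM (
        -- Number the rows within each rt, newest id first.
        SELECT *, ROW_NUMBER() OVER (PARTITION BY rt ORDER BY id DESC) AS rn
        FROM table1
        WHERE sn = 'testing' AND type = 'pump' AND rt IN (1, 2, 3)
    ) AS t
    WHERE t.rn = 1
    ORDER BY t.rt
""").fetchall()

print(latest)
```

Each rt value contributes exactly one row, the one with the highest id, which is what the three UNION branches with ORDER BY id DESC LIMIT 1 were producing.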
You'll want to join the table to itself, then specify the conditions in a single WHERE clause. I've done a rough sketch of it below:
SELECT t1.*, t2.*, t3.*
FROM table1 AS t1
JOIN table1 AS t2
ON t1.sn = t2.sn AND t1.type = t2.type -- If you've got more reasonable connections between your datapoints, use them here
JOIN table1 AS t3
ON t1.sn = t3.sn AND t1.type = t3.type
WHERE t1.rt = 1 AND t2.rt = 2 AND t3.rt = 3
ORDER BY t1.id DESC
LIMIT 1
Depending on what other requirements you have, you may have to tweak some parts of that. If you want results when t1 has a value but t2 or t3 doesn't you can use a LEFT JOIN instead.

How to use aggregate functions when using recursive query in postgresql

After multiple iterations of a recursive query in PostgreSQL, I get the following result when I run the query below:
WITH recursive report AS (
select a.name, a.id, a.parentid, sum(b.id)
from table1 a
INNER JOIN table2 b on a.id=b.table1id
GROUP by a.name, a.id, a.parentid
), report2 AS (
SELECT *, 0 as lvl
FROM report
WHERE parentid IS NULL
UNION ALL
SELECT child.*, parent.lvl + 1
FROM report child
JOIN report2 parent
ON parent.id = child.parentid
)
select * from report2
I want to sum the count column up to the topmost level, so my output should be like below.
What is the best possible way to get it?
If you calculate a path during recursion, like so:
WITH recursive report AS (
select a.name, a.id, a.parentid, sum(b.id) as count -- Is summing b.id the right thing here?
from table1 a
INNER JOIN table2 b on a.id=b.table1id
GROUP by a.name, a.id, a.parentid
), report2 AS (
SELECT report.*, 0 as lvl, array[report.id] as path_array
FROM report
WHERE parentid IS NULL
UNION ALL
SELECT child.*, parent.lvl + 1, parent.path_array || child.id
FROM report child
JOIN report2 parent
ON parent.id = child.parentid
)
select * from report2;
Do you really mean sum(b.id) and not count(*) in the report CTE?
You can get the sum of count for your top levels using this query as the main query from your recursion:
select t.name, sum(r.count) as total_count
from report2 r
join table1 t
on t.id = r.path_array[1]
group by t.name;
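The same idea can be tried on a small tree. The sketch below is not the PostgreSQL query above: it runs on SQLite through Python's sqlite3 module, and because SQLite has no array type it carries a single root_id through the recursion instead of a path array. The report table, its cnt column and the sample rows are assumptions standing in for the report CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Plain table standing in for the aggregated report CTE (hypothetical data).
    CREATE TABLE report (id INTEGER PRIMARY KEY, name TEXT, parentid INTEGER, cnt INTEGER);
    INSERT INTO report VALUES
        (1, 'root_a',  NULL, 2),
        (2, 'child_a', 1,    3),
        (3, 'grand_a', 2,    5),
        (4, 'root_b',  NULL, 7),
        (5, 'child_b', 4,    1);
""")

totals = conn.execute("""
    WITH RECURSIVE report2 AS (
        -- Anchor: every top-level row is its own root.
        SELECT id, name, cnt, 0 AS lvl, id AS root_id
        FROM report
        WHERE parentid IS NULL
        UNION ALL
        -- Recursive step: children inherit the root of their parent.
        SELECT child.id, child.name, child.cnt, parent.lvl + 1, parent.root_id
        FROM report AS child
        JOIN report2 AS parent
          ON parent.id = child.parentid
    )
    SELECT r.name, SUM(r2.cnt) AS total_count
    FROM report2 AS r2
    JOIN report AS r
      ON r.id = r2.root_id
    GROUP BY r.name
    ORDER BY r.name
""").fetchall()

print(totals)
```

Carrying only the root id is enough when the total is wanted per top-level row; the path array is the more general choice if totals per intermediate level are also needed.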

Avoiding Order By in T-SQL

The sample query below is part of my main query. I found that the SORT operator in it accounts for 30% of the cost.
Avoiding the SORT would require creating indexes. Is there any other way to optimize this code?
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA
WHERE ID = r.ID
AND Status = 3
AND TableA_ID >ISNULL((
SELECT TOP 1 TableA_ID
FROM TableA
WHERE ID = r.ID
AND Status <> 3
ORDER BY T_Date DESC
), 0)
ORDER BY T_Date ASC
It looks like you can use NOT EXISTS rather than the sorts. I think you'll probably get a better performance boost by using a CTE or derived table instead of a scalar subquery.
select *
from r ... left outer join
(
select ID, min(t_date) as min_date from TableA t1
where status = 3 and not exists (
select 1 from TableA t2
where t2.ID = t1.ID
and t2.status <> 3 and t2.t_date > t1.t_date
)
group by ID
) as md on md.ID = r.ID ...
or
select *
from r ... left outer join
(
select t1.ID, min(t1.t_date) as min_date
from TableA t1 left outer join TableA t2
on t2.ID = t1.ID and t2.status <> 3 and t2.t_date > t1.t_date
where t1.status = 3 and t2.ID is null -- anti-join: keep rows with no later non-3 row
group by t1.ID
) as md on md.ID = r.ID ...
It also appears that you're relying on an identity column but it's not clear what those values mean. I'm basically ignoring it and using the date column instead.
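As a sanity check of the NOT EXISTS form, here is a minimal sketch on invented data. It runs on SQLite through Python's sqlite3 module rather than SQL Server, dates are stored as ISO strings so they compare correctly as text, and it only demonstrates the logic, not the query plan or cost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (TableA_ID INTEGER PRIMARY KEY, ID INTEGER, Status INTEGER, T_Date TEXT);
    -- Hypothetical rows: ID 10 has a non-3 row in the middle, ID 20 has none,
    -- and ID 30 ends on a non-3 row.
    INSERT INTO TableA (ID, Status, T_Date) VALUES
        (10, 3, '2020-01-01'),
        (10, 1, '2020-01-05'),
        (10, 3, '2020-01-10'),
        (10, 3, '2020-01-20'),
        (20, 3, '2020-02-01'),
        (20, 3, '2020-02-02'),
        (30, 3, '2020-03-01'),
        (30, 2, '2020-03-05');
""")

min_dates = conn.execute("""
    SELECT t1.ID, MIN(t1.T_Date) AS min_date
    FROM TableA AS t1
    WHERE t1.Status = 3
      AND NOT EXISTS (
          -- Reject status-3 rows that are followed by a later non-3 row.
          SELECT 1
          FROM TableA AS t2
          WHERE t2.ID = t1.ID
            AND t2.Status <> 3
            AND t2.T_Date > t1.T_Date
      )
    GROUP BY t1.ID
    ORDER BY t1.ID
""").fetchall()

print(min_dates)
```

ID 30 produces no row because its only status-3 date is followed by a non-3 row, which is why the derived table is attached to r with a left outer join.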
Try this:
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA a1
LEFT JOIN (
SELECT ID, MAX(TableA_ID) AS MaxAID
FROM TableA
WHERE Status <> 3
GROUP BY ID
) a2 ON a2.ID = a1.ID
WHERE a1.ID = r.ID AND a1.Status = 3
AND a1.TableA_ID > COALESCE(a2.MaxAID, 0) -- filter in WHERE; in the ON clause a LEFT JOIN would not remove rows
ORDER BY T_Date ASC
The use of TOP 1 in combination with the unexplained r alias concerns me. There's almost certainly a MUCH better way to get this data into your results that doesn't involve doing this in a subquery (unless this is for an APPLY operation).

limiting a correlated subquery to just one record

I am trying to use a correlated subquery, but I want to limit it to the "best" record. When I use SQL very similar to what follows, I get two rows per BigTable.identifier, and I wish to have only one. In the UNION statement, the second half is more desirable than the first half. However, sometimes the first half will be needed. Any ideas? Here's the code:
select
BigTable.identifier,
Correlated.ID,
Correlated.Effective_Date,
Correlated.Period_Number
from
BigTable
inner join
(
select
TOP 2147483647
Table3.identifier,
Table4.Effective_Date,
Table4.Period_Number
from
Table3
inner join Table4 on Table3.matching_key = Table4.matching_key
where
Table4.Period_Number = 0
order by Table4.Effective_Date desc
UNION
select
TOP 2147483647
Table3.identifier,
Table4.Effective_Date,
Table4.Period_Number
from
Table3
inner join Table5 on Table3.matching_key = Table5.matching_key
inner join Table4 on Table5.key1 = Table4.key1 and
Table5.key2 = Table4.key2
where
Table4.period_number = 1
order by Table4.Effective_Date desc
) as Correlated
on BigTable.identifier = Correlated.identifier
If each sub-query in that UNION had some condition which EXCLUDED the row if it was less-preferred, you would never see the less-preferred rows in the UNION.
So, if each were to have a NOT EXISTS (.... a better row in the other side of the union ....), you would eliminate less-preferred rows at the root.
I'm not clear on how you want to use effective date. Assuming you mean that you prefer Period=1 but if the Effective date is less you prefer Period=0, then something like this might work.
select
BigTable.identifier,
Correlated.ID,
Correlated.Effective_Date,
Correlated.Period_Number
from
BigTable
inner join
(
select
TOP 2147483647
Table3.identifier,
Table4.Effective_Date,
Table4.Period_Number
from
Table3
inner join Table4 on Table3.matching_key = Table4.matching_key
where
Table4.Period_Number = 0
AND NOT EXISTS
(select 1
from Table5 T5 inner join Table4 T4
on T5.key1 = T4.key1 and T5.key2 = T4.key2
where Table3.matching_key = T5.matching_key
and (T4.Effective_Date >= Table4.Effective_Date and T4.Period_Number = 1)
)
order by Table4.Effective_Date desc
UNION
select
TOP 2147483647
Table3.identifier,
Table4.Effective_Date,
Table4.Period_Number
from
Table3
inner join Table5 on Table3.matching_key = Table5.matching_key
inner join Table4 on Table5.key1 = Table4.key1 and
Table5.key2 = Table4.key2
where
Table4.period_number = 1
AND NOT EXISTS
(select 1
from Table4 T4
where Table3.matching_key = T4.matching_key
and (T4.Effective_Date > Table4.Effective_Date and T4.Period_Number = 0)
)
order by Table4.Effective_Date desc
) as Correlated
on BigTable.identifier = Correlated.identifier

Get current record for a subquery in T-SQL

I'm trying to select all records from a table "Table1" but I want a new column called "HasException" that contains a "0" or a "1". "HasException" must be "0" if the count of rows matching the current Id from "Table2" is equal to 0, else it returns 1.
Here's what I've done so far, but it doesn't work:
SELECT *,
CONVERT(bit, (CASE WHEN (SELECT count(Id) FROM Table2 WHERE Table1.Id=Table2.Id) = 0 THEN 0 ELSE 1 END)) AS HasException
FROM Table1
You need to join the tables (and group on ID) before you can compare the values, like this:
SELECT dbo.Table_1.ID, -- any other selected column must also be listed in GROUP BY
CASE WHEN COUNT(dbo.Table_2.ID) = 0 THEN
0
ELSE
1
END
AS HasException
FROM dbo.Table_1 LEFT OUTER JOIN
dbo.Table_2 ON dbo.Table_1.ID = dbo.Table_2.ID
GROUP BY dbo.Table_1.ID
Perhaps something like this, assuming you meant Table2?
SELECT Table1.Id, -- any other selected column must also be listed in GROUP BY
CAST(CASE WHEN COUNT(table2.id) = 0 THEN 0 ELSE 1 END AS bit) AS HasException
FROM
Table1
LEFT JOIN
Table2 ON Table1.Id=Table2.Id
GROUP BY
Table1.id
select
T1.*,
case when T2.Id is null then 0 else 1 end as HasException
from Table1 as T1
left outer join
(
select distinct Id
from Table2
) as T2
on T1.Id = T2.Id
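To check that the DISTINCT derived-table join returns one row per Table1 record with the right flag, here is a small sketch on invented data. It runs on SQLite via Python's sqlite3 module instead of SQL Server, so the bit conversion is left out; the Name and Detail columns are hypothetical. The second query, using EXISTS, is an alternative not taken from the answers above and is included only as a cross-check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER, Detail TEXT);
    INSERT INTO Table1 VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    -- Id 1 has two matching rows, Id 2 has none, Id 3 has one.
    INSERT INTO Table2 VALUES (1, 'e1'), (1, 'e2'), (3, 'e3');
""")

# Left join against the distinct Ids, so duplicates in Table2 cannot multiply rows.
flags = conn.execute("""
    SELECT T1.Id, T1.Name,
           CASE WHEN T2.Id IS NULL THEN 0 ELSE 1 END AS HasException
    FROM Table1 AS T1
    LEFT OUTER JOIN (
        SELECT DISTINCT Id
        FROM Table2
    ) AS T2
      ON T1.Id = T2.Id
    ORDER BY T1.Id
""").fetchall()

# Cross-check with a correlated EXISTS, which needs no join or grouping.
flags_exists = conn.execute("""
    SELECT Table1.Id, Table1.Name,
           CASE WHEN EXISTS (SELECT 1 FROM Table2 WHERE Table2.Id = Table1.Id)
                THEN 1 ELSE 0 END AS HasException
    FROM Table1
    ORDER BY Table1.Id
""").fetchall()

print(flags)
```

Both forms keep every Table1 column available without a GROUP BY, which sidesteps the problem of having to list each selected column in the grouping.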