Postgresql query where statement positions - postgresql

I have two tables that I want to join. The join works without where conditions. After adding where conditions, I get a syntax error near a (where I give table1 an alias). As far as I can tell, the syntax looks correct?
My query
select * from table1 where date >= '2020-10-01' and date <= '2020-10-31' a
left join table2 b where registered >= '2020-10-01' and registered <= '2020-10-31' b
on a.id = cast(b.id as varchar)

Some issues:
where goes after all tables (and their join conditions)
aliases go immediately after the table name
Applying these two corrections and some formatting:
select *
from table1 a
left join table2 b on a.id = cast(b.id as varchar)
and registered >= '2020-10-01' and registered <= '2020-10-31'
where date >= '2020-10-01' and date <= '2020-10-31'
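Conventionally, join conditions that describe access to joined rows (typically the keys) are coded first, then filtering conditions (ones involving only columns in the joined table) are coded last. Because this is a left join, the filter on registered stays in the on clause; in the where clause it would discard table1 rows that have no matching table2 row.
This can be slightly simplified using between: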
select *
from table1 a
left join table2 b on a.id = cast(b.id as varchar)
and registered between '2020-10-01' and '2020-10-31'
where date between '2020-10-01' and '2020-10-31'
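To see why the position of the filter matters in a left join, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The sample rows are made up for illustration, and cast(... as text) replaces cast(... as varchar) only to suit SQLite.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id TEXT, date TEXT);
    CREATE TABLE table2 (id INTEGER, registered TEXT);
    INSERT INTO table1 VALUES ('1', '2020-10-05'), ('2', '2020-10-20');
    INSERT INTO table2 VALUES (1, '2020-10-10'), (2, '2020-09-01');
""")

# Filter on table2 inside ON: table1 rows without a qualifying match
# are kept, with NULL in the table2 columns.
filter_in_on = conn.execute("""
    SELECT a.id, b.registered
    FROM table1 a
    LEFT JOIN table2 b ON a.id = CAST(b.id AS TEXT)
        AND b.registered BETWEEN '2020-10-01' AND '2020-10-31'
    WHERE a.date BETWEEN '2020-10-01' AND '2020-10-31'
    ORDER BY a.id
""").fetchall()

# Same filter in WHERE: the NULL-extended rows fail the test and are
# dropped, so the left join behaves like an inner join.
filter_in_where = conn.execute("""
    SELECT a.id, b.registered
    FROM table1 a
    LEFT JOIN table2 b ON a.id = CAST(b.id AS TEXT)
    WHERE a.date BETWEEN '2020-10-01' AND '2020-10-31'
        AND b.registered BETWEEN '2020-10-01' AND '2020-10-31'
    ORDER BY a.id
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

With this data the first query keeps both table1 rows, while the second keeps only the row whose table2 match falls in October.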

Related

SQL Group By that works in SQLite does not work in Postgres

This statement works in SQLite, but not in Postgres:
SELECT A.*, B.*
FROM Readings A
LEFT JOIN Offsets B ON A.MeterNum = B.MeterNo AND A.DateTime > B.TimeDate
WHERE A.MeterNum = 1
GROUP BY A.DateTime
ORDER BY A.DateTime DESC
The Readings table contains electric submeter readings each with a date stamp. The Offsets table holds an adjustment that the user enters after a failed meter is replaced with a new one that starts again at zero. Without the Group By statement the query returns a line for each meter reading with each prior adjustment made before the reading date while I only want the last adjustment.
All the docs I've seen on Group By for Postgres indicate I should be including an aggregate function which I don't need and can't use (The Reading column contains the Modbus string returned from the meter).
Just pick the latest offset in a derived table. In Postgres this can be done quite efficiently using distinct on ():
SELECT A.*, B.*
FROM readings A
left join (
select distinct on (meterno) o.*
from offsets o
order by o.meterno, o.timedate desc
) B ON A.MeterNum = B.MeterNo AND A.DateTime > B.TimeDate
WHERE A.meternum = 1
ORDER BY A.DateTime DESC
distinct on () will only return one row per meterno and this is the "latest" row due to the order by ... , timedate desc
The query might even be faster by pushing the condition on datetime > timedate into the derived table using a lateral join:
SELECT A.*, B.*
FROM readings A
left join lateral (
select distinct on (meterno) o.*
from offsets o
where a.datetime > o.timedate
order by o.meterno, o.timedate desc
) B ON A.MeterNum = B.MeterNo
WHERE A.meternum = 1
ORDER BY A.DateTime DESC
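For comparison, the same "latest adjustment before each reading" idea can be sketched without distinct on or lateral, using a correlated subquery. This is a swapped-in, more portable technique, not the query from the answer. It runs here on SQLite through Python's sqlite3 module, and the adjustment column and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (meternum INTEGER, datetime TEXT, reading TEXT);
    CREATE TABLE offsets  (meterno INTEGER, timedate TEXT, adjustment INTEGER);
    INSERT INTO readings VALUES
        (1, '2019-12-01', 'r1'), (1, '2020-03-01', 'r2'), (1, '2020-07-01', 'r3');
    INSERT INTO offsets VALUES (1, '2020-01-01', 100), (1, '2020-06-01', 250);
""")

# For each reading, join only the offset row whose timedate is the
# greatest one that is still earlier than the reading itself.
latest_offsets = conn.execute("""
    SELECT a.datetime, b.adjustment
    FROM readings a
    LEFT JOIN offsets b ON b.meterno = a.meternum
        AND b.timedate = (
            SELECT MAX(o.timedate)
            FROM offsets o
            WHERE o.meterno = a.meternum
              AND o.timedate < a.datetime
        )
    WHERE a.meternum = 1
    ORDER BY a.datetime DESC
""").fetchall()

print(latest_offsets)
```

The reading taken between the two adjustments gets the earlier adjustment, and the reading taken before any adjustment gets NULL. That only works because the date comparison is evaluated per reading, which is also what the lateral form does.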

Selecting records in one table by comparing fields with a single other record

I have two SQLite tables, t1 and t2 with identical fields: name, value1, value2, value3.
Critically, (a) table t1 contains only a single record John|20|19|4, and (b) that record might change.
I would like to select from t2 all those records where t2.value1 <= t1.value1 (i.e., the value in the only t1 record) and t2.value2 <= t1.value2 and t2.value3 <= t1.value3. Is this possible?
This should do it:
select *
from T2
where exists
( select *
from T1
where T2.Value1 <= T1.Value1 and
T2.Value2 <= T1.Value2 and
T2.Value3 <= T1.Value3
)
Yes, it is possible. You can try the query below:
WITH record AS (SELECT * FROM t1 LIMIT 1)
SELECT t2.* FROM t2, record WHERE t2.value1 <= record.value1 AND t2.value2 <=
record.value2 AND t2.value3 <= record.value3;
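A small runnable sketch of the exists approach, using Python's sqlite3 module. The t2 rows are invented for the example; only John|20|19|4 comes from the question. The update step shows that the result follows the single t1 record when it changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (name TEXT, value1 INTEGER, value2 INTEGER, value3 INTEGER);
    CREATE TABLE t2 (name TEXT, value1 INTEGER, value2 INTEGER, value3 INTEGER);
    INSERT INTO t1 VALUES ('John', 20, 19, 4);
    INSERT INTO t2 VALUES
        ('Ann', 10, 10, 3), ('Bob', 25, 5, 1), ('Cid', 20, 19, 4), ('Dee', 5, 30, 2);
""")

QUERY = """
    SELECT t2.name
    FROM t2
    WHERE EXISTS (
        SELECT 1
        FROM t1
        WHERE t2.value1 <= t1.value1
          AND t2.value2 <= t1.value2
          AND t2.value3 <= t1.value3
    )
    ORDER BY t2.name
"""

def matching_names(connection):
    """Return the names of t2 rows that fit under the single t1 record."""
    return [row[0] for row in connection.execute(QUERY)]

before = matching_names(conn)

# The t1 record changes; the same query picks up the new limits.
conn.execute("UPDATE t1 SET value1 = 30")
after = matching_names(conn)

print(before, after)
```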

Avoiding Order By in T-SQL

The sample query below is part of my main query. I found that the SORT operator in it accounts for 30% of the cost.
Avoiding the SORT seems to require creating indexes. Is there any other way to optimize this code?
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA
WHERE ID = r.ID
AND Status = 3
AND TableA_ID >ISNULL((
SELECT TOP 1 TableA_ID
FROM TableA
WHERE ID = r.ID
AND Status <> 3
ORDER BY T_Date DESC
), 0)
ORDER BY T_Date ASC
Looks like you can use not exists rather than the sorts. I think you'll probably get a better performance boost by using a CTE or derived table instead of a scalar subquery.
select *
from r ... left outer join
(
select ID, min(t_date) as min_date from TableA t1
where status = 3 and not exists (
select 1 from TableA t2
where t2.ID = t1.ID
and t2.status <> 3 and t2.t_date > t1.t_date
)
group by ID
) as md on md.ID = r.ID ...
or
select *
from r ... left outer join
(
select t1.ID, min(t1.t_date) as min_date
from TableA t1 left outer join TableA t2
on t2.ID = t1.ID and t2.status <> 3 and t2.t_date > t1.t_date
where t1.status = 3 and t2.ID is null
group by t1.ID
) as md on md.ID = r.ID ...
It also appears that you're relying on an identity column but it's not clear what those values mean. I'm basically ignoring it and using the date column instead.
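As a rough check of the two derived-table shapes (not exists, and a left join filtered on a null match), the sketch below runs both on SQLite through Python's sqlite3 module. The rows are hypothetical, and this only compares results, not T-SQL execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (
        TableA_ID INTEGER PRIMARY KEY, ID INTEGER, Status INTEGER, T_Date TEXT
    );
    INSERT INTO TableA (ID, Status, T_Date) VALUES
        (1, 3, '2021-01-01'), (1, 1, '2021-01-02'),
        (1, 3, '2021-01-03'), (1, 3, '2021-01-04'),
        (2, 3, '2021-01-01'), (2, 3, '2021-01-02'),
        (3, 3, '2021-01-01'), (3, 2, '2021-01-02');
""")

# Earliest status-3 date per ID that has no later non-3 row.
with_not_exists = conn.execute("""
    SELECT t1.ID, MIN(t1.T_Date) AS min_date
    FROM TableA t1
    WHERE t1.Status = 3
      AND NOT EXISTS (
          SELECT 1 FROM TableA t2
          WHERE t2.ID = t1.ID AND t2.Status <> 3 AND t2.T_Date > t1.T_Date
      )
    GROUP BY t1.ID
    ORDER BY t1.ID
""").fetchall()

# Same result as an anti-join: keep t1 rows for which the join found nothing.
with_anti_join = conn.execute("""
    SELECT t1.ID, MIN(t1.T_Date) AS min_date
    FROM TableA t1
    LEFT JOIN TableA t2
        ON t2.ID = t1.ID AND t2.Status <> 3 AND t2.T_Date > t1.T_Date
    WHERE t1.Status = 3 AND t2.TableA_ID IS NULL
    GROUP BY t1.ID
    ORDER BY t1.ID
""").fetchall()

print(with_not_exists)
print(with_anti_join)
```

ID 3 is absent from both results because its only status-3 row is followed by a later row with a different status.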
Try this:
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA a1
LEFT JOIN (
SELECT ID, MAX(TableA_ID) AS MaxAID
FROM TableA
WHERE Status <> 3
GROUP BY ID
) a2 ON a2.ID = a1.ID
WHERE a1.ID = r.ID AND a1.Status = 3 AND a1.TableA_ID > coalesce(a2.MaxAID,0)
ORDER BY T_Date ASC
The use of TOP 1 in combination with the unexplained r alias concerns me. There's almost certainly a MUCH better way to get this data into your results that doesn't involve doing this in a subquery (unless this is for an APPLY operation).

Postgresql count by past weeks

select id, wk0_count
from teams
left join
(select team_id, count(team_id) as wk0_count
from (
select created_at, team_id, trunc(EXTRACT(EPOCH FROM age(CURRENT_TIMESTAMP,created_at)) / 604800) as wk_offset
from loan_files
where loan_type <> 2
order by created_at DESC) as t1
where wk_offset = 0
group by team_id) as t_wk0
on teams.id = t_wk0.team_id
I've created the query above that shows me how many loans each team did in a given week. Week 0 is the past seven days.
Ideally I want a table that shows how many loans each team did in the last 8 weeks, grouped by week. The output would look like:
Any ideas on the best way to do this?
select
t.id,
count(week = 0 or null) as wk0,
count(week = 1 or null) as wk1,
count(week = 2 or null) as wk2,
count(week = 3 or null) as wk3
from
teams t
left join
loan_files lf on lf.team_id = t.id and loan_type <> 2
cross join lateral
(select (current_date - created_at::date) / 7 as week) w
group by 1
In 9.4+ versions use the aggregate filter syntax:
count(*) filter (where week = 0) as wk0,
lateral is available from 9.3. In earlier versions, move the week expression into each count condition.
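The conditional-count pivot can be tried out with Python's sqlite3 module. This is only a sketch under assumptions: SQLite has no lateral, so the week number is computed in a derived table instead; julianday() replaces the PostgreSQL date subtraction; a fixed "today" parameter replaces current_date so the output is reproducible; and the sample loans are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (id INTEGER PRIMARY KEY);
    CREATE TABLE loan_files (team_id INTEGER, loan_type INTEGER, created_at TEXT);
    INSERT INTO teams VALUES (1), (2), (3);
    INSERT INTO loan_files VALUES
        (1, 1, '2024-03-14'), (1, 1, '2024-03-10'), (1, 1, '2024-03-05'),
        (2, 1, '2024-03-01'), (2, 2, '2024-03-13');
""")

# "x = n OR NULL" is true for matches and NULL otherwise, and COUNT
# skips NULLs, so each column counts the loans of one week only.
weekly_counts = conn.execute("""
    SELECT t.id,
           COUNT(lf.week = 0 OR NULL) AS wk0,
           COUNT(lf.week = 1 OR NULL) AS wk1,
           COUNT(lf.week = 2 OR NULL) AS wk2
    FROM teams t
    LEFT JOIN (
        SELECT team_id,
               CAST((julianday(:today) - julianday(created_at)) / 7 AS INTEGER) AS week
        FROM loan_files
        WHERE loan_type <> 2
    ) lf ON lf.team_id = t.id
    GROUP BY t.id
    ORDER BY t.id
""", {"today": "2024-03-15"}).fetchall()

print(weekly_counts)
```

Team 3 has no loans and still appears with zero counts, because the left join keeps it and every count condition evaluates to NULL for it. Extending the pivot to eight weeks means adding wk3 through wk7 columns in the same pattern.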
How about the following query?
SELECT team_id AS id, count(team_id) AS wk0_count
FROM teams LEFT JOIN loan_files ON teams.id = team_id
WHERE loan_type <> 2
AND trunc(EXTRACT(epoch FROM age(CURRENT_TIMESTAMP, created_at)) / 604800) = 0
GROUP BY team_id
Notable changes are:
ORDER BY clause in subquery was pointless;
created_at in innermost subquery was never used;
wk_offset test is moved into the WHERE clause instead of being done in two distinct steps;
outermost subquery was not needed.

postgresql complex query joining same table

I would like to get those customers from a table 'transactions' which haven't created any transactions in the last 6 Months.
Table:
'transactions'
id, email, state, paid_at
To visualise:
|------------------------ time period with all transactions --------------------|
|-- period before month (transactions > 0) --|--- curr month (transactions = 0) ---|
I guess this is doable with a join showing only those that didn't have any transactions on the right side.
Example:
Month = November
The conditions for the left side should be:
COUNT(l.id) > 0
l.paid_at < '2013-05-01 00:00:00'
Conditions for the right side:
COUNT(r.id) = 0
r.paid_at BETWEEN '2013-05-01 00:00:00' AND '2013-11-30 23:59:59'
Is join the right approach?
Answer
SELECT c.email
FROM transactions c
WHERE c.email NOT IN (
SELECT DISTINCT email
FROM transactions
WHERE paid_at >= '2013-05-01 00:00:00'
AND paid_at <= '2013-11-30 23:59:59'
)
AND c.email IN (
SELECT DISTINCT email
FROM transactions
WHERE paid_at <= '2013-05-01 00:00:00'
)
AND c.paid_at <= '2013-11-30 23:59:59'
There are a couple of ways you could do this. Use a subquery to get distinct customer ids for transactions in the last 6 months, and then select customers where their id isn't in the subquery.
select c.id, c.name
from customer c
where c.id not in (select distinct customer_id from transaction where dt between <start> and <end>);
Or, use a left join from customer to transaction, and filter the results to have transaction id null. A left join includes all rows from the left-hand table, even when there are no matching rows in the right-hand table. Explanation of left joins here: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
select c.id, c.name
from customer c
left join transaction t on c.id = t.customer_id
and t.dt between <start> and <end>
where t.id is null;
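The left join approach is often faster, though that depends on the planner and the available indexes. Unlike not in, it is also unaffected by NULL values of customer_id in transaction (a single NULL in the subquery makes not in return no rows).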
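Applied to the single transactions table from the question, the left join idea becomes a self-join. Below is a minimal sketch on SQLite via Python's sqlite3 module, with invented e-mail addresses and dates. The not exists query is included only to cross-check the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY, email TEXT, state TEXT, paid_at TEXT
    );
    INSERT INTO transactions (email, state, paid_at) VALUES
        ('a@x.test', 'paid', '2013-01-10 09:00:00'),
        ('a@x.test', 'paid', '2013-03-02 12:30:00'),
        ('b@x.test', 'paid', '2013-02-01 08:00:00'),
        ('b@x.test', 'paid', '2013-08-15 17:45:00'),
        ('c@x.test', 'paid', '2013-09-01 10:00:00');
""")

# l: transactions before the window; r: transactions inside the window.
# Keeping only rows where r found nothing leaves the lapsed customers.
lapsed_join = conn.execute("""
    SELECT DISTINCT l.email
    FROM transactions l
    LEFT JOIN transactions r
        ON r.email = l.email
        AND r.paid_at >= '2013-05-01 00:00:00'
        AND r.paid_at <= '2013-11-30 23:59:59'
    WHERE l.paid_at < '2013-05-01 00:00:00'
      AND r.id IS NULL
    ORDER BY l.email
""").fetchall()

# Cross-check with NOT EXISTS, which expresses the same anti-join.
lapsed_not_exists = conn.execute("""
    SELECT DISTINCT l.email
    FROM transactions l
    WHERE l.paid_at < '2013-05-01 00:00:00'
      AND NOT EXISTS (
          SELECT 1 FROM transactions r
          WHERE r.email = l.email
            AND r.paid_at >= '2013-05-01 00:00:00'
            AND r.paid_at <= '2013-11-30 23:59:59'
      )
    ORDER BY l.email
""").fetchall()

print(lapsed_join)
```

Only a@x.test qualifies: b@x.test paid inside the window, and c@x.test has no transaction before it.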