Postgres Aggregate function in where clause - postgresql

I am trying to get the count of leads that enrolled in the last 180 days. I have the subquery for leads as
left outer join (select nl.zip, count(distinct lsu.lead_id) as leadCount
from lead_status_updates lsu
join normalized_lead_locations nl on nl.lead_id = lsu.lead_id
where category in ('PropertySell','PropertySearch')
AND lower(status) not like ('invite%')
and DATE_PART('day', current_date - min(lsu.created)) < 181
group by nl.zip) lc on lc.Zip = cbsa.zip
but I get an error message saying that aggregate functions are not allowed in the WHERE clause. Does anyone know how to fix this?
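One possible fix, sketched under assumptions: WHERE filters individual rows before grouping, while HAVING filters whole groups after aggregation, so a condition on min(lsu.created) belongs in a HAVING clause placed after GROUP BY. The sketch below uses Python's sqlite3 module as a stand-in for PostgreSQL so that it runs without a server. The leads table is a simplified, hypothetical version of the tables in the question, and the cutoff date is passed in as a parameter instead of being computed with current_date.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
# Hypothetical, flattened stand-in for lead_status_updates + normalized_lead_locations.
conn.execute("CREATE TABLE leads (lead_id INTEGER, zip TEXT, created TEXT)")

today = date.today()
rows = [
    (1, "10001", (today - timedelta(days=10)).isoformat()),
    (2, "10001", (today - timedelta(days=30)).isoformat()),
    (3, "20002", (today - timedelta(days=400)).isoformat()),  # earliest update is too old
    (3, "20002", (today - timedelta(days=5)).isoformat()),
]
conn.executemany("INSERT INTO leads VALUES (?, ?, ?)", rows)

cutoff = (today - timedelta(days=180)).isoformat()

# The aggregate condition sits in HAVING, which is evaluated per group
# after GROUP BY, instead of in WHERE, which is evaluated per row.
query = """
    SELECT zip, COUNT(DISTINCT lead_id) AS lead_count
    FROM leads
    GROUP BY zip
    HAVING MIN(created) >= ?
    ORDER BY zip
"""
result = conn.execute(query, (cutoff,)).fetchall()
print(result)  # [('10001', 2)]
```

In the PostgreSQL subquery from the question, the same idea would mean dropping the DATE_PART line from WHERE and adding something like `having min(lsu.created) >= current_date - 180` after `group by nl.zip`. Note that this filters per zip group, as the question's grouping implies; if the 180-day rule is meant per lead, the minimum would have to be computed per lead_id first.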

Related

Select distinct on value must appear in group by

I am encountering an error when trying to run the below query: "column "v.visit_id" must appear in the GROUP BY clause or be used in an aggregate function."
I believe I am already using this column in an aggregate function on line 2: count(v.visit_id) as total_visits. Does that not satisfy the requirement in the error message? I can't add the column to the GROUP BY directly, since that would mess up my output.
My end goal is to select distinct visit IDs while also only grouping the output by physician names.
select distinct on (v.visit_id)
count(v.visit_id) as total_visits,
sum(mad2.nsma1_ans::time - mad.nsma1_ans::time) as or_hours_utilized,
sum(esla1_bt_end[1] - esla1_bt_beg[1]) as total_block_hours,
sum(extract(epoch from mad2.nsma1_ans::time) - extract(epoch from mad.nsma1_ans::time)) /
(sum(extract(epoch from esla1_bt_end[1])) - sum(extract(epoch from esla1_bt_beg[1]))) * 100 as or_percentage,
pt1.phys1_name as surgeon
from visit as v
inner join pat_phy_relation_table as pprt
on pprt.patphys_pat_num = v.visit_id
and pprt.patphys_rel_type = 'ATTENDING'
inner join physician_table1 as pt1
on pt1.phys1_num = pprt.patphys_phy_num
and pt1.phys1_arid = v.visit_arid --need to confirm how to handle ARIDs
inner join ews_location_table2 elt2
on lpad(pt1.phys1_num::varchar, 6, '0') = any (elt2.esla1_bt_surg)
and esla1_loca in ('OR1','OR2','OR3','OR4')
and esla1_date between '2021-09-01' and '2021-09-30'
and esla1_seid = pt1.phys1_arid
inner join multi_app_documentation mad2
on mad2.nsma1_patnum = v.visit_id
and mad2.nsma1_code = 'OROUT' --only pulling visits/physicians with an OROUT
and mad2.nsma1_ans !~ '[x,X,C,END,S]' --removing non-standard data
and mad2.nsma1_ans != '' and mad2.nsma1_ans != '0' and mad2.nsma1_ans != '1' and mad2.nsma1_ans != '0000'
inner join multi_app_documentation mad
on mad.nsma1_patnum = v.visit_id
and mad.nsma1_code = 'ORINTIME' --only pulling visits/physicians with an ORINTIME
where v.visit_admit_date between '2021-09-01' and '2021-09-30'
and v.visit_arid = 5
group by pt1.phys1_name
The problem is that distinct on (v.visit_id) is not an aggregate function, so v.visit_id is being used outside of one. You'd need to add v.visit_id to the group by.
select
distinct on (v.visit_id)
count(v.visit_id) as total_visits,
...
group by v.visit_id, pt1.phys1_name
However, it makes no sense to use distinct on something you're grouping by. The group by will already only show one row for each visit_id.
select
v.visit_id,
count(v.visit_id) as total_visits,
...
group by v.visit_id, pt1.phys1_name
If v.visit_id is a primary key or otherwise unique, this also makes no sense: each visit_id will only appear once, and your count will always be 1. You probably want to leave it out entirely.
select
count(v.visit_id) as total_visits
...
group by pt1.phys1_name
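A small sketch of the points above, using Python's sqlite3 module as a stand-in for PostgreSQL. The visits and docs tables and their contents are hypothetical and much simpler than the schema in the question. It shows that grouping by a unique visit_id always yields a count of 1, that grouping by surgeon alone gives per-surgeon totals, and that COUNT(DISTINCT ...) is one way to count each visit once when a join repeats rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (visit_id INTEGER PRIMARY KEY, surgeon TEXT)")
conn.execute("CREATE TABLE docs (visit_id INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(1, "Adams"), (2, "Adams"), (3, "Baker")],
)
# Visit 1 has two documentation rows, so a join repeats it.
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [(1, "OROUT"), (1, "OROUT"), (2, "OROUT"), (3, "OROUT")],
)

# Grouping by the unique key: every group holds exactly one row.
per_visit = conn.execute(
    "SELECT visit_id, surgeon, COUNT(visit_id) FROM visits "
    "GROUP BY visit_id, surgeon ORDER BY visit_id"
).fetchall()

# Grouping by surgeon only: one row per surgeon with the real total.
per_surgeon = conn.execute(
    "SELECT surgeon, COUNT(visit_id) FROM visits "
    "GROUP BY surgeon ORDER BY surgeon"
).fetchall()

# After a row-multiplying join, COUNT overcounts; COUNT(DISTINCT) does not.
joined = conn.execute(
    "SELECT v.surgeon, COUNT(v.visit_id), COUNT(DISTINCT v.visit_id) "
    "FROM visits v JOIN docs d ON d.visit_id = v.visit_id "
    "GROUP BY v.surgeon ORDER BY v.surgeon"
).fetchall()

print(per_visit)    # [(1, 'Adams', 1), (2, 'Adams', 1), (3, 'Baker', 1)]
print(per_surgeon)  # [('Adams', 2), ('Baker', 1)]
print(joined)       # [('Adams', 3, 2), ('Baker', 1, 1)]
```

Whether count(distinct v.visit_id) is appropriate for the original query depends on whether its joins can return several rows per visit, which cannot be told from the question alone.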

Why is this nested INNER JOIN not working in POSTGRESQL?

I am using the Northwind database and working in pgAdmin, and my query looks like this at the moment:
SELECT
TO_CHAR (o.ShippedDate, 'yyyy.MM') AS Month
,o.OrderID
,Total
,SUM (Total) OVER (PARTITION BY TO_CHAR (ShippedDate, 'yyyy.MM') ORDER BY O.OrderID) AS Running_Total
FROM public.orders O
INNER JOIN (
SELECT OrderID, SUM(Quantity*UnitPrice) AS Total
FROM public.order_details
GROUP BY OrderID
ORDER BY OrderID
) OD ON O.OrderID = OD.OrderID
WHERE
TO_CHAR (o.ShippedDate, 'yyyy.MM') IS NOT NULL
It is not working; it says:
ERROR: column "o.shippeddate" must appear in the GROUP BY clause or be used in an aggregate function
LINE 2: TO_CHAR (o.ShippedDate, 'yyyy.MM') AS Month
Can you help me figure out what the issue could be? Thanks!
Edit: I have fixed the query above, so it is now the correct version.
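For reference, the pattern in the query above (per-order totals aggregated in a subquery, then a window SUM partitioned by month) can be sketched as follows. A SUM without an OVER (...) clause is a plain aggregate, which would require every other selected column to appear in GROUP BY; that is presumably where the quoted error came from. This is only an illustration: it uses Python's sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or newer), strftime replaces TO_CHAR, and the tables hold a few made-up rows instead of the Northwind data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, shipped_date TEXT);
    CREATE TABLE order_details (order_id INTEGER, quantity INTEGER, unit_price REAL);
    INSERT INTO orders VALUES (1, '1996-07-04'), (2, '1996-07-10'), (3, '1996-08-01');
    INSERT INTO order_details VALUES
        (1, 2, 10.0), (1, 1, 5.0),   -- order 1 total: 25
        (2, 3, 10.0),                -- order 2 total: 30
        (3, 4, 2.5);                 -- order 3 total: 10
""")

query = """
    SELECT strftime('%Y.%m', o.shipped_date) AS month,
           o.order_id,
           od.total,
           -- OVER (...) makes SUM a window function, so no GROUP BY is needed here
           SUM(od.total) OVER (
               PARTITION BY strftime('%Y.%m', o.shipped_date)
               ORDER BY o.order_id
           ) AS running_total
    FROM orders o
    JOIN (
        SELECT order_id, SUM(quantity * unit_price) AS total
        FROM order_details
        GROUP BY order_id
    ) od ON od.order_id = o.order_id
    WHERE o.shipped_date IS NOT NULL
    ORDER BY month, o.order_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The running total restarts for each month because the partition key is the formatted month, and it accumulates in order_id order within that month.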

Dividing 2 count statements in Postgresql

I have a question about dividing the 2 count statements below, which gives me the error shown underneath.
(SELECT COUNT(transactions.transactionNumber)
FROM transactions
INNER JOIN account ON account.sfid = transactions.accountsfid
INNER JOIN transactionLineItems ON transactions.transactionNumber = transactionLineItems.transactionNumber
INNER JOIN products ON transactionLineItems.USIM = products.USIM
WHERE products.gender = 'male' AND products.agegroup = 'adult' AND transactions.transactionDate >= current_date - interval '730' day)/
(SELECT COUNT(transactions.transactionNumber)
FROM transactions
WHERE transactions.transactionDate >= current_date - interval '730' day)
ERROR: syntax error at or near "/"
LINE 6: ...tions.transactionDate >= current_date - interval '730' day)/
I think the problem is that my count statements return tables, and dividing one table by another is what fails. How can I make this division work?
Afterwards I want to check the result against a percentage, e.g. < 0.2.
Can anyone help me with this?
Is that your complete query? Something like this works in Postgres 10:
SELECT
(SELECT COUNT(id) FROM test WHERE state = false) / (SELECT COUNT(id) FROM test WHERE state = true) as y
The extra SELECT in front of the two subqueries and the division is what's important. Without it, I also get the error you mentioned.
See also my DB Fiddle version of this query.
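One caveat when the result is later compared against a fraction such as 0.2: COUNT returns an integer type, and dividing one integer by another truncates in PostgreSQL, so a ratio below 1 comes out as 0. The sketch below shows the effect and one way around it (multiplying by 1.0 before dividing). It uses Python's sqlite3 module as a stand-in for PostgreSQL, where integer division truncates in the same way; the test table follows the answer above, with 0 and 1 standing in for false and true, and its contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, state INTEGER)")
# One row with state = 0 (false) and four rows with state = 1 (true).
conn.executemany("INSERT INTO test (state) VALUES (?)", [(0,), (1,), (1,), (1,), (1,)])

# Integer / integer truncates: 1 / 4 evaluates to 0.
truncated = conn.execute("""
    SELECT (SELECT COUNT(id) FROM test WHERE state = 0)
         / (SELECT COUNT(id) FROM test WHERE state = 1)
""").fetchone()[0]

# Multiplying by 1.0 first forces non-integer arithmetic for the division.
ratio, below = conn.execute("""
    SELECT r.ratio, r.ratio < 0.2
    FROM (
        SELECT 1.0 * (SELECT COUNT(id) FROM test WHERE state = 0)
                   / (SELECT COUNT(id) FROM test WHERE state = 1) AS ratio
    ) AS r
""").fetchone()

print(truncated)     # 0
print(ratio, below)  # 0.25 0
```

In PostgreSQL a zero denominator raises a division-by-zero error, so wrapping the second subquery in NULLIF(..., 0) is a common guard when the filtered count can be empty.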

HQL equivalent for postgresql query

I am trying to figure out the HQL equivalent of my query, which has 2 subqueries. I take the max amount per month over the past 6 months, and then I compute the average of those monthly results. Since the rows have a version column, I also need the max version for each specific row.
Here is my query, I'm using postgres by the way. Any help would be appreciated as I'm really having a hard time. Thanks in advance.
select avg(amount1) as maxField1 from
(
select max(amount1) as amount1 from table1 a
where a.id = :id
and a.date between :startDate
and :endDate
and a.version =
(
select max(b.version) from table1 b
where a.id = b.id
and a.date = b.date
)
group by to_char(a.date, 'YYYYMM')
);

Creating 'Empty' Records for Days of the Month Without Records

I have a very simple postgres (9.3) query that looks like this:
SELECT a.date, b.status
FROM sis.table_a a
JOIN sis.table_b b ON a.thing_id = b.thing_id
WHERE EXTRACT(MONTH FROM a.date) = 06
AND EXTRACT(YEAR FROM a.date) = 2015
Some days of the month of June do not exist in table_a and thus are obviously not joined to table_b. What is the best way to create records for these missing days and assign a placeholder (e.g. 'EMPTY') to their 'status' column? Is this even possible to do using pure SQL?
Basically, you need LEFT JOIN and it looks like you also need generate_series() to provide the full set of days:
SELECT d.date
, a.date IS NOT NULL AS a_exists
, COALESCE(b.status, 'status_missing') AS status
FROM (
SELECT date::date
FROM generate_series('2015-06-01'::date
, '2015-06-30'::date
, interval '1 day') date
) d
LEFT JOIN sis.table_a a USING (date)
LEFT JOIN sis.table_b b USING (thing_id)
ORDER BY 1;
Use sargable WHERE conditions. What you had cannot use a plain index on date and has to default to a much more expensive sequential scan. (There are no more WHERE conditions in my final query.)
Aside: don't use the basic type name (and reserved word in standard SQL) date as an identifier.
Related (2nd chapter):
PostgreSQL: running count of rows for a query 'by minute'
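The calendar-plus-LEFT JOIN idea from the answer above can be tried out with the following sketch. It runs on Python's sqlite3 module as a stand-in for PostgreSQL, so a recursive CTE takes the place of generate_series(), which SQLite does not ship by default. The table contents are made up, and the date column is called day here to avoid using date as an identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (day TEXT, thing_id INTEGER);
    CREATE TABLE table_b (thing_id INTEGER, status TEXT);
    INSERT INTO table_a VALUES ('2015-06-01', 1), ('2015-06-03', 2);
    INSERT INTO table_b VALUES (1, 'OPEN'), (2, 'CLOSED');
""")

query = """
    WITH RECURSIVE days(day) AS (
        -- Stand-in for generate_series: one row per day of June 2015.
        SELECT '2015-06-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM days WHERE day < '2015-06-30'
    )
    SELECT d.day, COALESCE(b.status, 'EMPTY') AS status
    FROM days d
    LEFT JOIN table_a a ON a.day = d.day
    LEFT JOIN table_b b ON b.thing_id = a.thing_id
    ORDER BY d.day
"""
rows = conn.execute(query).fetchall()
print(len(rows))  # 30
print(rows[:3])   # [('2015-06-01', 'OPEN'), ('2015-06-02', 'EMPTY'), ('2015-06-03', 'CLOSED')]
```

Every day of the month appears exactly once here because each day has at most one row in table_a; days with several rows would appear several times. As for the sargable remark: if the original query were kept, its two EXTRACT conditions could be replaced by a range such as `a.date >= '2015-06-01' AND a.date < '2015-07-01'`, which a plain index on the column can serve.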