Select distinct on value must appear in group by - postgresql

I get an error when I run the query below: "column "v.visit_id" must appear in the GROUP BY clause or be used in an aggregate function."
I believe I am already using this column in an aggregate function on line 2, count(v.visit_id) as total_visits. Does that not satisfy the requirement? I can't add the column to the GROUP BY directly, since that would change my output.
My end goal is to select distinct visit IDs while grouping the output by physician name only.
select distinct on (v.visit_id)
count(v.visit_id) as total_visits,
sum(mad2.nsma1_ans::time - mad.nsma1_ans::time) as or_hours_utilized,
sum(esla1_bt_end[1] - esla1_bt_beg[1]) as total_block_hours,
sum(extract(epoch from mad2.nsma1_ans::time) - extract(epoch from mad.nsma1_ans::time)) /
(sum(extract(epoch from esla1_bt_end[1])) - sum(extract(epoch from esla1_bt_beg[1]))) * 100 as or_percentage,
pt1.phys1_name as surgeon
from visit as v
inner join pat_phy_relation_table as pprt
on pprt.patphys_pat_num = v.visit_id
and pprt.patphys_rel_type = 'ATTENDING'
inner join physician_table1 as pt1
on pt1.phys1_num = pprt.patphys_phy_num
and pt1.phys1_arid = v.visit_arid --need to confirm how to handle ARIDs
inner join ews_location_table2 elt2
on lpad(pt1.phys1_num::varchar, 6, '0') = any (elt2.esla1_bt_surg)
and esla1_loca in ('OR1','OR2','OR3','OR4')
and esla1_date between '2021-09-01' and '2021-09-30'
and esla1_seid = pt1.phys1_arid
inner join multi_app_documentation mad2
on mad2.nsma1_patnum = v.visit_id
and mad2.nsma1_code = 'OROUT' --only pulling visits/physicians with an OROUT
and mad2.nsma1_ans !~ '[x,X,C,END,S]' --removing non-standard data
and mad2.nsma1_ans != '' and mad2.nsma1_ans != '0' and mad2.nsma1_ans != '1' and mad2.nsma1_ans != '0000'
inner join multi_app_documentation mad
on mad.nsma1_patnum = v.visit_id
and mad.nsma1_code = 'ORINTIME' --only pulling visits/physicians with an ORINTIME
where v.visit_admit_date between '2021-09-01' and '2021-09-30'
and v.visit_arid = 5
group by pt1.phys1_name

The problem is distinct on (v.visit_id) is not an aggregate function. You'd need to add it to the group by.
select
distinct on (v.visit_id)
count(v.visit_id) as total_visits,
...
group by v.visit_id, pt1.phys1_name
However, it makes no sense to use distinct on something you're grouping by. The group by will already only show one row for each visit_id.
select
v.visit_id,
count(v.visit_id) as total_visits,
...
group by v.visit_id, pt1.phys1_name
If v.visit_id is a primary key or unique this also makes no sense. Each visit_id will only appear once and your count will always be 1. You probably want to leave it out entirely.
select
count(v.visit_id) as total_visits
...
group by pt1.phys1_name
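If the joins can return several rows per visit, count(DISTINCT v.visit_id) is one way to count each visit once per surgeon without DISTINCT ON. The sketch below is a minimal illustration, not the asker's schema: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the two tiny tables (visit, doc) and their rows are hypothetical.

```python
import sqlite3

def visits_per_surgeon():
    """Return (surgeon, joined_rows, distinct_visits) for each surgeon."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- hypothetical, simplified tables (not the asker's real schema)
        CREATE TABLE visit (visit_id INTEGER PRIMARY KEY, surgeon TEXT);
        CREATE TABLE doc (visit_id INTEGER, code TEXT);
        INSERT INTO visit VALUES (1, 'Adams'), (2, 'Adams'), (3, 'Baker');
        -- several documentation rows per visit fan the join out
        INSERT INTO doc VALUES (1, 'a'), (1, 'b'), (2, 'a'),
                               (3, 'a'), (3, 'b'), (3, 'c');
    """)
    rows = con.execute("""
        SELECT v.surgeon,
               count(v.visit_id)          AS joined_rows,
               count(DISTINCT v.visit_id) AS distinct_visits
        FROM visit AS v
        JOIN doc AS d ON d.visit_id = v.visit_id
        GROUP BY v.surgeon
        ORDER BY v.surgeon
    """).fetchall()
    con.close()
    return rows

print(visits_per_surgeon())
```

With this sample data, the plain count reports the number of joined rows (3 for each surgeon), while the DISTINCT count reports 2 visits for Adams and 1 for Baker.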

Related

why am I getting ERROR: syntax error at end of input?

I keep getting 'syntax error at end of input' and don't know why.
What I want to do is divide the disease count by the total count, and show condition_id alongside the result.
select disease.condition_id, (disease::float/total::float) as prevalence
from (
select condition_id, count(person_id)
from a.condition
where condition_id=316139
group by condition_id
) as disease
join (
select count(distinct person_id) as total
from a.person
)as total;
Can someone please help me with this?
Thanks!
I don't have an exact fix for your current syntax, but I would phrase this query as a join with an aggregation over the entire tables:
SELECT
-- cast to float so the division is not an integer division
CAST(COUNT(*) FILTER (WHERE c.condition_id = 316139) AS float) /
COUNT(DISTINCT p.person_id) AS prevalence
FROM a.person p
LEFT JOIN a.condition c
ON p.person_id = c.person_id;
The main reason for your error is the missing join condition. The join operator requires a join condition (defined using ON).
But given the structure of your query, I think you don't actually want an inner join, but a cross join between the two.
Additionally, the expression disease::float tries to cast a complete row to a float value, not a single column. I assume you wanted to alias the count aggregate to something, e.g. count(person_id) as num_person
Using total::float is also ambiguous, as you have both a sub-query alias and a column with that name. That is highly confusing and best avoided.
select disease.condition_id,
(disease.num_person::float / total.total::float) as prevalence
from (
select condition_id, count(person_id) as num_person
from a.condition
where condition_id = 316139
group by condition_id
) as disease
cross join (
select count(distinct person_id) as total
from a.person
) as total
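The effect of the cast can be checked on a small data set. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL (so CAST ... AS REAL replaces ::float); the table names person and conditions, the alias pop, and the sample rows are assumptions made for illustration.

```python
import sqlite3

def prevalence():
    """Return (condition_id, integer_ratio, float_ratio) for one condition."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- hypothetical sample data: 4 persons, 1 of them has the condition
        CREATE TABLE person (person_id INTEGER PRIMARY KEY);
        CREATE TABLE conditions (person_id INTEGER, condition_id INTEGER);
        INSERT INTO person VALUES (1), (2), (3), (4);
        INSERT INTO conditions VALUES (1, 316139), (2, 999);
    """)
    row = con.execute("""
        SELECT disease.condition_id,
               disease.num_person / pop.total               AS int_ratio,
               CAST(disease.num_person AS REAL) / pop.total AS prevalence
        FROM (
            SELECT condition_id, count(person_id) AS num_person
            FROM conditions
            WHERE condition_id = 316139
            GROUP BY condition_id
        ) AS disease
        CROSS JOIN (
            SELECT count(DISTINCT person_id) AS total
            FROM person
        ) AS pop
    """).fetchone()
    con.close()
    return row

print(prevalence())
```

Without the cast, dividing one integer count by another truncates to 0; with it, the same data gives 0.25. Giving the second sub-query a different alias than its column (pop versus total) also avoids the naming ambiguity.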

Postgresql - how to combine these two queries

I am trying to combine these two queries into one.
Each query returns the number of accepted or rejected applications for a given operator.
I want the result in three columns: the number of accepted applications, the number of rejected applications, and the operator they are assigned to.
select count(applications.id) as number_of_applications, operator_id
from applications
inner join travel p on applications.id = p.application_id
inner join trip_details sp on p.id = sp.trip_id
where application_status ilike '%rejected%'
group by operator_id
order by number_of_applications desc;
select count(applications.id) as number_of_applications, operator_id
from applications
inner join travel p on applications.id = p.application_id
inner join trip_details sp on p.id = sp.trip_id
where application_status ilike '%accepted%'
group by operator_id
order by number_of_applications desc;
With conditional aggregation:
select
sum(case when application_status ilike '%accepted%' then 1 else 0 end) as number_of_applications_accepted,
sum(case when application_status ilike '%rejected%' then 1 else 0 end) as number_of_applications_rejected,
operator_id
from applications
inner join travel p on applications.id = p.application_id
inner join trip_details sp on p.id = sp.trip_id
where (application_status ilike '%rejected%') or (application_status ilike '%accepted%')
group by operator_id;
You can add the ordering that you prefer.
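A small runnable check of conditional aggregation, as a sketch only: it uses Python's built-in sqlite3 module, a single hypothetical applications table without the two joins, and SQLite's LIKE (case-insensitive for ASCII by default) in place of PostgreSQL's ILIKE.

```python
import sqlite3

def applications_by_operator():
    """Return (operator_id, accepted, rejected) per operator."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- hypothetical sample data
        CREATE TABLE applications (
            id INTEGER PRIMARY KEY,
            operator_id INTEGER,
            application_status TEXT
        );
        INSERT INTO applications VALUES
            (1, 10, 'accepted'),
            (2, 10, 'Rejected by manager'),
            (3, 10, 'accepted'),
            (4, 20, 'rejected'),
            (5, 20, 'pending');
    """)
    rows = con.execute("""
        SELECT operator_id,
               sum(CASE WHEN application_status LIKE '%accepted%'
                        THEN 1 ELSE 0 END) AS accepted,
               sum(CASE WHEN application_status LIKE '%rejected%'
                        THEN 1 ELSE 0 END) AS rejected
        FROM applications
        WHERE application_status LIKE '%accepted%'
           OR application_status LIKE '%rejected%'
        GROUP BY operator_id
        ORDER BY operator_id
    """).fetchall()
    con.close()
    return rows

print(applications_by_operator())
```

Each CASE expression contributes 1 only for rows with the matching status, so a single pass over the table yields both counts per operator. The pending row is removed by the WHERE clause and affects neither sum.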

Postgres string_agg function not recognized as aggregate function

I am attempting to run this query
SELECT u.*, string_agg(CAST(uar.roleid AS VARCHAR(100)), ',') AS roleids, string_agg(CAST(r.role AS VARCHAR(100)), ',') AS systemroles
FROM idpro.users AS u
INNER JOIN idpro.userapplicationroles AS uar ON u.id = uar.userid
INNER JOIN idpro.roles AS r ON r.id = uar.roleid
GROUP BY u.id, uar.applicationid
HAVING u.organizationid = '77777777-f892-4f4a-8328-c31df32bd6ba'
AND uar.applicationid = 'd88fbf05-c048-4697-8bf3-036f39897183'
AND (u.statusid = '7f9f0b75-44b7-4216-bf2a-03abc47dcff8')
AND uar.roleid IN ('cc9ada1c-fa21-400b-be98-c563ebb65a9c','de087148-4788-43da-89e2-dd7dff097735');
However, I'm getting an error stating that
ERROR: column "uar.roleid" must appear in the GROUP BY clause or be used in an aggregate function
LINE 9: AND uar.roleid IN ('cc9ada1c-fa21-400b-be98-c563ebb65a9c','...
string_agg() IS an aggregate function, is it not? My intent, if it isn't obvious, is to return each user record with the roleids and rolenames in comma-delimited lists. If I am doing everything wrong, could you please point me in the right direction?
You are filtering individual rows before they are aggregated, so the conditions belong in a WHERE clause, not in HAVING. This tutorial is worth reading.
SELECT u.*,
string_agg(CAST(uar.roleid AS VARCHAR(100)), ',') AS roleids,
string_agg(CAST(r.role AS VARCHAR(100)), ',') AS systemroles
FROM idpro.users AS u
INNER JOIN idpro.userapplicationroles AS uar ON u.id = uar.userid
INNER JOIN idpro.roles AS r ON r.id = uar.roleid
WHERE u.organizationid = '77777777-f892-4f4a-8328-c31df32bd6ba'
AND uar.applicationid = 'd88fbf05-c048-4697-8bf3-036f39897183'
AND (u.statusid = '7f9f0b75-44b7-4216-bf2a-03abc47dcff8')
AND uar.roleid IN ('cc9ada1c-fa21-400b-be98-c563ebb65a9c','de087148-4788-43da-89e2-dd7dff097735')
GROUP BY u.id, uar.applicationid;
The HAVING clause is helpful for filtering the aggregated values or the groups.
Since you are grouping by u.id, the table's primary key, you have access to every column of the u table, so for those columns you can use either a where clause or a having clause.
For uar.applicationid, it is part of the group by so you can also use either a where or a having.
uar.roleid is not part of the group by clause, so to be usable in the having clause, you would have to consider the aggregated value.
The following example keeps only the groups whose aggregated string is longer than 10 characters.
HAVING length(string_agg(CAST(uar.roleid AS VARCHAR(100)), ',')) > 10
A more common usage, on numeric fields, is to keep only the groups whose row count exceeds a threshold (having count(*) > 2) or whose sum exceeds one (having sum(vacation_days) > 21).
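The difference between the two clauses can be seen on a few rows. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for PostgreSQL (group_concat plays the role of string_agg) and a hypothetical user_roles table: WHERE removes rows before grouping, HAVING removes whole groups afterwards.

```python
import sqlite3

def users_with_multiple_roles(app_id):
    """Return (userid, roleids) for users holding more than one role in app_id."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- hypothetical sample data
        CREATE TABLE user_roles (userid INTEGER, applicationid TEXT, roleid TEXT);
        INSERT INTO user_roles VALUES
            (1, 'A', 'r1'), (1, 'A', 'r2'),
            (2, 'A', 'r1'),
            (3, 'B', 'r1'), (3, 'B', 'r2');
    """)
    rows = con.execute("""
        SELECT userid, group_concat(roleid, ',') AS roleids
        FROM user_roles
        WHERE applicationid = ?   -- row filter, applied before grouping
        GROUP BY userid
        HAVING count(*) > 1       -- group filter, applied after aggregation
        ORDER BY userid
    """, (app_id,)).fetchall()
    con.close()
    return rows

print(users_with_multiple_roles('A'))
```

For application 'A', user 2 is dropped by HAVING because the group has a single row, and user 3 never reaches the grouping step because WHERE already removed those rows. The order of the role ids inside the concatenated string is not guaranteed.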

How to avoid duplicates in the STRING_AGG function

My query is below:
select
u.Id,
STRING_AGG(sf.Naziv, ', ') as 'Ustrojstvena jedinica',
ISNULL(CONVERT(varchar(200), (STRING_AGG(TRIM(p.Naziv), ', ')), 121), '')
as 'Partner'
from Ugovor as u
left join VezaUgovorPartner as vup
on vup.UgovorId = u.Id AND vup.IsDeleted = 'false'
left join [TEST_MaticniPodaci2].dbo.Partner as p
on p.PartnerID = vup.PartnerId
left join [dbo].[VezaUgovorUstrojstvenaJedinica] as vuu
on vuu.UgovorId = u.Id
left join [TEST_MaticniPodaci2].hcphs.SifZavod as sf
on sf.Id = vuu.UstrojstvenaJedinicaId
left join [dbo].[SifVrstaUgovora] as vu
on u.VrstaUgovoraId = vu.Id
group by u.Id, sf.Naziv
My problem is that a contract can have one sf.Naziv or several, so I need to show one value when there is one and all of them when there are two or more. Right now, when there is only one sf.Naziv, the query returns that same name twice, because there are several matching p.Naziv records.
I have no idea how to use DISTINCT inside the STRING_AGG function.
Any other solutions are welcome, but I think it should work with DISTINCT.
It looks like distinct won't work, so what you should do is put your whole query in a subquery, remove the duplicates there, then do STRING_AGG on the data that has no duplicates.
SELECT STRING_AGG(data, ', ')
FROM (
SELECT DISTINCT data FROM ...
) AS d
I like this format for distinct values:
(the alias d is required, but you can use any name there)
SELECT STRING_AGG(LoadNumber, ',') as LoadNumbers FROM (SELECT DISTINCT LoadNumber FROM [ASN]) d
A sample query to remove duplicates while using STRING_AGG().
WITH cte AS (
SELECT DISTINCT product
FROM activities
)
SELECT STRING_AGG(product, ',') products
FROM cte;
Or you can use the following query. The result is the same:
SELECT STRING_AGG(product, ',') as products
from (
SELECT product
FROM Activities
GROUP BY product
) as _ ;
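The same de-duplication works per group, which is closer to the original question (one list per contract). The sketch below is an illustration under assumptions: it runs on Python's built-in sqlite3 module rather than SQL Server, uses group_concat in place of STRING_AGG, and the joined table with its rows is a hypothetical stand-in for the output of the fanned-out joins.

```python
import sqlite3

def units_per_contract():
    """Return {contract_id: sorted list of distinct unit names}."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- hypothetical rows as they might look after the joins:
        -- the same unit repeats once per matching partner
        CREATE TABLE joined (contract_id INTEGER, unit TEXT);
        INSERT INTO joined VALUES
            (1, 'North'), (1, 'North'), (1, 'South'),
            (2, 'East'),  (2, 'East');
    """)
    rows = con.execute("""
        SELECT contract_id, group_concat(unit, ', ') AS units
        FROM (
            SELECT DISTINCT contract_id, unit   -- remove duplicates first
            FROM joined
        ) AS d
        GROUP BY contract_id                    -- then aggregate per contract
        ORDER BY contract_id
    """).fetchall()
    con.close()
    # the order inside the aggregated string is not guaranteed, so sort it
    return {cid: sorted(units.split(', ')) for cid, units in rows}

print(units_per_contract())
```

Note that the outer query groups only by the contract id. Grouping by the aggregated column as well would produce one row per distinct name instead of one list per contract.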

MariaDB - order by with more selects

I have this SQL:
select * from `posts`
where `posts`.`deleted_at` is null
and `expire_at` >= '2017-03-26 21:23:42.000000'
and (
select count(distinct tags.id) from `tags`
inner join `post_tag` on `tags`.`id` = `post_tag`.`tag_id`
where `post_tag`.`post_id` = `posts`.`id`
and (`tags`.`tag` like 'PHP' or `tags`.`tag` like 'pop' or `tags`.`tag` like 'UI')
) >= 1
Is it possible to order the results by the number of matching tags in each post?
Maybe by adding an alias?
Any information would help.
Convert your correlated subquery into a join:
select p.*
from posts p
join (
select pt.post_id,
count(distinct t.id) as tag_count
from tags t
inner join post_tag pt on t.id = pt.tag_id
where t.tag in ('PHP', 'pop', 'UI')
group by pt.post_id
) pt on p.id = pt.post_id
where p.deleted_at is null
and p.expire_at >= '2017-03-26 21:23:42.000000'
order by pt.tag_count desc;
Also, note that I replaced the chain of like ... or conditions with a single IN, because you are not matching any pattern, i.e. there is no % in the strings. A single IN is simpler here.
Also, if you choose table and column names that avoid reserved keywords, you don't need the backticks. They make a query harder to read.
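The join against a per-post tag count can be tried on a few rows. This is a sketch only, using Python's built-in sqlite3 module in place of MariaDB; the sample posts, tags and links are hypothetical, the deleted_at and expire_at filters are left out for brevity, and the derived table is named counts here to keep it apart from the inner pt alias.

```python
import sqlite3

def posts_by_tag_count():
    """Return (post_id, tag_count) ordered by the number of matching tags."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- hypothetical sample data
        CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT);
        CREATE TABLE post_tag (post_id INTEGER, tag_id INTEGER);
        INSERT INTO posts VALUES (1, 'first'), (2, 'second'), (3, 'third');
        INSERT INTO tags VALUES (1, 'PHP'), (2, 'pop'), (3, 'UI'), (4, 'Go');
        INSERT INTO post_tag VALUES (1, 1), (1, 2), (1, 3),
                                    (2, 1), (2, 4),
                                    (3, 4);
    """)
    rows = con.execute("""
        SELECT p.id, counts.tag_count
        FROM posts AS p
        JOIN (
            SELECT pt.post_id, count(DISTINCT t.id) AS tag_count
            FROM tags AS t
            JOIN post_tag AS pt ON t.id = pt.tag_id
            WHERE t.tag IN ('PHP', 'pop', 'UI')
            GROUP BY pt.post_id
        ) AS counts ON p.id = counts.post_id
        ORDER BY counts.tag_count DESC
    """).fetchall()
    con.close()
    return rows

print(posts_by_tag_count())
```

Post 3 carries none of the requested tags, so the inner join drops it, which matches the ">= 1" condition of the original correlated subquery.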