Postgres DISTINCT Query issue

SELECT DISTINCT "Users"."id" , "Users".name,
"Users"."surname", "Users"."gender",
"Users"."dob", "Searches"."start_date"
FROM "Users"
LEFT JOIN "Searches" ON "Users"."id" = "Searches"."user_id"
WHERE (SQRT( POW(69.1 * ("Users"."latitude" - 45.465454), 2) + POW(69.1 * (9.186515999999983 - "Users"."longitude") * COS("Users"."latitude" / 57.3), 2))) < 20
AND "Users"."status" = true
AND "Users"."id" != 18
AND "Searches"."activity" = 'clubbing'
AND "Users"."gender" = 'm'
AND "Users"."age" BETWEEN 18 AND 30
ORDER BY ABS( "Searches"."start_date" - date '2016-07-07' )
For some reason the above query returns the following error:
for SELECT DISTINCT, ORDER BY expressions must appear in select list
I only want to return unique users but I really don't know what's wrong with it.
Thanks for your help

Following what the error message says, I would simply include the expression ABS( "Searches"."start_date" - date '2016-07-07' ) in the SELECT list. There is no need to change your query logic.
The extra column, absdiffdate, can be discarded later when processing the result.
SELECT DISTINCT "Users"."id" , "Users".name,
"Users"."surname", "Users"."gender",
"Users"."dob", "Searches"."start_date",
ABS( "Searches"."start_date" - date '2016-07-07' ) absdiffdate
FROM "Users"
LEFT JOIN "Searches" ON "Users"."id" = "Searches"."user_id"
WHERE (SQRT( POW(69.1 * ("Users"."latitude" - 45.465454), 2) + POW(69.1 * (9.186515999999983 - "Users"."longitude") * COS("Users"."latitude" / 57.3), 2))) < 20
AND "Users"."status" = true
AND "Users"."id" != 18
AND "Searches"."activity" = 'clubbing'
AND "Users"."gender" = 'm'
AND "Users"."age" BETWEEN 18 AND 30
ORDER BY ABS( "Searches"."start_date" - date '2016-07-07' )
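Will this new column possibly result in more records when DISTINCT is applied?
I don't think so: the expression subtracts a constant from start_date, so rows with the same start_date always get the same absdiffdate. Since start_date is already in the SELECT list, the extra column cannot split any group of duplicate rows.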
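The claim above can be checked with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without a server; the table, the data, and the use of julianday() for the date difference are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical data: user 1 has a duplicated search date plus one other date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE searches (user_id INTEGER, start_date TEXT);
    INSERT INTO searches VALUES
        (1, '2016-07-05'), (1, '2016-07-05'), (1, '2016-07-10'),
        (2, '2016-07-07');
""")

# DISTINCT over the original columns only.
without_expr = conn.execute(
    "SELECT DISTINCT user_id, start_date FROM searches"
).fetchall()

# Same DISTINCT with the ORDER BY expression added to the select list.
with_expr = conn.execute("""
    SELECT DISTINCT user_id, start_date,
           ABS(julianday(start_date) - julianday('2016-07-07')) AS absdiffdate
    FROM searches
    ORDER BY absdiffdate
""").fetchall()

print(len(without_expr), len(with_expr))
```

Both queries return the same number of rows. Note that user 1 still appears twice in each result, because the two remaining rows differ in start_date; DISTINCT on its own removes duplicate rows, not duplicate users.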

In Postgres, you can use DISTINCT ON to get one row per user id:
SELECT DISTINCT ON (u."id") u."id", u.name, u."surname", u."gender", u."dob", s."start_date"
FROM "Users" u LEFT JOIN
"Searches" s
ON u."id" = s."user_id"
WHERE (SQRT( POW(69.1 * (u."latitude" - 45.465454), 2) + POW(69.1 * (9.186515999999983 - u."longitude") * COS(u."latitude" / 57.3), 2))) < 20 AND
u."status" = true AND
u."id" != 18 AND s."activity" = 'clubbing' AND
u."gender" = 'm' AND
u."age" BETWEEN 18 AND 30
ORDER BY u."id", ABS(s."start_date" - date '2016-07-07' );
Notice how table aliases make the query easier to write and to read.
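DISTINCT ON is specific to Postgres. To show what "one row per user id, keeping the search closest to the target date" means, here is a hedged sketch that emulates it with ROW_NUMBER() in SQLite (via Python's sqlite3, assuming a SQLite build with window functions, 3.25 or later). This is a swapped-in technique, not DISTINCT ON itself, and the tables and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE searches (user_id INTEGER, start_date TEXT);
    INSERT INTO users VALUES (1, 'Al'), (2, 'Bo');
    INSERT INTO searches VALUES
        (1, '2016-07-01'), (1, '2016-07-08'), (2, '2016-07-20');
""")

# Rank each user's searches by distance from the target date,
# then keep only the closest one (rn = 1) per user.
rows = conn.execute("""
    SELECT id, name, start_date
    FROM (
        SELECT u.id, u.name, s.start_date,
               ROW_NUMBER() OVER (
                   PARTITION BY u.id
                   ORDER BY ABS(julianday(s.start_date) - julianday('2016-07-07'))
               ) AS rn
        FROM users u
        JOIN searches s ON s.user_id = u.id
    ) AS ranked
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(rows)
```

Each user appears once, paired with the search date nearest to 2016-07-07. In the Postgres query, the leading ORDER BY expression must match the DISTINCT ON expression, so the result comes back ordered by user id first rather than by closeness.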

Related

Is there a smarter method to create series with different intervals for count within a query?

I want to create different intervals:
0 to 10 in steps of 1
10 to 100 in steps of 10
100 to 1,000 in steps of 100
1,000 to 10,000 in steps of 1,000
and then query a table to count the items in each interval.
with "series" as (
(SELECT generate_series(0, 10, 1) AS r_from)
union
(select generate_series(10, 90, 10) as r_from)
union
(select generate_series(100, 900, 100) as r_from)
union
(select generate_series(1000, 9000, 1000) as r_from)
order by r_from
)
, "range" as ( select r_from
, case
when r_from < 10 then r_from + 1
when r_from < 100 then r_from + 10
when r_from < 1000 then r_from + 100
else r_from + 1000
end as r_to
from series)
select r_from, r_to,(SELECT count(*) FROM "my_table" WHERE "my_value" BETWEEN r_from AND r_to) as "Anz."
FROM "range";

I think generate_series is the right approach. As an alternative to the UNIONs, simple math can generate the interval boundaries:
SELECT 0 as r_from,1 as r_to
UNION ALL
SELECT power(10, steps ) * v ,
power(10, steps ) * v + power(10, steps )
FROM generate_series(1, 9, 1) v
CROSS JOIN generate_series(0, 3, 1) steps
so the full query might look like this:
with "range" as
(
SELECT 0 as r_from,1 as r_to
UNION ALL
SELECT power(10, steps) * v ,
power(10, steps) * v + power(10, steps)
FROM generate_series(1, 9, 1) v
CROSS JOIN generate_series(0, 3, 1) steps
)
select r_from, r_to,(SELECT count(*) FROM "my_table" WHERE "my_value" BETWEEN r_from AND r_to) as "Anz."
FROM "range";
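The arithmetic behind the power(10, steps) * v trick can be sketched outside the database. The Python below is an illustration under assumptions (the function names and sample values are hypothetical): it builds the same buckets and counts values per bucket, once with both ends inclusive as BETWEEN does, and once with an exclusive upper bound.

```python
def make_buckets(max_exponent=3):
    """Build (r_from, r_to) pairs: step 1 up to 10, step 10 up to 100, and so on."""
    buckets = [(0, 1)]  # the extra row added by the first SELECT
    for exponent in range(max_exponent + 1):
        step = 10 ** exponent
        for v in range(1, 10):
            buckets.append((step * v, step * v + step))
    return sorted(buckets)


def count_per_bucket(values, buckets, inclusive_upper=True):
    """Count values per bucket; inclusive_upper=True mirrors SQL BETWEEN."""
    result = []
    for lo, hi in buckets:
        if inclusive_upper:
            n = sum(lo <= x <= hi for x in values)
        else:
            n = sum(lo <= x < hi for x in values)
        result.append((lo, hi, n))
    return result


buckets = make_buckets()
values = [0, 5, 10, 250, 9999]
between_total = sum(n for _, _, n in count_per_bucket(values, buckets))
half_open_total = sum(n for _, _, n in count_per_bucket(values, buckets, False))
print(len(buckets), between_total, half_open_total)
```

The buckets tile 0 to 10000 without gaps. Because BETWEEN includes both ends, a value sitting exactly on a boundary (5 and 10 in this sample) is counted in two neighbouring buckets, so the inclusive total exceeds the number of values; a half-open comparison counts each value once.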

Rather than generate_series, you could define integer ranges (int4range) and then test whether your value is contained in each range (see Range/Multirange Functions and Operators in the Postgres documentation). So:
with ranges (range_set) as
( values ( int4range(0,10,'[)') )
, ( int4range(10,100,'[)') )
, ( int4range(100,1000,'[)') )
, ( int4range(1000,10000,'[)') )
) --select * from ranges;
select lower(range_set) range_start
, upper(range_set) - 1 range_end
, count(my_value) cnt
from ranges r
left join my_table mt
on (mt.my_value <@ r.range_set)
group by r.range_set
order by lower(r.range_set);
Note the third parameter when creating the ranges: '[)' makes the lower bound inclusive and the upper bound exclusive, so adjacent ranges do not overlap.
Creating a CTE as above is fine if your ranges are static. If dynamic ranges are required, you can put the ranges into a table; changing the ranges then becomes a matter of managing that table. Not simple, but it does not require code updates. The query then reduces to just the main part of the above:
select lower(range_set) range_start
, upper(range_set) - 1 range_end
, count(my_value) cnt
from range_tab r
left join my_table mt
on (mt.my_value <@ r.range_set)
group by r.range_set
order by lower(r.range_set);

Select distinct where any of the filters are true

Hello, how do you select distinct rows when any of the filters is true?
Here is my statement, but it returns a Cartesian result set with many thousands of duplicate rows. I don't want duplicate rows.
SELECT
Distinct
r.DRAWING
, r.[DESC]
, r.CF3 as rCF3
, e.OP_PSI
From thk t
left join eng e on e.DRAWING = t.DRAWING
left join ref r on r.DRAWING = t.DRAWING
where t.SurveyNumber = @SurveyNumber
or CAST(e.L_RATE AS DECIMAL(10,0)) >= 14
or CAST(e.S_RATE AS DECIMAL(10,0)) >= 14
or (YEAR(GETDATE()) - YEAR(e.REPLDATE)) <= 2
or CAST(e.WALL_LOSS AS DECIMAL(10,2)) >= .30
or CAST(e.RMS AS DECIMAL(10,0)) <= 25
or t.CF1 = 'AI'
ORDER BY r.DRAWING;

count using subqueries in T-sql

The following is my query. I need to get the count of the doctor's visits for each patient in the query. The count isn't right and it's printing 2 rows for each patient.
SELECT
pf.PatientId
, p.Visit
, pf.first
, pf.last
, df.first
, df.last
, doc.reconcile_status
, doc.orderid
, count(p.visit)
FROM [CentricityPS].[dbo].[PatientVisit] p
, [CentricityPS].[dbo].[document] doc
, [CentricityPS].[dbo].[Patientprofile] pf
, [CentricityPS].[dbo].[doctorfacility] df
where df.pvid in ('1507023132004420', '1725527248154950', '1406648461000690')
and p.doctorid = df.DoctorFacilityId
and p.patientprofileid = pf.patientprofileid
and pf.pid = doc.pid
and pf.patientstatusmid = '-900'
and pf.PatientProfileId = p.PatientProfileId
-- and pf.PatientId = '8145'
-- and p.visit >= '2016-01-01' and p.visit <= '2016-07-01'
and not exists (select * from [CentricityPS].[dbo].[PatientVisit] p
where (p.visit > '2013-01-01' and p.visit <= '2016-01-01')
and p.patientprofileid = pf.patientprofileid and pf.patientstatusmid not in (-901) )
and not exists (select * from [CentricityPS].[dbo].[PatientVisit] p
where p.visit <= '2013-01-01'
and p.patientprofileid = pf.patientprofileid and pf.patientstatusmid not in (-901) )
-- and pf.patientid = '100293'
group by df.DoctorFacilityId, pf.PatientId, p.visit, pf.first, pf.last, df.first, df.last, doc.RECONCILE_STATUS, doc.ORDERID, p.PatientProfileId
order by df.doctorfacilityid, pf.patientid, p.visit desc
What am I doing wrong?
Help!!!

You're grouping on too many fields. If you just need the count of doctor's visits for each patient, only SELECT PatientProfile fields along with count(p.visit) and just include the same PatientProfile fields in the GROUP BY.
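The effect of grouping on too many fields can be seen in a small sketch. It uses Python's sqlite3 with hypothetical, simplified tables (not the CentricityPS schema), so treat it as an illustration of the GROUP BY behaviour rather than a drop-in fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (patient_id INTEGER, last TEXT);
    CREATE TABLE visit (patient_id INTEGER, visit TEXT);
    INSERT INTO patient VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO visit VALUES
        (1, '2016-02-01'), (1, '2016-03-01'), (2, '2016-02-15');
""")

# Grouping by the visit date as well: one group per visit, so every count is 1.
too_fine = conn.execute("""
    SELECT pf.patient_id, pf.last, v.visit, COUNT(v.visit)
    FROM patient pf JOIN visit v ON v.patient_id = pf.patient_id
    GROUP BY pf.patient_id, pf.last, v.visit
    ORDER BY pf.patient_id, v.visit
""").fetchall()

# Grouping by patient fields only: one row per patient with the real count.
per_patient = conn.execute("""
    SELECT pf.patient_id, pf.last, COUNT(v.visit)
    FROM patient pf JOIN visit v ON v.patient_id = pf.patient_id
    GROUP BY pf.patient_id, pf.last
    ORDER BY pf.patient_id
""").fetchall()

print(too_fine)
print(per_patient)
```

With the visit date in the GROUP BY, the patient with two visits produces two rows, each with a count of 1, which matches the symptom described in the question. Grouping by the patient columns alone collapses those into a single row with a count of 2.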

Group By on a calculated field using T-SQL

I have the following code:
select distinct m.property_id,
m.property_size,
count(f.request_id) WOs,
round(cast(m.[property_size] as float(10)) / cast((f.[request_id] * 1000) as float(5)),2) as PER_1k_SQFT
from T1 m, T2 f
where m.property_id = f.property_id
and datepart(year,request_date) = '2015'
and datepart(month, f.request_date) = '12'
group by m.property_id, m.property_size, round(cast(m.[property_size] as float(10)) / cast((f.[request_id] * 1000) as float(5)),2)
order by count(f.request_id) desc
My last column, PER_1k_SQFT, is all zeros. Why isn't it populating the result of the calculation?

T-sql Percent calculation stuffed with WHERE clauses doesn't work

I have t-sql as follows:
SELECT (COUNT(Intakes.fk_ClientID) * 100) / (
SELECT count(*)
FROM INTAKES
WHERE Intakes.AdmissionDate >= @StartDate
)
FROM Intakes
WHERE Intakes.fk_ReleasedFromID = '1'
AND Intakes.AdmissionDate >= @StartDate;
I'm trying to get the percentage of clients who have ReleasedFromID = 1 out of the subset of clients whose admission date falls in a certain range, but I get rows of 1's and 0's instead. If I take out the WHERE clauses on the date, I do get a percentage:
SELECT (COUNT(Intakes.fk_ClientID) * 100) / (
SELECT count(*)
FROM INTAKES
)
FROM Intakes
WHERE Intakes.fk_ReleasedFromID = '1';
This works fine: it counts the ClientIDs where ReleasedFromID = 1, multiplies by 100, and divides by the total number of rows in Intakes. But how do you compute the percentage with the WHERE clauses as above?

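After reading the comment from @Anssssss: multiply by 100.0 instead of 100 so that the division is done in decimal rather than integer arithmetic.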
SELECT (COUNT(Intakes.fk_ClientID) * 100.0) / (
SELECT count(*)
FROM INTAKES
) 'percentage'
FROM Intakes
WHERE Intakes.fk_ReleasedFromID = '1';
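The difference between multiplying by 100 and by 100.0 can be sketched with the date filter kept in both the numerator and the denominator. The example uses Python's sqlite3 as a stand-in (SQLite also truncates integer division); the table layout, the data, and the :start parameter are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE intakes (
        client_id INTEGER, released_from_id INTEGER, admission_date TEXT
    );
    INSERT INTO intakes VALUES
        (1, 1, '2016-01-10'), (2, 2, '2016-02-01'),
        (3, 2, '2016-03-05'), (4, 1, '2015-06-01');
""")

# The same date filter is applied to the numerator and to the subquery total.
query = """
    SELECT COUNT(client_id) * {factor} /
           (SELECT COUNT(*) FROM intakes WHERE admission_date >= :start)
    FROM intakes
    WHERE released_from_id = 1 AND admission_date >= :start
"""
params = {"start": "2016-01-01"}

integer_pct = conn.execute(query.format(factor="100"), params).fetchone()[0]
decimal_pct = conn.execute(query.format(factor="100.0"), params).fetchone()[0]

print(integer_pct, decimal_pct)
```

With one matching client out of three admitted on or after the start date, the integer version truncates the result to 33, while the 100.0 version keeps the fractional part.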
