Postgres: how to properly use the max function

I need help with the correct usage of max. I have the following query:
select busqueda.valorBusqueda, count(*) from busqueda where usu_id = 24 group by busqueda.valorBusqueda;
It works, but I only want the maximum count. So far I have tried:
select max (busqueda.valorBusqueda, count(*) from busqueda where usu_id = 24 group by busqueda.valorBusqueda);
but without success.

The easiest solution here is probably to use a LIMIT query:
select valorBusqueda, count(*) as cnt
from busqueda
where usu_id = 24
group by valorBusqueda
order by count(*) desc
limit 1;
LIMIT on its own does not return ties. On PostgreSQL 13 and later you can replace it with FETCH FIRST 1 ROW WITH TIES; otherwise, the RANK analytic function returns all values tied for the highest count:
with cte as (
select valorBusqueda, count(*) as cnt, rank() over (order by count(*) desc) rnk
from busqueda
where usu_id = 24
group by valorBusqueda
)
select valorBusqueda, cnt
from cte
where rnk = 1;
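
To see how LIMIT 1 and RANK differ when counts tie, here is a small self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres (an assumption, made only so the example runs without a server; window functions need SQLite 3.25 or newer), and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table busqueda (usu_id integer, valorBusqueda text)")
# Hypothetical sample data: 'cats' and 'dogs' tie with 2 searches each for user 24.
conn.executemany(
    "insert into busqueda values (?, ?)",
    [(24, "cats"), (24, "cats"), (24, "dogs"), (24, "dogs"), (24, "fish"), (7, "fish")],
)

# LIMIT 1 keeps a single row, even though two values share the top count.
limit_rows = conn.execute("""
    select valorBusqueda, count(*) as cnt
    from busqueda
    where usu_id = 24
    group by valorBusqueda
    order by count(*) desc
    limit 1
""").fetchall()

# rank() assigns 1 to every value tied for the highest count.
rank_rows = conn.execute("""
    with cte as (
        select valorBusqueda, count(*) as cnt,
               rank() over (order by count(*) desc) as rnk
        from busqueda
        where usu_id = 24
        group by valorBusqueda
    )
    select valorBusqueda, cnt
    from cte
    where rnk = 1
    order by valorBusqueda
""").fetchall()

print(limit_rows)
print(rank_rows)
```

With this data the LIMIT query returns one of the two tied values (which one is not guaranteed), while the RANK query returns both.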

You can also use a subquery, which returns only the highest count:
select max(t.ct) from (
select busqueda.valorBusqueda, count(*) ct from busqueda
where usu_id = 24 group by busqueda.valorBusqueda) t;

Related

Postgres SELECT runs 3x faster than a function containing that SELECT

I have a SELECT in Postgres:
SELECT DISTINCT ON (price) price, quantity, is_ask, final_update_id
FROM (SELECT *
FROM ((SELECT price, quantity, is_ask, book_depth.final_update_id
FROM order_depth
LEFT JOIN book_depth ON book_depth_id = book_depth.id
WHERE book_depth_id IN (SELECT id
FROM book_depth
WHERE final_update_id > (SELECT last_update_id
FROM order_book
WHERE symbol_name = 'XRPRUB'
ORDER BY last_update_id DESC
LIMIT 1)
AND symbol_name = 'XRPRUB'))
UNION
(SELECT price, quantity, is_ask, order_book_id
FROM "order"
WHERE order_book_id = (SELECT id
FROM order_book
WHERE symbol_name = 'XRPRUB'
ORDER BY last_update_id DESC
LIMIT 1))
ORDER BY final_update_id DESC) AS t) AS t1
ORDER BY price, final_update_id DESC;
It takes about 20 seconds to run.
But when I wrap this SELECT in a function, the function takes about 1 minute 40 seconds. Can someone explain whether this is normal, or am I making a mistake somewhere?

Postgres: Date grouping in Subquery from timestamp

I am trying find out how many leads are generated per listing per day.
I have this query:
SELECT
vl.listing_id,
vl.created_at::date as dt,
(
SELECT count(*)
FROM voice_leads vl2
WHERE vl2.listing_id = vl.listing_id
AND vl.created_at::date = vl2.created_at::date
) as cnt
FROM voice_leads vl
GROUP BY listing_id, vl.created_at::date
ORDER BY listing_id
but when executing I get "ERROR: subquery uses ungrouped column "vl.created_at" from outer query LINE 8: AND vl.created_at::date = vl2.created_at::date"
Any idea on what I could do to fix it?
SELECT
vl.listing_id,
vl.created_at::date as dt,
count(cnt.*)
FROM voice_leads vl, lateral (SELECT *
from voice_leads vl2
WHERE vl2.listing_id = vl.listing_id
AND vl.created_at::date = vl2.created_at::date) cnt
GROUP BY vl.listing_id, vl.created_at::date
ORDER BY listing_id
You don't need the subquery:
SELECT
vl.listing_id,
vl.created_at::date as dt,
count( vl.listing_id ) as cnt
FROM voice_leads vl
GROUP BY listing_id, vl.created_at::date
ORDER BY listing_id
should do the same.
count(field) counts the rows in each group where field is not null.
count(*) counts all rows in each group.
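
A quick way to check how count(*) and count(column) behave inside a GROUP BY is to run both against rows that contain NULLs. The sketch below uses Python's built-in sqlite3 module instead of Postgres (an assumption for runnability; both follow the standard behaviour here), and the nullable caller column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'caller' is an invented nullable column, used only to show the NULL handling.
conn.execute("create table voice_leads (listing_id integer, caller text)")
conn.executemany(
    "insert into voice_leads values (?, ?)",
    [(1, "ann"), (1, None), (1, "bob"), (2, None)],
)

rows = conn.execute("""
    select listing_id,
           count(*)      as all_rows,         -- every row in the group
           count(caller) as non_null_callers  -- only rows where caller is not null
    from voice_leads
    group by listing_id
    order by listing_id
""").fetchall()

print(rows)  # [(1, 3, 2), (2, 1, 0)]
```

Since listing_id is never null within its own group, count(vl.listing_id) and count(*) give the same result in the query above.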

Limit by percent instead of number of rows without subqueries

I would like to select the top 1% of rows; however, I cannot use subqueries to do it. I.e., this won't work:
SELECT * FROM mytbl
WHERE var='value'
ORDER BY id,random()
LIMIT(SELECT (COUNT(*) * 0.01)::integer FROM mytbl)
How would I accomplish the same output without using a subquery with limit?
You can utilize PERCENT_RANK:
WITH cte(ID, var, pc) AS
(
SELECT ID, var, PERCENT_RANK() OVER (ORDER BY random()) AS pc
FROM mytbl
WHERE var = 'value'
)
SELECT *
FROM cte
WHERE pc <= 0.01
ORDER BY id;
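
As a rough check of the PERCENT_RANK approach, the following sketch runs the same query shape against made-up data. It uses Python's built-in sqlite3 module rather than Postgres (an assumption so it runs without a server; SQLite 3.25 or newer is needed for window functions), and the table contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytbl (id integer, var text)")
# Hypothetical data: 200 rows match var = 'value', 50 rows do not.
conn.executemany(
    "insert into mytbl values (?, ?)",
    [(i, "value") for i in range(200)] + [(i, "other") for i in range(200, 250)],
)

# percent_rank() is (rank - 1) / (rows - 1), so ordering by random() spreads the
# matching rows over [0, 1] in random order and the filter keeps the lowest 1%.
sample = conn.execute("""
    with cte(id, var, pc) as (
        select id, var, percent_rank() over (order by random()) as pc
        from mytbl
        where var = 'value'
    )
    select *
    from cte
    where pc <= 0.01
    order by id
""").fetchall()

print(len(sample), sample)
```

Note that the percentage is taken over the rows that pass the WHERE filter, not over the whole table, which differs from the COUNT(*) over mytbl in the original LIMIT attempt.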
I solved it with Python using the psycopg2 package:
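# Assumes cur is an open psycopg2 cursor.
# Triple-quoted strings allow the SQL to span several lines.
cur.execute("""SELECT ROUND(COUNT(id) * 0.01, 0)
               FROM mytbl""")
nrows = int(cur.fetchone()[0])
# Query parameters must be passed as a sequence, hence the one-element tuple.
cur.execute("""SELECT *
               FROM mytbl
               WHERE var = 'value'
               ORDER BY id, random() LIMIT %s""", (nrows,))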
Perhaps there is a more elegant solution using just SQL, or a more efficient one, but this does exactly what I'm looking for.
If I got it right, you need:
Random 1% sample of all rows,
If some id is within the sample, all rows with the same id must be there too.
The following SQL should do the trick:
with ids as (
select id,
total,
sum(cnt) over (order by max(rnd)) running_total
from (
select id,
count(*) over (partition by id) cnt,
count(*) over () total,
row_number() over(order by random()) rnd
from mytbl
) q
group by id,
cnt,
total
)
select mytbl.*
from mytbl,
ids
where mytbl.id = ids.id
and ids.running_total <= ids.total * 0.01
order by mytbl.id;
I don’t have your data, of course, but I have no trouble using a sub query in the LIMIT clause.
However, the sub query contains only the count(*) part and I then multiply the result by 0.01:
SELECT * FROM mytbl
WHERE var='value'
ORDER BY id,random()
LIMIT(SELECT count(*) FROM mytbl)*0.01;

T-SQL: compare the results of two SELECTs and return the one with the most recent date

Wonder if someone could give me a quick hand. I have 2 select queries (as shown below) and I want to compare the results of both and only return the result that has the most recent date.
So say I have the following 2 results from the queries:-
--------- ---------- ----------------------- --------------- ------ --
COMPANY A EMPLOYEE A 2007-10-16 17:10:21.000 E-mail 6D29D6D5 SYSTEM 1
COMPANY A EMPLOYEE A 2007-10-15 17:10:21.000 E-mail 6D29D6D5 SYSTEM 1
I only want to return the result with the latest date (so the first one). I thought about putting the results into a temporary table and then querying that but just wondering if there's a simpler, more efficient way?
SELECT * FROM (
SELECT fc.accountidname, fc.owneridname, fap.actualend, fap.activitytypecodename, fap.createdby, fap.createdbyname,
ROW_NUMBER() OVER (PARTITION BY fc.accountidname ORDER BY fap.actualend DESC) AS RN
FROM FilteredContact fc
INNER JOIN FilteredActivityPointer fap ON fc.parentcustomerid = fap.regardingobjectid
WHERE fc.statecodename = 'Active'
AND fap.ownerid = '0F995BDC'
AND fap.createdon < getdate()
) tmp WHERE RN = 1
SELECT * FROM (
SELECT fa.name, fa.owneridname, fa.new_technicalaccountmanageridname, fa.new_customerid, fa.new_riskstatusname,
fa.new_numberofopencases, fa.new_numberofurgentopencases, fap.actualend, fap.activitytypecodename, fap.createdby, fap.createdbyname,
ROW_NUMBER() OVER (PARTITION BY fa.name ORDER BY fap.actualend DESC) AS RN
FROM FilteredAccount fa
INNER JOIN FilteredActivityPointer fap ON fa.accountid = fap.regardingobjectid
WHERE fa.statecodename = 'Active'
AND fap.ownerid = '0F995BDC'
AND fap.createdon < getdate()
) tmp2 WHERE RN = 1
If the two queries return the same structure (matching column count and column types), you can union their results, order by the date descending, and select the top 1.
select top 1 * from
(
-- your first query
union all
-- your second query.
) T
order by YourDateColumn1 desc
You should GROUP BY and use MAX(createdon)

How to update a table with a list of values at a time?

I have
update NewLeaderBoards set MonthlyRank=(Select RowNumber() (order by TotalPoints desc) from LeaderBoards)
I tried it this way -
(Select RowNumber() (order by TotalPoints desc) from LeaderBoards) as NewRanks
update NewLeaderBoards set MonthlyRank = NewRanks
But it doesn't work for me. Can anyone suggest how I can perform an update like this?
You need to use the WITH statement and a full CTE:
;With Ranks As
(
Select PrimaryKeyColumn, Row_Number() Over( Order By TotalPoints Desc ) As Num
From LeaderBoards
)
Update NewLeaderBoards
Set MonthlyRank = T2.Num
From NewLeaderBoards As T1
Join Ranks As T2
On T2.PrimaryKeyColumn = T1.PrimaryKeyColumn