How to get information on agregate object with Postgres - postgresql

I have a request that use a group by to get the MIN distance between two object.
WITH point_interet AS (
SELECT pai.ogc_fid as p1id, pai2.ogc_fid as p2id, MIN(ST_Distance(pai.geom, pai2.geom)) AS distance
FROM point_activite_interet pai
JOIN point_activite_interet pai2 ON pai.ogc_fid > pai2.ogc_fid
GROUP BY pai.ogc_fid)
SELECT * FROM point_interet
ORDER BY distance DESC;
This doesn't work because Postgres says p2id should be in GROUP BY clause actually it is not true, I would like to know what is the object the closest from pai.ogc_fid.
Do you have any idea how I should do that?

I think you just want to remove the min() aggregate
Alternatively you could add a window specification, such as min(...) OVER (ROWS UNBOUNDED PRECEDING) which would allow you to do this, but I don't think this is what you want because it will give you the minimum distance of any preceding row compbination.....

Related

SQLite: Using MAX of GROUP BY averages to compute percentage of all the other averages relative to that MAX

I have four groups of people, and they each have an average for a given metric. The following query would yield four values, one for each group.
SELECT group, AVG(metric) AS 'avg_metric'
FROM table
GROUP BY group
Now one of those averages will be the max. I want to also capture the avg_metric / MAX(avg_metric) in my SELECT statement. Since I can't use MAX and AVG(MAX...) in the same query, I thought something like this would work:
SELECT group, (100 * sub.avg_metric / MAX(sub.avg_metric)) AS 'percentage'
FROM table
JOIN (SELECT AVG(metric) AS 'avg_metric'
FROM table
GROUP BY group
ON table.group = sub.group) AS sub
Unfortunately, I can't seem to get the syntax right. (I'm using SQLite.) Additionally, it would be much nicer to have
GROUP AVG_METRIC PERCENTAGE
group 1 57 45
group 2 ....
In that order, but I don't see how to do that either.

How to get SUM and AVG from a column in PostgreSQL

Maybe I'm overlooking something, but none of the answers I found solve my problem. I'm trying to get the sum and average from a column, but everything I see is getting sum and average from a row.
Here is the query I'm using:
SELECT product_name,unit_cost,units,total,SUM(total),AVG(total)
FROM products
GROUP BY product_name,unit_cost,total
And this is what I get:
It returns the exact same amounts. What I need is to add all values in the unit_cost column and return the SUM and AVG of all its values. What am I missing? What did I not understand? Thank you for taking the time to answer!
AVG and SUM as window functions and no grouping will do the job.
select product_name,unit_cost,units,total,
SUM(total) over all_rows as sum_of_all_rows,
AVG(total) over all_rows as avg_of_all_rows
from products
window all_rows as ();
The groups in your query contain just one row if total is a distinct value. This seems to be the case with your example. You can check this with a count aggregate (value = 1).
Removing total and probably unit_cost) from your select and group by clause should help.

What does ORDER BY within OVER() window function mean in postgresql?

I am trying to understand how the ORDER BY clause in the OVER() window function is different from ORDER BY clause in generic SQL.
I was solving the following problem: https://www.pgexercises.com/questions/aggregates/nummembers.html
Produce a monotonically increasing numbered list of members (including guests),
ordered by their date of joining. Remember that member IDs are not guaranteed to be sequential.
The following query is one of the accepted solutions:
SELECT COUNT(*) OVER(ORDER by joindate), firstname, surname FROM cd.members;
As per my understanding, since we are not supplying a PARTITION BY clause in the OVER() function, all the rows in cd.members table form one big partition (let's call it X). When the window function runs, it should order X by joindate, and then COUNT(*) on X would return the number of rows in X which is just the number of rows in cd.members.
But this understanding is incorrect. The 'Answers and Discussion' accompanying the aforementioned problem states:
Since we define an order for the window function, for any given row the window is: start of the dataset -> current row.
The PG documentation on window function states:
You can also control the order in which rows are processed by window functions using ORDER BY within OVER. (The window ORDER BY does not even have to match the order in which the rows are output.)
What I cannot comprehend is why will ORDER BY inside the OVER() stop at the current row? Could you please elaborate how this is working?
Thank you for reading through.
I don't know what to add beyond what the docs (same page as you already linked to) already say:
By default, if ORDER BY is supplied then the frame consists of all
rows from the start of the partition up through the current row, plus
any following rows that are equal to the current row according to the
ORDER BY clause.
I don't now if this required by the SQL standard, but it certainly seems reasonable. Why specify an ORDER BY if you expect it to have no observable effect?

Accessing current row value with lag function

I want to calculate the difference between the previous and the current column and make it a new column named increase. For this, I'm using the lag window function. The value of the first column is not defined since no previous column exists. I know that a 3rd parameter specifies the default value. However, it depends. For the first row, I want to use the value of another column e.g. the one of count from that current row. This assumes that 0 is increased to count for the first row which is what I need. Specifying the column name as 3rd argument for the lag function does not work correctly and neither does using 0. How can it be done? I'm getting strange results such as quite a random result or even negative numbers.
SELECT *, mycount - lag(mycount, 1) OVER (ORDER BY id, messtime ASC) AS increase FROM measurements;
Window functions cannot be nested either:
ERROR: window function calls cannot be nested
There is another issue with your query: So far your results are in random order, so you may think you are seeing problems that don't exist.
Add ORDER BY id, messtime to your query to see the rows in order. Now you can compare one row with its predecessor directly. Are there still issues? If so, which exactly?
SELECT *, "count" - lag("count", 1) OVER (ORDER BY id, messtime) AS increase
FROM measurements
ORDER BY id, messtime;
COUNT is a reserved word in SQL. It seems the DBMS thinks you want to nest COUNT and LAG somehow.
Use another column name or use quotes for the column:
SELECT *, "count" - lag("count", 1) OVER

n-th row in PostgreSQL for p-quantile

I'm trying to fetch the n-th row of a query result. Further posts suggested the use of OFFSET or LIMIT but those forbid the use of variables (ERROR: argument of OFFSET must not contain variables). Further I read about the usage of cursors but I'm not quite sure how to use them even after reading their PostgreSQL manpage. Any other suggestions or examples for how to use cursors?
My main goal is to calculate the p-quantile of a row and since PostgreSQL doesn't provide this function by default I have to write it on my own.
Cheers
The following returns the 5th row of a result set:
select *
from (
select <column_list>,
row_number() over (order by some_sort_column) as rn
) t
where rn = 5;
You have to include an order by because otherwise the concept of "5th row" doesn't make sense.
You mention "use of variable" so I'm not sure what you are actually trying to achive. But you should be able to supply the value 5 as a variable for this query (or even a sub-select).
You might also want to dig further into windowing functions. Because with that you could e.g. do a sum() over the 3 rows before the current row (or similar constructs) - which could also be useful for you.
if you would like to get 10th record, below query also work fine.
select * from table_name order by sort_column limit 1 offset 9
OFFSET simply skip that many rows before beginning to return rows as mentioned in LIMIT clause.