Select max of one date after getting max of other date

I am using Oracle's PSQuery tool, so I do not have access to the actual code. I would like to generate a list of students with their most recently dropped course (the one with the maximum drop date). However, if a student has more than one dropped course on that same maximum date, I want to return the one with the maximum deadline date. In other words: take the maximum drop date per student, then narrow any duplicates down further by maximum deadline date.

SELECT s.studentid, s.name, MAX(s.date), (s.deadlineDate), MAX(d.droppedCourse) AS droppedCourse
FROM Student s
JOIN DeadlineDate d ON s.studentid = d.studentid
GROUP BY s.studentid, s.name, s.date
HAVING MAX(d.DeadlineDate);
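One way to express "max drop date, ties broken by max deadline date" is to rank each student's rows by both dates and keep the first row. The sketch below is only an illustration: it runs against an in-memory SQLite database from Python, and the flattened `drops` table with its column names (`drop_date`, `deadline_date`, `dropped_course`) is a hypothetical stand-in for the joined Student/DeadlineDate data. Oracle supports the same `ROW_NUMBER() OVER (...)` syntax, but whether PSQuery lets you enter it as an expression is an assumption I have not verified.

```python
import sqlite3

# Hypothetical, flattened stand-in for the joined Student/DeadlineDate rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drops (
    studentid INTEGER, name TEXT, drop_date TEXT,
    deadline_date TEXT, dropped_course TEXT
);
INSERT INTO drops VALUES
    (1, 'Ann', '2021-03-01', '2021-03-10', 'MATH101'),
    (1, 'Ann', '2021-04-15', '2021-04-20', 'CHEM200'),
    (1, 'Ann', '2021-04-15', '2021-05-01', 'HIST300'),
    (2, 'Bob', '2021-02-01', '2021-02-15', 'ART110');
""")

# Rank rows per student: latest drop date first, then latest deadline date.
QUERY = """
SELECT studentid, name, drop_date, deadline_date, dropped_course
FROM (
    SELECT d.*,
           ROW_NUMBER() OVER (
               PARTITION BY studentid
               ORDER BY drop_date DESC, deadline_date DESC
           ) AS rn
    FROM drops d
) ranked
WHERE rn = 1
ORDER BY studentid
"""
latest_drops = conn.execute(QUERY).fetchall()
for row in latest_drops:
    print(row)
```

Ann has two drops on her latest drop date, so the later deadline decides which course is returned. If two rows tie on both dates, `ROW_NUMBER()` picks one arbitrarily; add another column to the `ORDER BY` if that matters.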

Related

Is it possible to use AVG function to give the average of 2 results from within a sub-query that has a UNION?

I have written a UNION query and, in order to combine the two averages from both sides of the union, I have put the whole union query inside a select query and given it an alias. This isn't the whole thing, but I think it gets the point across:
select supplier, year, month, avg(average) as average from
    (select supplier, year, month, avg(age(tableA.date, tableB.date)) as average
     from tableA join tableB using(supplier)
     group by supplier, year, month
     UNION ALL
     select supplier, year, month, avg(age(tableC.date, tableB.date)) as average
     from tableC join tableB using(supplier)
     group by supplier, year, month
    ) as x
group by supplier, year, month
I have a Maths graduate telling me that you simply can't average an average. Having looked at the data, though, I think the outer query treats each average from the inner query as a single amount of time, and therefore lets me average those values outside as though they had not already been averaged. Does that make any kind of sense?
Any perspectives on this very welcome.
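To make the concern concrete, here is a small sketch of why the two readings can differ. It is not the original query: the tables `a_ages` and `c_ages` and the numeric `age_days` column are hypothetical simplifications (the year/month grouping and the `age()` call are dropped), run in an in-memory SQLite database from Python. The outer `AVG` does run without error, but it weights each side of the union equally regardless of how many rows went into it. Carrying the sum and the row count through the union gives the average over all underlying rows instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a_ages (supplier TEXT, age_days REAL);
CREATE TABLE c_ages (supplier TEXT, age_days REAL);
-- Three rows on one side of the union, one row on the other.
INSERT INTO a_ages VALUES ('acme', 1), ('acme', 2), ('acme', 3);
INSERT INTO c_ages VALUES ('acme', 10);
""")

# Average of the two per-side averages: each side counts once.
avg_of_avgs = conn.execute("""
SELECT AVG(average) FROM (
    SELECT supplier, AVG(age_days) AS average FROM a_ages GROUP BY supplier
    UNION ALL
    SELECT supplier, AVG(age_days) AS average FROM c_ages GROUP BY supplier
) AS x
GROUP BY supplier
""").fetchone()[0]

# Pooled average: carry the totals and counts out, then divide once.
pooled_avg = conn.execute("""
SELECT SUM(total) / SUM(n) FROM (
    SELECT supplier, SUM(age_days) AS total, COUNT(*) AS n
    FROM a_ages GROUP BY supplier
    UNION ALL
    SELECT supplier, SUM(age_days) AS total, COUNT(*) AS n
    FROM c_ages GROUP BY supplier
) AS x
GROUP BY supplier
""").fetchone()[0]

print(avg_of_avgs, pooled_avg)
```

With this data the average of averages is (2 + 10) / 2 = 6, while the average over all four rows is 16 / 4 = 4. The two only agree when both sides contribute the same number of rows, so which one is "right" depends on whether you want each source or each row to carry equal weight.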

delete all but two sorted items postgresql

In my structure I have the following. I would like to keep the most recent dates (highlighted in yellow) and delete the rest. I don't necessarily know the most recent dates (i.e. 17/4/2021 and 10/2/2021 in my example) for each stock_id, but I know I want to keep only the two most recent items.
Is that possible?
Thank you
Note: this assumes that dates do not repeat within each stock_id group in your table, so top two dates are always unique.
You can assign a rank to each row within stock_id after ordering by date, and delete the rows whose rank is greater than 2.
DELETE FROM mytable
WHERE (stock_id, date) NOT IN (
    SELECT
        stock_id,
        date
    FROM (
        SELECT
            stock_id,
            date,
            row_number() over (partition by stock_id order by date desc) as rank
        FROM mytable
    ) ranks
    WHERE rank <= 2
)
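As a quick check of the ranking approach, the sketch below runs the same kind of DELETE against a tiny made-up table in an in-memory SQLite database from Python. The sample rows are invented, and the rank alias is renamed to `rn` purely to avoid clashing with the `rank()` function name. SQLite accepts the same row-value `NOT IN` and window-function syntax as PostgreSQL here, but this is only a sketch, not a test against PostgreSQL itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (stock_id INTEGER, date TEXT);
INSERT INTO mytable VALUES
    (1, '2021-01-05'), (1, '2021-02-10'), (1, '2021-04-17'),
    (2, '2021-03-01'), (2, '2021-03-02');
""")

# Keep the two newest rows per stock_id and delete everything else.
conn.execute("""
DELETE FROM mytable
WHERE (stock_id, date) NOT IN (
    SELECT stock_id, date
    FROM (
        SELECT stock_id, date,
               ROW_NUMBER() OVER (
                   PARTITION BY stock_id ORDER BY date DESC
               ) AS rn
        FROM mytable
    ) ranks
    WHERE rn <= 2
)
""")

remaining = conn.execute(
    "SELECT stock_id, date FROM mytable ORDER BY stock_id, date"
).fetchall()
print(remaining)
```

Stock 1 loses its oldest row and stock 2, which only had two rows, is left untouched. Note that `NOT IN` behaves surprisingly when the subquery can return NULLs, so this relies on stock_id and date being non-null.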

Find time difference between two most recent orders

I am trying to estimate the time of a new order from repeat customers by finding the time difference between the most recent order and the second most recent order, and then adding that difference to the most recent order.
I have been trying limit and offset, but this returns a blanket date for every row. I am thinking I need to do a lateral join, but not sure how to implement it correctly. When I try to do it, I receive no output.
select public.orders.customer_id,
       max(public.orders.created_at) as last_order_date,
       (select created_at from public.orders group by created_at order by created_at desc limit 1 offset 1) as second_last
from public.orders
inner join
    (select
         customer_id, count(*)
     from public.orders
     where status = 'fulfilled'
     group by public.orders.customer_id
     having count(customer_id) > 1) repeat_customers
on public.orders.customer_id = repeat_customers.customer_id
group by public.orders.customer_id;
I wanted the second_last field to be populated by the second most recent date for each customer_id, but the output is the second most recent date for the entire table, resulting in the same date for every entry.
For your second_last column you're not limiting the subquery per customer, so it finds the second most recent date across the whole table, which is exactly the result you've seen. Correlating the subquery with the outer customer in a WHERE clause, and sorting newest first, should solve this:
(SELECT
    po.created_at
FROM
    public.orders po
WHERE
    po.customer_id = public.orders.customer_id
ORDER BY
    po.created_at DESC
LIMIT 1 OFFSET 1) AS second_last
I've also aliased the inner table because the same table is mentioned in the main select. The comparison has to name the outer public.orders.customer_id explicitly; a bare customer_id would resolve to the inner table and the condition would always be true.
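Here is a minimal sketch of the correlated-subquery idea end to end, including the "add the gap to the most recent order" step from the question. It runs in an in-memory SQLite database from Python with invented sample rows; the unqualified `orders` table, the alias `o`, and doing the date arithmetic in Python rather than in SQL are all choices made for the illustration. For simplicity it also counts every order rather than only fulfilled ones.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer_id INTEGER, created_at TEXT, status TEXT);
INSERT INTO orders VALUES
    (1, '2021-01-01', 'fulfilled'),
    (1, '2021-01-11', 'fulfilled'),
    (1, '2021-01-31', 'fulfilled'),
    (2, '2021-02-01', 'fulfilled'),
    (2, '2021-02-08', 'fulfilled'),
    (3, '2021-03-01', 'fulfilled');
""")

# The subquery is tied to the outer row's customer, so OFFSET 1 picks that
# customer's second most recent order instead of the table's.
rows = conn.execute("""
SELECT o.customer_id,
       MAX(o.created_at) AS last_order_date,
       (SELECT po.created_at
        FROM orders po
        WHERE po.customer_id = o.customer_id
        ORDER BY po.created_at DESC
        LIMIT 1 OFFSET 1) AS second_last
FROM orders o
GROUP BY o.customer_id
HAVING COUNT(*) > 1
ORDER BY o.customer_id
""").fetchall()

# Estimated next order = last order + (last order - second-last order).
estimates = {}
for customer_id, last, second_last in rows:
    last_d = date.fromisoformat(last)
    gap = last_d - date.fromisoformat(second_last)
    estimates[customer_id] = (last_d + gap).isoformat()

print(rows)
print(estimates)
```

Customer 3 has a single order and is dropped by the `HAVING` clause, which plays the role of the repeat_customers join in the question.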

Condition and max reference in redshift window function

I have a list of dates, accounts, and sources of data. I'm taking the latest max date for each account and using that number in my window reference.
In my window reference, I'm using row_number () to assign unique rows to each account and sources of data that we're receiving and sorting it by the max date for each account and source of data. The end result should list out one row for each unique account + source of data combination, with the max date available in that combination. The record with the highest date will have 1 listed.
I'm trying to set a condition on my window function where only rows that populate with 1 are listed in the query, while the other ones are not shown at all. This is what I have below and where I get stuck:
SELECT
    date,
    account,
    data_source,
    MAX(date) max_date,
    ROW_NUMBER () OVER (PARTITION BY account ORDER BY max_date) ROWNUM
FROM table
GROUP BY
    date,
    account,
    data_source
Any help is greatly appreciated. I can elaborate on anything if necessary
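If I understood your question correctly, wrapping your query in an outer select and filtering on the row number should do the trick. The subquery needs an alias, and the window has to order by the aggregate itself, newest first, so that the highest date gets 1:
SELECT
    date,
    account,
    data_source,
    max_date
FROM (
    SELECT
        date,
        account,
        data_source,
        MAX(date) max_date,
        ROW_NUMBER () OVER (PARTITION BY account ORDER BY MAX(date) DESC) ROWNUM
    FROM table
    GROUP BY
        date,
        account,
        data_source
) ranked
WHERE ROWNUM = 1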
If you do not need the row number for anything other than uniqueness then a query like this should work:
select distinct t.account, data_source, date
from table t
join (select account, max(date) max_date from table group by account) m
on t.account=m.account and t.date=m.max_date
This can still generate two records for one account if two records for different data sources have the identical date. If that is a possibility then mdem7's approach is probably best.
It's a bit unclear from the question, but if you want each combination of account and data_source with its max date, making sure there are no duplicates, then grouping by both columns should be enough (the distinct is harmless but redundant alongside the group by):
select distinct account, data_source, max(date) max_date
from table t
group by account, data_source
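To compare the window-function filter with the plain GROUP BY, here is a small sketch against made-up data in an in-memory SQLite database from Python. The table name `feed` and the sample rows are hypothetical, and the partition is on account plus data_source, which is one reading of the question (one row per combination). Redshift should accept the same `ROW_NUMBER()` syntax, but this has only been reasoned through for SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feed (date TEXT, account TEXT, data_source TEXT);
INSERT INTO feed VALUES
    ('2021-01-01', 'A', 'web'),
    ('2021-01-05', 'A', 'web'),
    ('2021-01-03', 'A', 'api'),
    ('2021-01-02', 'B', 'web');
""")

# Number the rows in each (account, data_source) group, newest first,
# then keep only the rows numbered 1 in an outer query.
latest = conn.execute("""
SELECT account, data_source, date AS max_date
FROM (
    SELECT account, data_source, date,
           ROW_NUMBER() OVER (
               PARTITION BY account, data_source ORDER BY date DESC
           ) AS rn
    FROM feed
) ranked
WHERE rn = 1
ORDER BY account, data_source
""").fetchall()

# The same result from a plain aggregate, with no window function.
grouped = conn.execute("""
SELECT account, data_source, MAX(date) AS max_date
FROM feed
GROUP BY account, data_source
ORDER BY account, data_source
""").fetchall()

print(latest)
print(grouped)
```

When only the max date per combination is needed, the two queries return the same rows. The window version earns its keep when you also want other columns from the winning row.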

Getting a count of number of rows matching MAX() value in Postgres

I have a table called 'games' that has a column in it called 'week'. I am trying to find a single query that will give me the maximum value for 'week' along with a count of how many rows in that table have the maximum value for 'week'. I could split it up into two queries:
SELECT MAX(week) FROM games
// store value in a variable $maxWeek
SELECT COUNT(1) FROM games WHERE week = $maxWeek
// store that result in a variable
Is there a way to do this all in one query?
SELECT week, count(*) FROM games GROUP BY week ORDER BY week DESC LIMIT 1;
or
SELECT week, count(*) FROM games WHERE week = (SELECT max(week) FROM games) GROUP BY week;
(may be faster)
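A quick sketch to confirm that both single-query forms return the same pair of values, using an invented `games` table in an in-memory SQLite database from Python. It says nothing about which is faster in Postgres; that would depend on indexes and table size.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (id INTEGER PRIMARY KEY, week INTEGER);
INSERT INTO games (week) VALUES (1), (1), (2), (3), (3), (3);
""")

# Variant 1: count every week, then keep only the highest one.
by_limit = conn.execute(
    "SELECT week, count(*) FROM games "
    "GROUP BY week ORDER BY week DESC LIMIT 1"
).fetchone()

# Variant 2: find the max week first, then count only those rows.
by_subquery = conn.execute(
    "SELECT week, count(*) FROM games "
    "WHERE week = (SELECT max(week) FROM games) GROUP BY week"
).fetchone()

print(by_limit, by_subquery)
```

Both return the maximum week together with the number of rows that have it.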