Find time difference between two most recent orders - postgresql

I am trying to estimate the time of a new order from repeat customers by finding the time difference between the most recent order and the second most recent order, and then adding that difference to the most recent order.
I have been trying limit and offset, but this returns a blanket date for every row. I am thinking I need to do a lateral join, but not sure how to implement it correctly. When I try to do it, I receive no output.
select public.orders.customer_id,
max(public.orders.created_at) as last_order_date,
(select created_at from public.orders group by created_at order by created_at desc limit 1 offset 1) as second_last
from public.orders
inner join
(select
customer_id, count(*)
from public.orders
where status = 'fulfilled'
group by public.orders.customer_id
having count(customer_id) >1) repeat_customers
on public.orders.customer_id = repeat_customers.customer_id
group by public.orders.customer_id;
I wanted the second_last field to be populated by the second most recent date for each customer_id, but the output is the second most recent date for the entire table, resulting in the same date for every entry.

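The subquery for your second_last column is not restricted to the current customer, so it returns the second most recent date in the whole table, which is exactly the result you are seeing. Correlating it with the outer row in a WHERE clause, as in the example below, should solve this:
(SELECT
created_at
FROM
public.orders po
WHERE
po.customer_id = public.orders.customer_id
ORDER BY
created_at DESC
LIMIT 1 OFFSET 1) AS second_last
I've aliased the inner table as po because the same table appears in the main select. An unqualified customer_id inside the subquery would resolve to the inner table and compare the column with itself, so the outer side has to be qualified explicitly. The ORDER BY also needs DESC so that OFFSET 1 skips the most recent order and returns the second most recent one.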
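Since the end goal in the question is an estimated next order date, here is a minimal sketch of an alternative that uses window functions instead of a correlated subquery. It assumes the public.orders columns shown in the question; the names ranked, rn, and estimated_next_order are made up for illustration. It counts all orders, so add a status filter to the inner query if only fulfilled orders should be considered.

```sql
-- Sketch only: table and column names are taken from the question,
-- the aliases (ranked, rn, estimated_next_order) are hypothetical.
SELECT customer_id,
       last_order_date,
       second_last,
       -- gap between the two latest orders, added to the latest order
       last_order_date + (last_order_date - second_last) AS estimated_next_order
FROM (
    SELECT customer_id,
           created_at AS last_order_date,
           -- previous order of the same customer, NULL for the first order
           lag(created_at) OVER (PARTITION BY customer_id
                                 ORDER BY created_at) AS second_last,
           -- 1 marks the most recent order of each customer
           row_number() OVER (PARTITION BY customer_id
                              ORDER BY created_at DESC) AS rn
    FROM public.orders
) ranked
WHERE rn = 1
  AND second_last IS NOT NULL;  -- keeps repeat customers only
```

The subtraction yields an interval for timestamp columns and an integer number of days for date columns, and both can be added back to created_at. If a customer can have two orders with an identical created_at, the two window orderings may disagree on which row is "latest", so a tiebreaker column would be needed in both ORDER BY clauses.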

Related

postgresql: how to get the last record even with WHERE clause

I have the following postgresql command
SELECT *
FROM (
SELECT *
FROM tablename
ORDER by id DESC
LIMIT 1000
) as t
WHERE t.col1 = 'someval'
Now I also want to get the last record of the following subquery, along with the results of the above query:
FROM (
SELECT *
FROM tablename
ORDER by id DESC
LIMIT 1000
)
Currently I am doing:
SELECT *
FROM (
SELECT *
FROM tablename
ORDER by id DESC
LIMIT 1000
) as t
WHERE t.col1 = 'someval'
UNION ALL
SELECT *
FROM (
SELECT *
FROM tablename
ORDER by id DESC
LIMIT 1000
) as t
ORDER BY id ASC
LIMIT 1
Is this the right way?
I would use UNION rather than UNION ALL in this case, since the final row could also be returned by the first query, and I wouldn't want to have it twice in the result set if that happens. The primary key guarantees that UNION cannot accidentally remove rows that are genuinely distinct.
I don't understand the query, in particular why there is a WHERE condition at the outside query in the first case, but not in the second. But that is unrelated to the question.
Your current effort is wrong, since the LIMIT 1 applies outside the UNION ALL, so you get only one row as a result. That this is wrong should have been immediately obvious upon testing, so it is baffling that you are asking us if it is right.
You should wrap the whole second SELECT in parentheses, so the LIMIT applies just to it.
Better yet, rather than ordering and taking 1000 rows and then reversing the order and taking the first row, you could just do OFFSET 999 LIMIT 1 to get the 1000th row.
If the 1000th row matches both conditions, do you want to see it twice?
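Putting those suggestions together, a minimal sketch of the combined query could look like the following. It keeps the table and column names from the question, assumes 'someval' is meant as a string literal (hence the single quotes), and uses UNION so the 1000th row is not returned twice.

```sql
-- Rows among the 1000 newest that match the filter ...
SELECT *
FROM (
    SELECT *
    FROM tablename
    ORDER BY id DESC
    LIMIT 1000
) AS t
WHERE t.col1 = 'someval'
UNION
-- ... plus the 1000th newest row itself. The parentheses keep
-- ORDER BY / OFFSET / LIMIT local to this branch of the UNION.
(
    SELECT *
    FROM tablename
    ORDER BY id DESC
    OFFSET 999
    LIMIT 1
);
```

One difference from the original attempt: if the table holds fewer than 1000 rows, OFFSET 999 returns nothing, whereas reversing the order inside the 1000-row subquery would return the oldest available row.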

delete all but two sorted items postgresql

In my table I have the following data. I would like to keep the most recent dates (highlighted in yellow) and delete the remaining rows. I don't necessarily know the most recent dates for each stock_id (17/4/2021 and 10/2/2021 in my example), but I know I want to keep only the two most recent items.
Is that possible?
Thank you
Note: this assumes that dates do not repeat within each stock_id group in your table, so top two dates are always unique.
You can assign rank to each row within stock_id after ordering by date and delete rows where rank is greater than 2.
DELETE FROM mytable
WHERE (stock_id, date) NOT IN (
SELECT
stock_id,
date
FROM (
SELECT
stock_id,
date,
row_number() over (partition by stock_id order by date desc) as rank
FROM mytable
) ranks
WHERE rank <= 2
)
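If dates can repeat within a stock_id, the (stock_id, date) pair no longer identifies a single row. A possible variant, sketched here under the assumption that the table has no other unique key, identifies rows by PostgreSQL's ctid system column instead. The aliases ranked and rn are made up for illustration.

```sql
-- Sketch only: deletes everything except two rows per stock_id,
-- even when several rows share the same date.
DELETE FROM mytable
WHERE ctid IN (
    SELECT ctid
    FROM (
        SELECT ctid,
               row_number() OVER (PARTITION BY stock_id
                                  ORDER BY date DESC) AS rn
        FROM mytable
    ) ranked
    WHERE rn > 2  -- rows ranked third or later within their stock_id
);
```

With tied dates, which of the tied rows survives is arbitrary unless another column is added to the ORDER BY. Running the inner SELECT on its own first is a cheap way to check which rows would be removed.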

select last of an item for each user in postgres

I want to get the last entry for each user, but the customer_id is a hash ('ASAG#...') and ordering by customer_id breaks the query. Is there an alternative?
Select Distinct On (l.customer_id)
l.customer_id
,l.created_at
,l.text
From likes l
Order By l.customer_id, l.created_at Desc
Your current query already appears to be working, q.v. here:
Demo
I don't know why your current query is not generating the results you would expect. It should return one distinct record for every customer, corresponding to the most recent one, given your ORDER BY clause.
In any case, if it does not do what you want, an alternative would be to use ROW_NUMBER() here with a partition by user. The inner query numbers the records within each user's partition, with the value 1 going to the most recent record for each user. Then the outer query retains only that latest record.
SELECT
t.customer_id,
t.created_at,
t.text
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) rn
FROM likes
) t
WHERE t.rn = 1
To speed up the inner query which uses ROW_NUMBER() you can try adding a composite index on the customer_id and created_at columns:
CREATE INDEX yourIdx ON likes (customer_id, created_at);

SQL SELECT in another table with most recent date

I have a list of Matter data in Table1 that I need to query, as well as get the most recent Invoice Number in Table2 that is tied to the original Matter. I'm having extreme difficulty in joining these tables together and only getting one result for each Matter as I only want the most recent Invoice #.
Any and all help would be greatly appreciated.
Table1
Table2
RESULT
The following assigns a number to each invoice row within its MatterID, ordered by date from newest to oldest, and selects only the most recent one. Note that this assumes InvoiceDate is stored as a date, datetime, or some other type that sorts chronologically, and that in the event of two invoices for the same date, returning either will be fine. If you need to return both invoices in the event of ties, replace row_number with rank.
Select * from Table1 a
inner join
(Select *
, row_number() over (partition by MatterID order by InvoiceDate desc) as RN
from Table2) b
on a.MatterID = b.MatterID and b.RN = 1
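To make the difference between row_number and rank concrete, here is a small self-contained sketch in PostgreSQL syntax. The inline rows and the InvoiceNumber column are invented for illustration and are not taken from the question's tables.

```sql
-- Sketch only: three made-up invoices for one matter, two sharing a date.
SELECT MatterID,
       InvoiceNumber,
       InvoiceDate,
       row_number() OVER (PARTITION BY MatterID
                          ORDER BY InvoiceDate DESC) AS rn,
       rank()       OVER (PARTITION BY MatterID
                          ORDER BY InvoiceDate DESC) AS rnk
FROM (VALUES
        (1, 'A-100', DATE '2021-01-05'),
        (1, 'A-101', DATE '2021-03-01'),
        (1, 'A-102', DATE '2021-03-01')
     ) AS invoices (MatterID, InvoiceNumber, InvoiceDate);
```

The two invoices dated 2021-03-01 both receive rnk = 1, while rn gives them 1 and 2 in an unspecified order. Filtering on rn = 1 therefore returns exactly one row per matter, and filtering on rnk = 1 returns every invoice tied for the latest date.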

Calculate previous order date and status in Postgres

I have a simple table of orders, and I need to calculate some stats for each order. Essentially I have a Postgres db with fields:
Order_ID (unique), User_ID, Created_at (date), City, Total
I want to write a query that will generate, for each Order_ID:
1) the Created_at date of the user's most recent order prior to the current Order_ID (so if a customer placed order with Order_ID=200005b on 9/20/14, what is the date of that user's most recent previous order?)
2) another field showing a user's "Status" based on this date, given the following cases:
-- if this is user's first order, Status="new";
-- if most recent previous order date <= 60 days before the given/current order, Status="active";
-- if most recent previous order date > 60 days before the given/current order, Status="reactivated"
I think there's a way to write this query using some nested SELECTs, and maybe a self-join, but I don't know PostgreSQL well enough to understand the ordering of queries. I have been able to generate an "Order_N" field using the following query, which I could use to look up (Order_N)-1 to find the date, but I get stuck once I try to use that in nesting.
SELECT
user_id,
order_id,
created_at,
row_number() over (partition by user_id order by created_at) as order_n
from orders -- table name assumed; the question does not state it
order by user_id, created_at;
Does anyone have any ideas?
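One possible approach, sketched here rather than taken from an accepted answer, is the lag() window function, which returns the previous row's value within each user's partition and so avoids the self-join. The table name orders and the aliases prev_created_at, status, and o are assumptions; the columns follow the question.

```sql
-- Sketch only: assumes a table named orders with the columns
-- listed in the question (order_id, user_id, created_at, ...).
SELECT order_id,
       user_id,
       created_at,
       prev_created_at,
       CASE
           WHEN prev_created_at IS NULL THEN 'new'
           -- previous order is at most 60 days before this one
           WHEN prev_created_at >= created_at - INTERVAL '60 days' THEN 'active'
           ELSE 'reactivated'
       END AS status
FROM (
    SELECT order_id,
           user_id,
           created_at,
           -- date of the same user's previous order, NULL for the first one;
           -- order_id only breaks ties between orders with equal created_at
           lag(created_at) OVER (PARTITION BY user_id
                                 ORDER BY created_at, order_id) AS prev_created_at
    FROM orders
) o
ORDER BY user_id, created_at;
```

The comparison against created_at - INTERVAL '60 days' is written so that it works whether created_at is a date or a timestamp. The CASE branches are evaluated in order, so the first order of each user is labelled 'new' before the 60-day test is reached.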