I'm trying to figure out how I can list out the most recent invoice date & invoice number for my customers. I have the basic structure as follows but can't figure out how to get this data. The two tables use the customer_id as the relationship
SELECT
c.customer_id
c.customer_name
( ? get most recent invoice number)
(?get most recent invoice date)
FROM
customer AS c
JOIN invoice as i on i.customer_id=c.customer_id
There are 2 common methods you can use. Common Table Expression (CTE) to rank all rows per customer_id order by invoice_date descending, and then you pick up date and number of the row ranked #1, or using CROSS APPLY to pick TOP 1 invoice per customer_id again, order by invoice_date descending.
Something like this
-- method #1 using CTE
;WITH cte AS (
SELECT
c.customer_id,
c.customer_name,
i.invoice_number,
i.invoce_date,
ROW_NUMBER() OVER (PARTITION BY c.Customer_id ORDER BY i.invoce_date DESC) rn
FROM
CUSTOMER c
INNER JOIN invoice i on i.customer_id=c.customer_id
)
SELECT cte.customer_id,
cte.customer_name,
cte.invoice_number,
cte.invoice_date
FROM cte
WHERE rm = 1
-- method #2 using CROSS APPLY
SELECT
c.customer_id,
c.customer_name,
ca.invoice_number,
ca.invoice_date
FROM
customer c
CROSS APPLY (
SELECT TOP 1 invoice_number, invoice_date
FROM invoice as i WHERE i.customer_id=c.customer_id
ORDER BY invoice_date DESC
) ca
Your requirement to list out the most recent invoice date & invoice number per each customer could be translated to: for each customer, retrieve the invoice record that there is not exists other invoice of the same customer which has more recent invoice date. The corresponding query should be:
SELECT
c.customer_id,
c.customer_name,
i.invoice_number,
i.invoice_date
FROM
customer c
INNER JOIN invoice i ON i.customer_id = c.customer_id
WHERE NOT EXISTS
(SELECT 1
FROM invoice i1
WHERE i1.invoice_number <> i.invoice_number
AND i1.customer_id = i.customer_id
AND i1.invoice_date > i.invoice_date)
Related
SELECT partner_id
FROM trip_delivery_sales ts
WHERE ts.route_id='152'
GROUP BY ts.partner_id
From the query we can get the partners id.Using that partner id we want check in trip delicery sales lines table and want to find each customer last two sale product quantity sum. If last two sale have product qty as 2 & 5 want result as partner_id | count as Mn2333 - 7
here fore example i take partner id as 34806. But i want to check all partner_id obtained from last query
SELECT product_qty
FROM trip_delivery_sales_lines td
WHERE td.partner_id='34806'
AND td.route_id='152'
AND td.product_id='432'
ORDER BY td.order_date DESC
LIMIT 2
You can run this query
SELECT td.partner_id,sum(product_qty)
FROM trip_delivery_sales_lines td,
(SELECT partner_id FROM trip_delivery_sales ts WHERE ts.route_id='152') as ts
WHERE td.partner_id=ts.partner_id
AND td.product_id='432'
GROUP BY td.partner_id
ORDER BY td.order_date DESC
LIMIT 2
Or this one
with ts as (SELECT distinct partner_id FROM trip_delivery_sales WHERE route_id='152')
SELECT td.partner_id,sum(product_qty)
FROM trip_delivery_sales_lines td,ts
WHERE td.partner_id=ts.partner_id
AND td.product_id='432'
GROUP BY td.partner_id
ORDER BY td.order_date DESC
LIMIT 2
You might be looking for
SELECT DISTINCT ts.partner_id, ARRAY(
SELECT product_qty
FROM trip_delivery_sales_lines td
WHERE td.partner_id=ts.partner_id
AND td.product_id='432'
ORDER BY td.order_date DESC
LIMIT 2
) AS product_qty_arr
FROM trip_delivery_sales ts
WHERE ts.route_id='152'
or just
SELECT
partner_id,
array_agg(product_qty ORDER BY order_date DESC) as product_qty_arr
FROM (
SELECT
td.partner_id,
td.product_qty,
td.order_date,
row_number() OVER (PARTITION BY td.partner_id ORDER BY td.order_date DESC)
FROM trip_delivery_sales_lines td
JOIN trip_delivery_sales ts USING (partner_id)
WHERE ts.route_id='152'
AND td.product_id='432'
) AS enumerated
WHERE row_number <= 2
GROUP BY partner_id
See also PostgreSQL: top n entries per item in same table or Optimize GROUP BY query to retrieve latest row per user
Hi guys I have two tables dbo.Sales (customer_id, order_date, product_id) and dbo.Menu (Product_id, product_name, price). The question is
What was the first item from the menu purchased by each customer?
My solution is
select A.customer_id,m.product_id, m.product_name
from dbo.menu m
cross apply
(select top 1 * from dbo.sales s
where s.product_id=m.product_id
group by s.customer_id,s.order_date, s.product_id
order by s.order_date) A
customer_id product_id product_name
A 1 sushi
A 2 curry
C 3 ramen
Missing customer is B. Instead of B it gives me the second first order by A.
I need for each customer
Murat
You could use a ROW_NUMBER() window function to get the earliest product_id per customer and then join to the Menu table to get your product details.
Edit: Updated ORDER to ASC.
;with cte
as (
select customer_id, product_id, row_number() over (partition by customer_id order by order_date acs) RN
from dbo.Sales)
select c.customer_id, c.product_id, m.product_name
from cte c
join dbo.menu m on c.product_id=m.product_id
where RN = 1
SELECT distinct s.customer_id,
FIRST_VALUE(m.product_name) OVER (partition by s.customer_id order by order_date )
as FirstItem_Customer
FROM [dbo].[sales] S
join [dbo].[menu] M on M.product_id=s.product_id
I have two tables. One is Transactions and the other is Tickets. In Tickets I have the Ticket_Number,the name of the Category(Theater,Cinema,Concert), the Price of the Ticket. In Transactions I also have the Ticket_Number. What i want to do is to Get a SUM of money for each Category, and then with that data I want to Select the Category with the most money.
I already managed to get the SUM for each category but I am stuck here
SELECT category, SUM (Tickets.Price) AS Price
FROM Tickets,Transactions
WHERE Tickets.ticket_num=Transactions.ticket_num
GROUP BY Category
ORDER BY Price DESC;
I know i can add LIMIT 1 but I know it's not correct because 2 or more values can be the same
Using ROW_NUMBER to generate a sequence based on the sum of the price. Then, restrict to only the matching aggregated row with the highest total price.
WITH cte AS (
SELECT category, SUM(t1.Price) AS Price,
ROW_NUMBER() OVER (ORDER BY SUM(t1.Price) DESC) rn
FROM Tickets t1
INNER JOIN Transactions t2
ON t1.ticket_num = t2.ticket_num
GROUP BY Category
)
SELECT category, Price
FROM cte
WHERE rn = 1
ORDER BY Price DESC;
Note that if you want to capture all categories tied for the highest price, should a tie occur, then replace ROW_NUMBER in the above CTE with RANK, keeping everything else the same.
What you are looking for is a window function DENSE_RANK() which will handle ties properly.
RANK() will also work for your case, but if you would like to extend it to get TOP N places with ties (where N > 1), dense rank is the way to go.
SELECT Category, Price
FROM (
SELECT
Category,
SUM(ti.Price) AS Price,
DENSE_RANK() OVER (ORDER BY SUM(ti.Price) DESC) AS rnk
FROM Tickets ti
INNER JOIN Transactions tr ON
ti.ticket_num = tr.ticket_num
GROUP BY Category
) t
WHERE rnk = 1
I've also replaced the old style and not recommended joining of tables as comma separated list in FROM clause to a proper INNER JOIN clause and assigned aliases to tables.
You can use rank() to rank the sums of the prices, more expensive first.
SELECT category,
price
FROM (SELECT category,
sum(tickets.price) price,
rank() OVER (ORDER BY sum(tickets.price) DESC) r
FROM tickets
INNER JOIN transactions
ON transactions.ticket_num = tickets.ticket_num
GROUP BY category) x
WHERE r = 1;
I also took the liberty to rewrite your join from the ancient comma style to a modern, clearer version.
Let's say I have an orders table with customer_id, order_total, and order_date columns. I'd like to build a report that shows all customers who haven't placed an order in the last 30 days, with a column for the total amount their last order was.
This gets all of the customers who should be on the report:
select customer, max(order_date), (select order_total from orders o2 where o2.customer = orders.customer order by order_date desc limit 1)
from orders
group by 1
having max(order_date) < NOW() - '30 days'::interval
Is there a better way to do this that doesn't require a subquery but instead uses a window function or other more efficient method in order to access the total amount from the most recent order? The techniques from How to select id with max date group by category in PostgreSQL? are related, but the extra having restriction seems to stop me from using something like DISTINCT ON.
demo:db<>fiddle
Solution with row_number window function (https://www.postgresql.org/docs/current/static/tutorial-window.html)
SELECT
customer, order_date, order_total
FROM (
SELECT
*,
first_value(order_date) OVER w as last_order,
first_value(order_total) OVER w as last_total,
row_number() OVER w as row_count
FROM orders
WINDOW w AS (PARTITION BY customer ORDER BY order_date DESC)
) s
WHERE row_count = 1 AND order_date < CURRENT_DATE - 30
Solution with DISTINCT ON (https://www.postgresql.org/docs/9.5/static/sql-select.html#SQL-DISTINCT):
SELECT
customer, order_date, order_total
FROM (
SELECT DISTINCT ON (customer)
*,
first_value(order_date) OVER w as last_order,
first_value(order_total) OVER w as last_total
FROM orders
WINDOW w AS (PARTITION BY customer ORDER BY order_date DESC)
ORDER BY customer, order_date DESC
) s
WHERE order_date < CURRENT_DATE - 30
Explanation:
In both solutions I am working with the first_value window function. The window function's frame is defined by customers. The rows within the customers' groups are ordered descending by date which gives the latest row first (last_value is not working as expected every time). So it is possible to get the last order_date and the last order_total of this order.
The difference between both solutions is the filtering. I showed both versions because sometimes one of them is significantly faster
The window function style is creating a row count within the frames. Every first row can be filtered later. This is done by adding a row_number window function. The benefit of this solution comes out when you are trying to filter the first two or three data sets. You simply have to change the filter from WHERE row_count = 1 to WHERE row_count = 2
But if you want only one single row per group you just need to ensure that the expected row per group is ordered to be the first row in the group. Then the DISTINCT ON function can delete all following rows. DISTINCT ON (customer) gives the first (ordered) row per customer group.
Try to join table on itself
select o1.customer, max(order_date),
from orders o1
join orders o2 on o1.id=o2.id
group by o1.customer
having max(o1.order_date) < NOW() - '30 days'::interval
Subqueries in select is a bad idea, because DB will execute a query for each row
If you use postgres you can also try to use CTE
https://www.postgresql.org/docs/9.6/static/queries-with.html
WITH t as (
select id, order_total from orders o2 where o2.customer = orders.customer
order by order_date desc limit 1
) select o1.customer, max(order_date),
from orders o1
join t t.id=o2.id
group by o1.customer
having max(order_date) < NOW() - '30 days'::interval
AdventureWorks2012: Sql For each customer, determine the number of orders created in year 2007. If a customer has not created any order in year 2004, show 0 for that customer. Show: customer ID, # of orders created in 2007.
Same approach as in my other response - use a CTE (Common Table Expression) to determine number of sales for each customer in the year 2007:
-- determine the number and total of all sales in 2007
;WITH SalesPerCustomer AS
(
SELECT
c.CustomerID,
NumberOfSales = ISNULL(COUNT(soh.SalesOrderID), 0)
FROM
Sales.Customer c
INNER JOIN
Sales.SalesOrderHeader soh ON soh.CustomerID = c.CustomerID
AND soh.OrderDate >= '20070101'
AND soh.OrderDate < '20080101'
GROUP BY
c.CustomerID
)
SELECT
CustomerID ,
NumberOfSales
FROM
SalesPerCustomer
ORDER BY
NumberOfSales DESC
Ordered the output by number of sales descending, so you'll get the customer with the most sales first