How to fetch data quickly in a join query? - PostgreSQL

I have 3 tables: users, orders and comments, with 10,087,250, 24,949,600 and 26,532,000 records respectively. I wrote this query to count the comments on every order, but it takes more than half an hour to execute. How can I speed it up?
Note: there are already indexes on the foreign key columns.
select users.user_name, orders.id, count(comments.order_id)
from orders
inner join users on users.id=orders.user_id
inner join comments on orders.id=comments.order_id
group by comments.order_id, users.user_name, orders.id
limit 2;

First, you probably need an ORDER BY clause to go with LIMIT; without one, which 2 rows come back is arbitrary.
If you need the most commented orders, you can ORDER BY count DESC.
Second, comments.order_id = orders.id, so why group by both?
group by comments.order_id, users.user_name, orders.id
Maybe something like this will help:
WITH grouped AS (
SELECT order_id AS id, count(*)
FROM comments
GROUP BY 1
ORDER BY 2 DESC
LIMIT 2
)
SELECT u.user_name, g.id, g.count
FROM grouped AS g
JOIN orders AS o ON
o.id = g.id
JOIN users AS u ON
u.id = o.user_id
This avoids joining all the tables before grouping and limiting: only the 2 rows that survive the LIMIT are joined to orders and users.
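A minimal sketch of the aggregate-first query above, run from Python against an in-memory SQLite database so that it is self-contained. The sample rows and the index name are invented for illustration. The count column gets an explicit alias because SQLite does not name it `count` the way PostgreSQL does, and the outer query has its own ORDER BY because a join is not guaranteed to preserve the CTE's ordering. Timings on the real PostgreSQL tables are not something this sketch can show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
    CREATE TABLE comments (id INTEGER PRIMARY KEY, order_id INTEGER REFERENCES orders(id));
    CREATE INDEX comments_order_id_idx ON comments(order_id);

    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO comments (order_id) VALUES (10), (10), (10), (12), (12), (11);
""")

# Aggregate and limit on comments alone, then join only the surviving rows.
top_orders = conn.execute("""
    WITH grouped AS (
        SELECT order_id AS id, count(*) AS comment_count
        FROM comments
        GROUP BY order_id
        ORDER BY comment_count DESC
        LIMIT 2
    )
    SELECT u.user_name, g.id, g.comment_count
    FROM grouped AS g
    JOIN orders AS o ON o.id = g.id
    JOIN users AS u ON u.id = o.user_id
    ORDER BY g.comment_count DESC
""").fetchall()

print(top_orders)
```

With this sample data the two most commented orders are 10 (3 comments) and 12 (2 comments).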

You can try loading the joined rows into a temporary table before aggregating them. This might help to reduce the query time. Something like this:
CREATE TEMPORARY TABLE temp_table(
...
);
INSERT INTO temp_table
SELECT users.user_name, orders.id, comments.order_id
FROM orders INNER JOIN users ON users.id = orders.user_id INNER JOIN comments ON orders.id = comments.order_id;
SELECT user_name, id, count(order_id) FROM temp_table group by order_id, user_name, id;

I think you need to remove the unnecessary join between the orders and comments tables. All you want from comments is how many comments each order has, so you can denormalize.
That means adding a comments_count column to the orders table; whenever a comment is added to an order, increment it, and decrement it when a comment on that order is deleted.
After you add the new comments_count column, you need to backfill it once for every existing order.
Then you can query just the orders table, which already holds the comment count for each order.
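One way the counter column described above might be kept in step is with triggers. The sketch below uses SQLite triggers from Python purely as a self-contained stand-in: the trigger names and sample rows are made up, and on PostgreSQL the same idea would need a trigger function (or the increment could live in application code instead).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        comments_count INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE comments (id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL);

    -- Rows that existed before the counter column was introduced.
    INSERT INTO orders (id) VALUES (10), (11);
    INSERT INTO comments (order_id) VALUES (10), (10), (11);

    -- One-off backfill of the new column.
    UPDATE orders
    SET comments_count = (SELECT count(*) FROM comments WHERE comments.order_id = orders.id);

    -- Keep the counter in step with later inserts and deletes.
    CREATE TRIGGER comments_after_insert AFTER INSERT ON comments
    BEGIN
        UPDATE orders SET comments_count = comments_count + 1 WHERE id = NEW.order_id;
    END;
    CREATE TRIGGER comments_after_delete AFTER DELETE ON comments
    BEGIN
        UPDATE orders SET comments_count = comments_count - 1 WHERE id = OLD.order_id;
    END;

    -- Later activity: one new comment on order 11, one deleted from order 10.
    INSERT INTO comments (order_id) VALUES (11);
    DELETE FROM comments WHERE id = 1;
""")

# The counts now come straight from orders, with no join to comments.
counts = conn.execute("SELECT id, comments_count FROM orders ORDER BY id").fetchall()
print(counts)
```

The trade-off is extra write work on every comment insert or delete in exchange for a read that never touches the comments table.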

Related

How to make a query with the data of another query on SQL?

I have a user table and an order table. A user can have many orders. How do I get the last 1000 users and, for each of those users, their last 1000 orders?
Query - Get users
select distinct users.id, users.first_name, users.last_name
from users
limit 2;
Query - Get orders
select distinct orders.id, orders.user_id
from orders
limit 2;
First, when you query related data, don't do it as separate operations. If you fetch all users and all orders separately, the results will be unordered and inconsistent with each other, so you need a join to see each user together with the orders they have. And, as Sharon stated, you need a date column to know when each order was made. I assume you already have such a column; I will call it date_ordered.
Your query would be:
select
users.id,
users.first_name,
users.last_name,
orders.id
from
users
inner join orders on users.id = orders.user_id
order by
orders.date_ordered desc
limit 1000
Ordering by date_ordered DESC gives you the most recent orders first, each with its user. I also assume the user_id column in the orders table has a foreign key constraint referencing the id column of the users table.

Inner join with count and group by

I have 2 tables
Timetable :
pupil_id, staff_id, subject, lesson_id
Staff_info :
staff_id, surname
The timetable table contains 1000s of rows because each student's ID is listed under each period they do.
I want to list all the teacher's names, and the number of lessons they do (count). So I have to do SELECT with DISTINCT.
SELECT DISTINCT TIMETABLE.STAFF_ID,
COUNT(TIMETABLE.LESSON_ID),
STAFF.SURNAME
FROM STAFF
INNER JOIN TIMETABLE ON TIMETABLE.STAFF_ID = STAFF.STAFF_ID
GROUP BY TIMETABLE.STAFF_ID
However I get the error:
Column 'STAFF.SURNAME' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
This should do what you want:
SELECT s.STAFF_ID, COUNT(tt.LESSON_ID),
s.SURNAME
FROM STAFF s INNER JOIN
TIMETABLE tt
ON tt.STAFF_ID = s.STAFF_ID
GROUP BY s.STAFF_ID, s.SURNAME;
Notes:
You don't need DISTINCT unless there are duplicates in either table. That seems unlikely with this data structure, but if a staff member could have two of the same lesson, you would use COUNT(DISTINCT tt.LESSON_ID).
Table aliases make the query easier to write and to read.
You should include STAFF.SURNAME in the GROUP BY as well as the id.
I have a preference for taking the STAFF_ID column from the table where it is the primary key.
If you wanted staff with no lessons, you would change the INNER JOIN to LEFT JOIN.
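A small sketch of the last notes, assuming a `staff` table and invented sample rows, run against in-memory SQLite from Python. It shows COUNT next to COUNT(DISTINCT ...) when a lesson is listed once per pupil, and a LEFT JOIN keeping a teacher who has no lessons.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (staff_id INTEGER PRIMARY KEY, surname TEXT);
    CREATE TABLE timetable (pupil_id INTEGER, staff_id INTEGER, subject TEXT, lesson_id INTEGER);

    INSERT INTO staff VALUES (1, 'Smith'), (2, 'Jones'), (3, 'Patel');
    -- Lesson 100 appears twice because two pupils attend it.
    INSERT INTO timetable VALUES
        (501, 1, 'Maths', 100),
        (502, 1, 'Maths', 100),
        (501, 1, 'Physics', 101),
        (502, 2, 'Art', 200);
""")

rows = conn.execute("""
    SELECT s.staff_id,
           s.surname,
           COUNT(tt.lesson_id) AS timetable_rows,
           COUNT(DISTINCT tt.lesson_id) AS lessons
    FROM staff AS s
    LEFT JOIN timetable AS tt ON tt.staff_id = s.staff_id
    GROUP BY s.staff_id, s.surname
    ORDER BY s.staff_id
""").fetchall()

print(rows)
```

Smith has 3 timetable rows but only 2 distinct lessons, and Patel is kept with a count of 0 because COUNT ignores the NULLs produced by the LEFT JOIN.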
SELECT T.STAFF_ID,
T.CNT,
S.SURNAME
FROM STAFF S
JOIN (
SELECT STAFF_ID, CNT = COUNT(/*DISTINCT*/ LESSON_ID)
FROM TIMETABLE
GROUP BY STAFF_ID
) T ON T.STAFF_ID = S.STAFF_ID
Another option:
SELECT DISTINCT si.staff_id, surname, COUNT(lesson_id) OVER(PARTITION BY staff_Id)
FROM Staff_info si
INNER JOIN Timetable tt ON si.staff_id = tt.staff_id
When you use an aggregate function (COUNT, SUM, MIN, MAX, AVG) in the select list, every other column in the select list that is not inside an aggregate function must also appear in the GROUP BY clause. So you need to change your query as follows and add STAFF.SURNAME to the GROUP BY clause:
SELECT TIMETABLE.STAFF_ID,
COUNT(TIMETABLE.LESSON_ID),
STAFF.SURNAME
FROM STAFF
INNER JOIN TIMETABLE ON TIMETABLE.STAFF_ID = STAFF.STAFF_ID
GROUP BY TIMETABLE.STAFF_ID,STAFF.SURNAME
DISTINCT is also unnecessary in your scenario. And since you only want to show the teachers' names and lesson counts, you do not need TIMETABLE.STAFF_ID in the select list, but it should remain in the GROUP BY clause so that two teachers who share a surname are not merged into one row.
SELECT COUNT(TIMETABLE.LESSON_ID),
STAFF.SURNAME
FROM STAFF
INNER JOIN TIMETABLE ON TIMETABLE.STAFF_ID = STAFF.STAFF_ID
GROUP BY TIMETABLE.STAFF_ID,STAFF.SURNAME
You may need to take a look at this W3C post for more info

Show field in MS Access query without including it in the group by clause

I'm working on a query that will eventually be used as the record source for a report.
I have a customers table and an orders table. I want to show customer_id, order_id, and order_date in a query, but I only want to show data associated with the earliest order date for each customer. Basically, I need to show the order_id field without including it in the GROUP BY clause. If I include it in the GROUP BY clause, I get a lot more records than I want. Based on my research, the code below will work in MySQL, but not in MS Access.
Select customer.customer_id, order.order_id, min(order.order_dt)
From customer inner join order on customer.customer_id = order.customer_id
Group by customer.customer_id
I've tried grouping by order_id in a sub query and ordering by customer then date, then using the first function in the outer query. Unfortunately, the first function doesn't work as advertised.
Any help is greatly appreciated!
Does this work for you? It should bring up the earliest orders by order date for each customer. If there is more than one order on the earliest order date for a customer, all of those orders will be shown, though, so keep it in mind.
SELECT c.customer_id, o.order_id, o.order_dt
FROM customers AS c
INNER JOIN (
    orders AS o
    INNER JOIN (
        SELECT customer_ID, MIN([order_dt]) AS MinOrder_dt
        FROM Orders
        GROUP BY customer_id
    ) AS d
    ON (o.Customer_ID = d.customer_id) AND (o.[order_dt] = d.MinOrder_dt)
)
ON c.customer_id = o.customer_id;
I am deriving a table with just the customer_id and the min order_dt and joining customers and orders to that to only bring up the oldest orders.
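To make the tie behaviour concrete, here is a hedged sketch of the same derived-table join, run against in-memory SQLite from Python rather than MS Access. The sample rows are invented, dates are stored as ISO text, and the Access-specific brackets and nested join parentheses are dropped because SQLite does not need them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_dt TEXT);

    INSERT INTO customers VALUES (1), (2);
    -- Customer 1 has two orders on the same earliest date.
    INSERT INTO orders VALUES
        (100, 1, '2020-01-05'),
        (101, 1, '2020-01-05'),
        (102, 1, '2020-02-01'),
        (200, 2, '2020-03-10'),
        (201, 2, '2020-04-01');
""")

earliest = conn.execute("""
    SELECT c.customer_id, o.order_id, o.order_dt
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    JOIN (
        SELECT customer_id, MIN(order_dt) AS min_order_dt
        FROM orders
        GROUP BY customer_id
    ) AS d ON d.customer_id = o.customer_id AND d.min_order_dt = o.order_dt
    ORDER BY c.customer_id, o.order_id
""").fetchall()

print(earliest)
```

Customer 1 comes back twice (orders 100 and 101) because both fall on the earliest date, which is the caveat mentioned in the answer.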

How does COUNT(*) behave in an inner join

Take this query:
SELECT c.CustomerID, c.AccountNumber, COUNT(*) AS CountOfOrders,
SUM(s.TotalDue) AS SumOfTotalDue
FROM Sales.Customer AS c
INNER JOIN Sales.SalesOrderheader AS s ON c.CustomerID = s.CustomerID
GROUP BY c.CustomerID, c.AccountNumber
ORDER BY c.CustomerID;
I expected COUNT(*) to count the rows in Sales.Customer but to my surprise it counts the number of rows in the joined table.
Any idea why this is? Also, is there a way to be explicit in specifying which table COUNT() should operate on?
Query Processing Order...
The FROM clause is processed before the SELECT clause. That is, by the time SELECT comes into play there is only one (virtual) table to select from: the individual tables after they have been joined (JOIN), filtered (WHERE), etc.
If you just want to count over the one table, then you might try a couple of things...
COUNT(DISTINCT table1.id)
Or turn the table you want to count into a sub-query with count() inside of it
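A minimal illustration of the difference, using simplified stand-ins for the Sales.Customer and Sales.SalesOrderHeader tables (the table names, columns and rows here are assumptions) in an in-memory SQLite database driven from Python:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE sales_order (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_due REAL);

    -- Customer 1 has two orders, customer 2 has one, customer 3 has none.
    INSERT INTO customer VALUES (1), (2), (3);
    INSERT INTO sales_order VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 1.0);
""")

# COUNT(*) counts rows of the joined result; COUNT(DISTINCT ...) counts customers in it.
joined_rows, distinct_customers = conn.execute("""
    SELECT COUNT(*), COUNT(DISTINCT c.customer_id)
    FROM customer AS c
    INNER JOIN sales_order AS s ON s.customer_id = c.customer_id
""").fetchone()

print(joined_rows, distinct_customers)
```

The join produces 3 rows, so COUNT(*) is 3, while COUNT(DISTINCT c.customer_id) is 2; customer 3 never appears because the inner join drops customers without orders.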

PostgreSQL: Select first row as column inside select

I have 2 tables, Customers and Orders. Customers has the columns id, name; Orders has the columns id, customer_id, order_date.
Now I need a single select that returns each customer's id, name and last order_date.
I tried to make like this:
select
Customers.id,
Customers.name,
(select Orders.order_date from Orders where Orders.customer_id = Customer.id order by order_date desc) as last_order_date
from
Customers
But it uses the wrong index and takes forever to execute.
What's the best way to write this select in PostgreSQL?
Thanks in advance.
If not restricting by customer_id, then the query will end up having to scan the entire orders table.
SELECT c.id
,c.name
,MAX(o.order_date) AS last_order_date
FROM Customers c
LEFT OUTER JOIN Orders o ON (o.customer_id = c.id)
GROUP BY c.id, c.name
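As a quick check of what the query above returns, here is a sketch that runs it from Python against in-memory SQLite with invented rows; dates are stored as ISO text so MAX orders them correctly. It says nothing about PostgreSQL's plan or index choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);

    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Ben');
    -- Ben has no orders at all.
    INSERT INTO orders VALUES (100, 1, '2021-01-15'), (101, 1, '2021-06-01');
""")

last_orders = conn.execute("""
    SELECT c.id, c.name, MAX(o.order_date) AS last_order_date
    FROM customers AS c
    LEFT OUTER JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

print(last_orders)
```

The LEFT OUTER JOIN keeps Ben in the result with a NULL (None in Python) last_order_date; an INNER JOIN would drop him.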