I usually use ORM instead of SQL and I am slightly out of touch on the different JOINs...
SELECT `order_invoice`.*
, `client`.*
, `order_product`.*
, SUM(product.cost) as net
FROM `order_invoice`
LEFT JOIN `client`
ON order_invoice.client_id = client.client_id
LEFT JOIN `order_product`
ON order_invoice.invoice_id = order_product.invoice_id
LEFT JOIN `product`
ON order_product.product_id = product.product_id
WHERE (order_invoice.date_created >= '2009-01-01')
AND (order_invoice.date_created <= '2009-02-01')
GROUP BY `order_invoice`.`invoice_id`
The tables/ columns are logically names... it's an shop type application... the query works... it's just very very slow...
I use the Zend Framework and would usually use Zend_Db_Table_Row::find(Parent|Dependent)Row(set)('TableClass') but I have to make lots of joins and I thought it'll improve performance by doing it all in one query instead of hundreds...
Can I improve the above query by using more appropriate JOINs or a different implementation? Many thanks.
The query is wrong, the GROUP BY is wrong. All columns in the SELECT-part that are not in an aggregate function, have to be in the GROUP BY. You mention only one column.
Change the SQL Mode, set it to ONLY_FULL_GROUP_BY.
When this is done and you have a correct query, use EXPLAIN to find out how the query is executed and what indexes are used. Then start optimizing.
Related
I have a custom query along these lines. I get the list of orderIds from outside. I have the entire order object list with me, so I can change the query in any way, if needed.
#Query("SELECT p FROM Person p INNER JOIN p.orders o WHERE o.orderId in :orderIds)")
public List<Person> findByOrderIds(#Param("orderIds") List<String> orderIds);
This query works fine, but sometimes it may have anywhere between 50-1000 entries in the orderIds list sent from outside function. So it becomes very slow, taking as much as 5-6 seconds which is not fast enough. My question is, is there a better, faster way to do this? When I googled, and on this site, I see we can use ANY, EXISTS: Postgresql: alternative to WHERE IN respective WHERE NOT IN or create a temporary table: https://dba.stackexchange.com/questions/12607/ways-to-speed-up-in-queries-under-postgresql or join this to VALUES clause: Alternative when IN clause is inputed A LOT of values (postgreSQL). All these answers are tailored towards direct SQL calls, nothing based on JPA. ANY keyword is not supported by spring-data. Not sure about creating temporary tables in custom queries. I think I can do it with native queries, but have not tried it. I am using spring-data + OpenJPA + PostgresSQL.
Can you please suggest a solution or give pointers? I apologize if I missed anything.
thanks,
Alice
You can use WHERE EXISTS instead of IN Clause in a native SQL Query as well as in HQL in JPA which results in a lot of performance benefits. Please see sample below
Sample JPA Query:
SELECT emp FROM Employee emp JOIN emp.projects p where NOT EXISTS (SELECT project from Project project where p = project AND project.status <> 'Active')
So I got 3 tables:
Locations (LocationID [PK])
Track_Locations (TrackID [FK], LocationID [FK])
Tracks (TrackID [PK])
I'm building an app for iPhone and I was told that a JOIN would probably work faster then a subquery would. Is this true?
I have searched around but can't find a straight answer to this and would like to know what is best to use in this case.
My queries in both cases would look like this:
Subquery:
SELECT LocationID, TimestampGPS, Longitude, Latitude, ...
FROM locations
WHERE LocationID
IN (SELECT LocationID FROM track_locations WHERE TrackID = ?)
Using JOIN:
SELECT l.LocationID, l.TimestampGPS, l.Longitude, l.Latitude, ...
FROM locations l
JOIN track_locations tl
ON (l.LocationID = tl.LocationID)
JOIN tracks t
ON (t.TrackID = tl.TrackID)
WHERE t.TrackID = ?
A JOIN is almost always faster than a subquery, because it results in a single query, whereas a subquery is two queries. Some databases, however, know how to optimize such simple subqueries into JOINs, though I doubt SQLite does.
That said, sometimes, given the structure of one's data, it is faster to do a subquery. The only real way to find out is to load your database with a likely collection of sample data from the wild and benchmark both approaches.
My suspicion is that it will be a wash, however. Most iPhone apps don't have enough rows in the database to make much difference.
First of all, English it's not my first language, feel free to edit my question and I'm sorry for any mistakes that can offend you or not being so clear exposing the problem.
I have a few sql queries with lots of joins, these joins are based on clustered index (no worries about that). Some of the joins are used only to respect normalization and because is intuitive to maintenance, but sometimes it's possible to skip some of then. It's not clear to me what to do about these joins in terms of best practices.
Edit:
A simple example:
select *
from things
join things_categories on
things_categories.id_thing = things.id_thing
join categories on
categories.id_category = things_categories.id_category
join categories_properties on
categories_properties.id_category = categories.id_category
where
categories_properties.bo_default = 1
But it's possible to do:
select *
from things
join things_categories on
things_categories.id_thing = things.id_thing
join categories_properties on
categories_properties.id_category = things_categories.id_category
where
categories_properties.bo_default = 1
The second join it's not necessary (I do have integrity at database level), it's there only because makes the code more intuitive and respect the database normalization. I'm not sure if I should follow the smallest possible and efficient path or leave unnecessary joins to respect normalization and make the code more intuitive.
Any tips?
All the best.
It deppends, wheter you've or not integrity already.
In one hand, if the categories_properties table has a foreign key in the id_category column, then the integrity exists and you don't need to make the join with the categories table.
On the other hand, if the integrity might not exist (i.e.: there are id_categories in categories_properties table that are not defined in categories table), then you should make the join.
The join:
join categories on
categories.id_category = things_categories.id_category
is very necessary, since the categories table is used in the next join:
join categories_properties on
categories_properties.id_category = categories.id_category
So it's definitely required, if it's not already defined, as SQL requires for you to establish the links it needs to index and join one to the next.
What is however very painful, is the select *.
You don't need all that info, since * will bring all data from all tables.
Perhaps you could specify what you need from each table or, at worst, use things.* to specify all columns of a specific table.
If you do not need a join do not use it. You are taking a totally unneeded performance hit. Don't force the database to do work it doesn't need to do because you think it looks more comlete, you should consider performance ahead of readability in a query. After all once you start writing performant SQl code, it will become more readable to you. However, make sure you actually don't need it before eliminating it by making sure both versions of the query return the same result set.
Example:
SELECT `cat`.`id_catalog`, COUNT(parent.id_catalog) - 1) AS `level` FROM `tbl_catalog` AS `cat`, `tbl_catalog` AS `parent` WHERE (cat.`left` BETWEEN parent.`left` AND parent.`right`) GROUP BY `cat`.`id_catalog` ORDER BY `cat`.`left` ASC
It doesn't seem to work if it use ZF. ZF create this query with join only. How to create the select without join in ZF_DB.
By the way may be I do smth wrong in this query. It is simple nested set DB with parent, left and right fields. Perhaps there are another way to use join to get deep for some node. Anyway it would be interesting to get answer for both way:)
Thanks in advance to all who looks it through:)
I am currently using this sqlite query in my application. Two tables are used in this query.....
UPDATE table1 set visited = (SELECT COUNT(DISTINCT table1.itemId) from 'table2' WHERE table2.itemId = table1.itemId AND table2.sessionId ='eyoge2avao');
It is working correct.... My problem is it is taking around 10 seconds to execute this query and retrieve the result..... Don't know what to do... Almost all other process are in right way.. So it seems the problem is with this query formation...
Plz someone help with how to optimize this query....
Regards,
Brian
Make sure you have indexes on the following (combinations of) fields:
table1.itemId
(This will speed up the DISTINCT clause, since the itemId will already be in the correct order).
table2.itemId, table2.sessionId
This will speed up the WHERE clause of your SELECT statement.
How many rows are there in these tables?
Aso try doing an EXPLAIN on your SELECT command to see if it gives you any helpful advice.