I have a rather weird dataset, where only a few records from a large table have any related data at all, but when they do, it's hundreds of thousands of records.
I'm trying to select only the records that have data, but I'm having some problems with index usage. I know you usually cannot "force" PostgreSQL to use a certain index, but in this case forcing it works.
SELECT matches.id, count(frames.id) FROM matches LEFT JOIN frames ON frames.match_id = matches.id GROUP BY matches.id HAVING count(frames.id) > 0 ORDER BY count(frames.id) DESC;
id | count
----+--------
31 | 123363
28 | 121475
24 | 110155
21 | 108258
22 | 106837
25 | 89182
26 | 87104
27 | 86152
(8 rows)
SELECT matches.id, count(frames.id) FROM matches LEFT JOIN frames ON frames.match_id = matches.id GROUP BY matches.id HAVING count(frames.id) = 0 ORDER BY count(frames.id) DESC;
....
(568 rows)
Two solutions I've found are:
SELECT "matches".* FROM "matches" WHERE EXISTS (SELECT true FROM frames WHERE frames.match_id = matches.id LIMIT 1);
Time: 11697,645 ms
or
SELECT DISTINCT "matches".* FROM "matches" INNER JOIN "frames" ON "frames"."match_id" = "matches"."id"
Time: 879,325 ms
Neither query seems to use the index on match_id in the frames table. That's understandable, since the column normally isn't very selective; unfortunately, here the index would be really helpful:
SET enable_seqscan = OFF;
SELECT "matches".* FROM "matches" WHERE (SELECT true FROM frames WHERE frames.match_id = matches.id LIMIT 1);
Time: 1,239 ms
EXPLAIN for queries:
EXPLAIN for: SELECT DISTINCT "matches".* FROM "matches" INNER JOIN "frames" ON "frames"."match_id" = "matches"."id"
QUERY PLAN
-----------------------------------------------------------------------------
HashAggregate (cost=59253.47..59256.38 rows=290 width=155)
-> Hash Join (cost=6.26..33716.73 rows=785746 width=155)
Hash Cond: (frames.match_id = matches.id)
-> Seq Scan on frames (cost=0.00..22906.46 rows=785746 width=4)
-> Hash (cost=4.45..4.45 rows=145 width=155)
-> Seq Scan on matches (cost=0.00..4.45 rows=145 width=155)
(6 rows)
EXPLAIN for: SELECT "matches".* FROM "matches" WHERE (EXISTS (SELECT id FROM frames WHERE frames.match_id = matches.id LIMIT 1))
QUERY PLAN
Seq Scan on matches (cost=0.00..41.17 rows=72 width=155)
Filter: (SubPlan 1)
SubPlan 1
-> Limit (cost=0.00..0.25 rows=1 width=4)
-> Seq Scan on frames (cost=0.00..24870.83 rows=98218 width=4)
Filter: (match_id = matches.id)
(6 rows)
SET enable_seqscan = OFF;
EXPLAIN SELECT "matches".* FROM "matches" WHERE (SELECT true FROM frames WHERE frames.match_id = matches.id LIMIT 1);
QUERY PLAN
Seq Scan on matches (cost=10000000000.00..10000000118.37 rows=72 width=155)
Filter: (SubPlan 1)
SubPlan 1
-> Limit (cost=0.00..0.79 rows=1 width=0)
-> Index Scan using index_frames_on_match_id on frames (cost=0.00..81762.68 rows=104066 width=0)
Index Cond: (match_id = matches.id)
(6 rows)
Any suggestions on how to tweak the query to use the index here? Or other ways to check for the existence of records that would execute closer to the 1 ms I get out of the index than to the 11 s?
PS. I did run ANALYZE, VACUUM ANALYZE, and all the other steps normally suggested to improve index usage.
EDIT Thanks to David Aldridge for pointing out that LIMIT 1 might actually be hindering the query planner; I've now gotten to:
SELECT "matches".* FROM "matches" WHERE EXISTS (SELECT true FROM frames WHERE frames.match_id = matches.id);
Time: 163,803 ms
With the plan:
EXPLAIN SELECT "matches".* FROM "matches" WHERE EXISTS (SELECT true FROM frames WHERE frames.match_id = matches.id);
QUERY PLAN
------------------------------------------------------------------------------------
Nested Loop (cost=25455.58..25457.90 rows=8 width=155)
-> HashAggregate (cost=25455.58..25455.66 rows=8 width=4)
-> Seq Scan on frames (cost=0.00..23374.26 rows=832526 width=4)
-> Index Scan using matches_pkey on matches (cost=0.00..0.27 rows=1 width=155)
Index Cond: (id = frames.match_id)
(5 rows)
Still 100 times slower than the index-only version (probably because of the Seq Scan + HashAggregate on frames that is still performed).
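For completeness, here's a session-safe way to flip the planner setting for a single test, a sketch using standard PostgreSQL (SET LOCAL confines the change to the enclosing transaction; EXPLAIN's BUFFERS option exists since 9.0):
BEGIN;
SET LOCAL enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS) SELECT "matches".* FROM "matches" WHERE EXISTS (SELECT true FROM frames WHERE frames.match_id = matches.id);
ROLLBACK;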
In the EXISTS-based alternative, the LIMIT clause is redundant but might not be helping the optimiser.
Try:
SELECT "matches".*
FROM "matches"
WHERE EXISTS (SELECT 1
FROM frames
WHERE frames.match_id = matches.id);
Related
I'm trying to migrate from SQL Server to PostgreSQL.
Here is my PostgreSQL code:
Create View person_names As
SELECT lp."Code", n."Name", n."Type"
from "Persons" lp
Left Join LATERAL
(
Select *
From "Names" n
Where n.id = lp.id
Order By "Date" desc
Limit 1
) n on true
limit 100;
Explain
Select "Code" From person_names;
It prints
"Subquery Scan on person_names (cost=0.42..448.85 rows=100 width=10)"
" -> Limit (cost=0.42..447.85 rows=100 width=56)"
" -> Nested Loop Left Join (cost=0.42..303946.91 rows=67931 width=56)"
" -> Seq Scan on ""Persons"" lp (cost=0.00..1314.31 rows=67931 width=10)"
" -> Limit (cost=0.42..4.44 rows=1 width=100)"
" -> Index Only Scan Backward using ""IX_Names_Person"" on ""Names"" n (cost=0.42..4.44 rows=1 width=100)"
" Index Cond: ("id" = (lp."id")::numeric)"
Why is there an "Index Only Scan" on the "Names" table? That table is not required to produce the result. On SQL Server I get only a single scan over the "Persons" table.
How can I tune Postgres to get a better query plan? I'm on the latest version, PostgreSQL 15 beta 3.
Here is SQL Server version:
Create View person_names As
SELECT top 100 lp."Code", n."Name", n."Type"
from "Persons" lp
Outer Apply
(
Select Top 1 *
From "Names" n
Where n.id = lp.id
Order By "Date" desc
) n
GO
SET SHOWPLAN_TEXT ON;
GO
Select "Code" From person_names;
It gives the correct execution plan:
|--Top(TOP EXPRESSION:((100)))
|--Index Scan(OBJECT:([Persons].[IX_Persons] AS [lp]))
Change the lateral join to a regular left join; then Postgres is able to remove the scan on the Names table:
create View person_names
As
SELECT lp.Code, n.Name, n.Type
from Persons lp
Left Join (
Select distinct on (id) *
From Names n
Order By id, Date desc
) n on n.id = lp.id
limit 100;
The following index will support the distinct on () in case you do include columns from the Names table:
create index on "Names"(id, "Date" desc);
For select code from person_names this gives me this plan:
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Seq Scan on persons lp (cost=0.00..309.00 rows=20000 width=7) (actual time=0.009..1.348 rows=20000 loops=1)
Planning Time: 0.262 ms
Execution Time: 1.738 ms
For select Code, name, type From person_names; this gives me this plan:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Hash Right Join (cost=559.42..14465.93 rows=20000 width=25) (actual time=5.585..68.545 rows=20000 loops=1)
Hash Cond: (n.id = lp.id)
-> Unique (cost=0.42..13653.49 rows=20074 width=26) (actual time=0.053..57.323 rows=20000 loops=1)
-> Index Scan using names_id_date_idx on names n (cost=0.42..12903.49 rows=300000 width=26) (actual time=0.052..41.125 rows=300000 loops=1)
-> Hash (cost=309.00..309.00 rows=20000 width=11) (actual time=5.407..5.407 rows=20000 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 1116kB
-> Seq Scan on persons lp (cost=0.00..309.00 rows=20000 width=11) (actual time=0.011..2.036 rows=20000 loops=1)
Planning Time: 0.460 ms
Execution Time: 69.180 ms
Of course I had to guess the table structures as you haven't provided any DDL.
Change your view definition like this:
create view person_names as
select p."Code",
(select "Name"
from "Names" n
where n.id = p.id
order by "Date" desc
limit 1)
from "Persons" p
limit 100;
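If the planner behaves as hoped, selecting only "Code" from this view should no longer touch "Names" at all, since an unreferenced scalar subquery in the view's output list can be discarded. A quick sanity check (a sketch against the same objects):
EXPLAIN
SELECT "Code" FROM person_names;
-- Expect a plan that scans only "Persons" (plus the Limit),
-- with no node for "Names" when the "Name" column isn't referenced.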
When joining a table and then filtering (with LIMIT 30, for instance), Postgres will apply the JOIN to all rows, even if the columns from those rows are only used in the returned columns and not as a filtering predicate.
This would be understandable for an INNER JOIN (PG has to know if the row will be returned or not) or for a LEFT JOIN without a unique constraint (PG has to know if more than one row will be returned or not), but for a LEFT JOIN on a UNIQUE column, this seems wasteful: if the query matches 10k rows, then 10k joins will be performed, and then only 30 will be returned.
It would seem more efficient to "delay", or defer, the join, as much as possible, and this is something that I've seen happen on some other queries.
Splitting this into a subquery (SELECT * FROM (SELECT * FROM main WHERE x LIMIT 30) LEFT JOIN secondary) works, by ensuring that only 30 items are returned from the main table before joining them, but it feels like I'm missing something, and the "standard" form of the query should also apply the same optimization.
Looking at the EXPLAIN plans, however, I can see that the number of rows joined is always the total number of rows, without "early bailing out" as you could see when, for instance, running a Seq Scan with a LIMIT 5.
Example schema, with a main table and a secondary one: secondary columns will only be returned, never filtered on.
drop table if exists secondary;
drop table if exists main;
create table main(id int primary key not null, main_column int);
create index main_column on main(main_column);
insert into main(id, main_column) SELECT i, i % 3000 from generate_series( 1, 1000000, 1) i;
create table secondary(id serial primary key not null, main_id int references main(id) not null, secondary_column int);
create unique index secondary_main_id on secondary(main_id);
insert into secondary(main_id, secondary_column) SELECT i, (i + 17) % 113 from generate_series( 1, 1000000, 1) i;
analyze main;
analyze secondary;
Example query:
explain analyze verbose select main.id, main_column, secondary_column
from main
left join secondary on main.id = secondary.main_id
where main_column = 5
order by main.id
limit 50;
This is the most "obvious" way of writing the query, takes on average around 5ms on my computer.
Explain:
Limit (cost=3742.93..3743.05 rows=50 width=12) (actual time=5.010..5.322 rows=50 loops=1)
Output: main.id, main.main_column, secondary.secondary_column
-> Sort (cost=3742.93..3743.76 rows=332 width=12) (actual time=5.006..5.094 rows=50 loops=1)
Output: main.id, main.main_column, secondary.secondary_column
Sort Key: main.id
Sort Method: top-N heapsort Memory: 27kB
-> Nested Loop Left Join (cost=11.42..3731.90 rows=332 width=12) (actual time=0.123..4.446 rows=334 loops=1)
Output: main.id, main.main_column, secondary.secondary_column
Inner Unique: true
-> Bitmap Heap Scan on public.main (cost=11.00..1036.99 rows=332 width=8) (actual time=0.106..1.021 rows=334 loops=1)
Output: main.id, main.main_column
Recheck Cond: (main.main_column = 5)
Heap Blocks: exact=334
-> Bitmap Index Scan on main_column (cost=0.00..10.92 rows=332 width=0) (actual time=0.056..0.057 rows=334 loops=1)
Index Cond: (main.main_column = 5)
-> Index Scan using secondary_main_id on public.secondary (cost=0.42..8.12 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=334)
Output: secondary.id, secondary.main_id, secondary.secondary_column
Index Cond: (secondary.main_id = main.id)
Planning Time: 0.761 ms
Execution Time: 5.423 ms
explain analyze verbose select m.id, main_column, secondary_column
from (
select main.id, main_column
from main
where main_column = 5
order by main.id
limit 50
) m
left join secondary on m.id = secondary.main_id
where main_column = 5
order by m.id
limit 50
This returns the same results, in 2 ms.
The total EXPLAIN cost is also roughly three times lower, in line with the performance gain we're seeing.
Limit (cost=1048.44..1057.21 rows=1 width=12) (actual time=1.219..2.027 rows=50 loops=1)
Output: m.id, m.main_column, secondary.secondary_column
-> Nested Loop Left Join (cost=1048.44..1057.21 rows=1 width=12) (actual time=1.216..1.900 rows=50 loops=1)
Output: m.id, m.main_column, secondary.secondary_column
Inner Unique: true
-> Subquery Scan on m (cost=1048.02..1048.77 rows=1 width=8) (actual time=1.201..1.515 rows=50 loops=1)
Output: m.id, m.main_column
Filter: (m.main_column = 5)
-> Limit (cost=1048.02..1048.14 rows=50 width=8) (actual time=1.196..1.384 rows=50 loops=1)
Output: main.id, main.main_column
-> Sort (cost=1048.02..1048.85 rows=332 width=8) (actual time=1.194..1.260 rows=50 loops=1)
Output: main.id, main.main_column
Sort Key: main.id
Sort Method: top-N heapsort Memory: 27kB
-> Bitmap Heap Scan on public.main (cost=11.00..1036.99 rows=332 width=8) (actual time=0.054..0.753 rows=334 loops=1)
Output: main.id, main.main_column
Recheck Cond: (main.main_column = 5)
Heap Blocks: exact=334
-> Bitmap Index Scan on main_column (cost=0.00..10.92 rows=332 width=0) (actual time=0.029..0.030 rows=334 loops=1)
Index Cond: (main.main_column = 5)
-> Index Scan using secondary_main_id on public.secondary (cost=0.42..8.44 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=50)
Output: secondary.id, secondary.main_id, secondary.secondary_column
Index Cond: (secondary.main_id = m.id)
Planning Time: 0.161 ms
Execution Time: 2.115 ms
This is a toy dataset here, but on a real DB, the IO difference is significant (no need to fetch 1000 rows when 30 are enough), and the timing difference also quickly adds up (up to an order of magnitude slower).
So my question: is there any way to get the planner to understand that the JOIN can be applied much later in the process?
It seems like something that could be applied automatically to gain a sizeable performance boost.
Deferred joins are good. It's usually helpful to run the limit operation on a subquery that yields only the id values. The ORDER BY ... LIMIT operation then has to sort less data just to discard it.
select main.id, main.main_column, secondary.secondary_column
from main
join (
select id
from main
where main_column = 5
order by id
limit 50
) selection on main.id = selection.id
left join secondary on main.id = secondary.main_id
order by main.id
limit 50
It's also possible adding id to your main_column index will help. With a BTREE index the query planner knows it can get the id values in ascending order from the index, so it may be able to skip the sort step entirely and just scan the first 50 values.
create index main_column on main(main_column, id);
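To see whether the sort step really disappears, check the plan of the inner selection on its own (a sketch against the schema above; whether the planner picks this plan is its call):
EXPLAIN (ANALYZE)
SELECT id
FROM main
WHERE main_column = 5
ORDER BY id
LIMIT 50;
-- With the (main_column, id) index this can be an Index Only Scan with no Sort node:
-- for a fixed main_column, the index already yields id in ascending order.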
Edit: In a large table, the heavy lifting of your query will be the selection of the 50 main.id values to process. To get those 50 id values as cheaply as possible, you can use a scan of the covering index I proposed within the subquery I proposed. Once you've got your 50 id values, looking up 50 rows' worth of details from your various tables by main.id and secondary.main_id is trivial: the correct indexes are in place and it's a limited number of rows, so it won't take much time.
It looks like your table sizes are too small for various optimizations to have much effect, though. Query plans change a lot when tables are larger.
Alternative query, using row_number() instead of LIMIT (I think you could even omit LIMIT here):
-- prepare q3 AS
select m.id, main_column, secondary_column
from (
select id, main_column
, row_number() OVER (ORDER BY id, main_column) AS rn
from main
where main_column = 5
) m
left join secondary on m.id = secondary.main_id
WHERE m.rn <= 50
ORDER BY m.id
LIMIT 50
;
Putting the subsetting into a CTE can prevent it from being merged into the main query:
PREPARE q6 AS
WITH
-- MATERIALIZED -- not needed before version 12
xxx AS (
SELECT DISTINCT x.id
FROM main x
WHERE x.main_column = 5
ORDER BY x.id
LIMIT 50
)
select m.id, m.main_column, s.secondary_column
from main m
left join secondary s on m.id = s.main_id
WHERE EXISTS (
SELECT *
FROM xxx x WHERE x.id = m.id
)
order by m.id
-- limit 50
;
I have a table with 3 columns and a composite primary key over all 3 columns. Each individual column has lots of duplicates, and I have a separate btree index on each of them. The table has around 10 million records.
My query, with just a condition on a hardcoded value for a single column, always returns more than a million records. It takes more than 40 seconds, whereas it takes only a few seconds if I limit the query to 1 or 2 million rows without any condition.
Any help optimizing it, given that there is no bitmap index in Postgres? All 3 columns have lots of duplicates; would it help if I dropped the btree indexes on them?
SELECT t1.filterid,
t1.filterby,
t1.filtertype
FROM echo_sm.usernotificationfilters t1
WHERE t1.filtertype = 9
UNION
SELECT t1.filterid, '-1' AS filterby, 9 AS filtertype
FROM echo_sm.usernotificationfilters t1
WHERE NOT EXISTS (SELECT 1
FROM echo_sm.usernotificationfilters t2
WHERE t2.filtertype = 9 AND t2.filterid = t1.filterid);
Filtertype column is integer and the rest 2 are varchar(50). All 3 columns have separate btree indexes on them.
Explain plan:
Unique (cost=2168171.15..2201747.47 rows=3357632 width=154) (actual time=32250.340..36371.928 rows=3447159 loops=1)
-> Sort (cost=2168171.15..2176565.23 rows=3357632 width=154) (actual time=32250.337..35544.050 rows=4066447 loops=1)
Sort Key: usernotificationfilters.filterid, usernotificationfilters.filterby, usernotificationfilters.filtertype
Sort Method: external merge Disk: 142696kB
-> Append (cost=62854.08..1276308.41 rows=3357632 width=154) (actual time=150.155..16025.874 rows=4066447 loops=1)
-> Bitmap Heap Scan on usernotificationfilters (cost=62854.08..172766.46 rows=3357631 width=25) (actual time=150.154..574.297 rows=3422522 loops=1)
Recheck Cond: (filtertype = 9)
Heap Blocks: exact=39987
-> Bitmap Index Scan on index_sm_usernotificationfilters_filtertype (cost=0.00..62014.67 rows=3357631 width=0) (actual time=143.585..143.585 rows=3422522 loops=1)
Index Cond: (filtertype = 9)
-> Gather (cost=232131.85..1069965.63 rows=1 width=50) (actual time=3968.492..15133.812 rows=643925 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Anti Join (cost=231131.85..1068965.53 rows=1 width=50) (actual time=4135.235..12945.029 rows=214642 loops=3)
Hash Cond: ((usernotificationfilters_1.filterid)::text = (usernotificationfilters_1_1.filterid)::text)
-> Parallel Seq Scan on usernotificationfilters usernotificationfilters_1 (cost=0.00..106879.18 rows=3893718 width=14) (actual time=0.158..646.432 rows=3114974 loops=3)
-> Hash (cost=172766.46..172766.46 rows=3357631 width=14) (actual time=4133.991..4133.991 rows=3422522 loops=3)
Buckets: 131072 Batches: 64 Memory Usage: 3512kB
-> Bitmap Heap Scan on usernotificationfilters usernotificationfilters_1_1 (cost=62854.08..172766.46 rows=3357631 width=14) (actual time=394.775..1891.931 rows=3422522 loops=3)
Recheck Cond: (filtertype = 9)
Heap Blocks: exact=39987
-> Bitmap Index Scan on index_sm_usernotificationfilters_filtertype (cost=0.00..62014.67 rows=3357631 width=0) (actual time=383.635..383.635 rows=3422522 loops=3)
Index Cond: (filtertype = 9)
Planning time: 0.467 ms
Execution time: 36531.763 ms
The second subquery in your UNION takes about 15 seconds all by itself, and that could possibly be optimized separately from the rest of the query.
The sort implementing the duplicate removal implied by UNION takes about 20 seconds all by itself, and it spills to disk. You could increase "work_mem" until it either stops spilling to disk or starts using a hash rather than a sort. Of course, you do need to have the RAM to back up your setting of "work_mem".
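For example (a sketch; 256MB is just a guess comfortably above the 142696kB spill shown in the plan, and plain SET only affects the current session):
SET work_mem = '256MB';
-- Re-run the query and check the plan: the goal is "Sort Method: quicksort"
-- (or a HashAggregate) instead of "Sort Method: external merge Disk: 142696kB".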
A third possibility would be to not treat these steps in isolation. If you had an index that allowed the data from the 2nd branch of the union to be read already in order, it might not have to re-sort the whole thing. That would probably be an index on (filterid, filterby, filtertype).
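That would look something like this (a sketch, using the schema-qualified table name from the question):
CREATE INDEX ON echo_sm.usernotificationfilters (filterid, filterby, filtertype);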
This is a separate independent way to approach it.
I think your
WHERE NOT EXISTS (SELECT 1...
could be correctly changed to
WHERE t1.filtertype <> 9 AND NOT EXISTS (SELECT 1...
because any row with t1.filtertype = 9 would filter itself out. Is that correct? If so, you could try writing it that way, as the planner is probably not smart enough to make that transformation on its own. Once you have done that, then maybe a filtered index, something like the one below, would come in useful.
create index on echo_sm.usernotificationfilters (filterid, filterby, filtertype)
where filtertype <> 9
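Spelled out, the rewritten second branch would look something like this (a sketch; the filtertype <> 9 predicate is redundant by the reasoning above, but it is what would let the planner use the partial index):
SELECT t1.filterid, '-1' AS filterby, 9 AS filtertype
FROM echo_sm.usernotificationfilters t1
WHERE t1.filtertype <> 9
  AND NOT EXISTS (SELECT 1
                  FROM echo_sm.usernotificationfilters t2
                  WHERE t2.filtertype = 9 AND t2.filterid = t1.filterid);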
But, unless you get rid of or speed up that sort, there is only so much improvement you can get with other things.
It appears you only want to retrieve one record per filterid: a record with filtertype = 9 if available, or otherwise just another record, with dummy values for the other columns. This can be done by ordering by (filtertype<>9), filtertype and picking only the first row via row_number() = 1:
-- EXPLAIN ANALYZE
SELECT xx.filterid
, case(xx.filtertype) when 9 then xx.filterby ELSE '-1' END AS filterby
, 9 AS filtertype -- xx.filtertype
-- , xx.rn
FROM (
SELECT t1.filterid , t1.filterby , t1.filtertype
, row_number() OVER (PARTITION BY t1.filterid ORDER BY (filtertype<>9), filtertype ) AS rn
FROM userfilters t1
) xx
WHERE xx.rn = 1
-- ORDER BY xx.filterid, xx.rn
;
This query can be supported by an index on the same expression:
CREATE INDEX ON userfilters ( filterid , (filtertype<>9), filtertype ) ;
But, on my machine the UNION ALL version is faster (using the same index):
EXPLAIN ANALYZE
SELECT t1.filterid
, t1.filterby
, t1.filtertype
FROM userfilters t1
WHERE t1.filtertype = 9
UNION ALL
SELECT DISTINCT t1.filterid , '-1' AS filterby ,9 AS filtertype
FROM userfilters t1
WHERE NOT EXISTS (
SELECT *
FROM userfilters t2
WHERE t2.filtertype = 9 AND t2.filterid = t1.filterid
)
;
Even simpler (and faster!) is to use DISTINCT ON (), supported by the same expression index (note that false sorts before true, so for each filterid the filtertype = 9 rows come first):
-- EXPLAIN ANALYZE
SELECT DISTINCT ON (t1.filterid)
t1.filterid
, case(t1.filtertype) when 9 then t1.filterby ELSE '-1' END AS filterby
, 9 AS filtertype -- t1.filtertype
FROM userfilters t1
ORDER BY t1.filterid , (t1.filtertype<>9), t1.filtertype
;
I have two tables:
fccuser=# select count(*) from public.fine_collection where user_id = 5000;
count
-------
2500
(1 row)
fccuser=# select count(*) from public.police_notice where user_id = 5000;
count
-------
1011
(1 row)
And when I issue
fccuser=# select count(*)
from public.fine_collection, public.police_notice
where fine_collection.user_id = 5000
and fine_collection.user_id = police_notice.user_id;
I was expecting 2500 but I got
count
2527500
(1 row)
i.e., the Cartesian product of the two. And the EXPLAIN ANALYZE output is:
fccuser=# explain analyze verbose select count(*) from public.fine_collection, public.police_notice where fine_collection.user_id = 5000 and fine_collection.user_id = police_notice.user_id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=47657.20..47657.21 rows=1 width=0) (actual time=1991.552..1991.552 rows=1 loops=1)
Output: count(*)
-> Nested Loop (cost=0.86..39760.60 rows=3158640 width=0) (actual time=0.448..1462.155 rows=2527500 loops=1)
-> Index Only Scan using idx_user_id on public.fine_collection (cost=0.43..265.98 rows=8774 width=8) (actual time=0.213..2.448 rows=2500 loops=1)
Output: fine_collection.user_id
Index Cond: (fine_collection.user_id = 5000)
Heap Fetches: 1771
-> Materialize (cost=0.42..12.52 rows=360 width=2) (actual time=0.000..0.205 rows=1011 loops=2500)
Output: police_notice.user_id
-> Index Only Scan using idx_pn_userid on public.police_notice (cost=0.42..10.72 rows=360 width=2) (actual time=0.217..1.101 rows=1011 loops=1)
Output: police_notice.user_id
Index Cond: (police_notice.user_id = 5000)
Heap Fetches: 751
Planning time: 2.126 ms
Execution time: 1991.697 ms
(15 rows)
And the Postgres documentation states that when a join is performed on non-primary-key columns, it first creates a Cartesian product (cross join) and then applies the filter. But I think the Cartesian product would have all rows with the same user_id in my case, so I'm not sure how the filter can be applied.
The same happens with left join, inner join, etc.; only a subquery seems to give the correct result of 2500.
I am reasonably sure it does not work this way in MySQL though. Any thoughts?
Thank you
The result of your join is correct. You join every collection with user_id 5000 with every police notice with the same user_id. You have 2500 and 1011 rows joined together, and this produces 2527500 new rows.
You're using legacy join syntax, so here's the query rephrased to use a readable ANSI join.
SELECT
count(*)
FROM public.fine_collection
INNER JOIN public.police_notice
ON (fine_collection.user_id = police_notice.user_id)
WHERE fine_collection.user_id = 5000;
So, you're doing a count(*). This counts all rows in the cross product of both tables that match the join condition and where clause.
In other words, the result is the number of rows with user_id = 5000 in each table, multiplied together.
Your query does the same thing as
SELECT
(SELECT count(*) FROM public.fine_collection WHERE user_id = 5000)
*
(SELECT count(*) FROM public.police_notice WHERE user_id = 5000);
and yes, 2500 * 1011 = 2527500 so that's exactly right.
If you expect 2500, you need to join on or group on a key in fine_collection.
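For example, a semi-join avoids the row multiplication entirely (a sketch of what was presumably intended):
SELECT count(*)
FROM public.fine_collection fc
WHERE fc.user_id = 5000
  AND EXISTS (SELECT 1
              FROM public.police_notice pn
              WHERE pn.user_id = fc.user_id);
-- counts each fine_collection row at most once, giving the expected 2500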
I have a table in Redshift with a few billion rows which looks like this
CREATE TABLE channels (
    fact_key TEXT NOT NULL distkey,
    job_key BIGINT,
    channel_key TEXT NOT NULL
)
diststyle key
compound sortkey(job_key, channel_key);
When I query by job_key + channel_key my seq scan is properly restricted by the full sortkey if I use specific values for channel_key in my query.
EXPLAIN
SELECT * FROM channels scd
WHERE scd.job_key = 1 AND scd.channel_key IN ('1234', '1235', '1236', '1237')
XN Seq Scan on channels scd (cost=0.00..3178474.92 rows=3428929 width=77)
Filter: ((((channel_key)::text = '1234'::text) OR ((channel_key)::text = '1235'::text) OR ((channel_key)::text = '1236'::text) OR ((channel_key)::text = '1237'::text)) AND (job_key = 1))
However if I query against channel_key by using IN + a subquery Redshift does not use the sortkey.
EXPLAIN
SELECT * FROM channels scd
WHERE scd.job_key = 1 AND scd.channel_key IN (select distinct channel_key from other_channel_list where job_key = 14 order by 1)
XN Hash IN Join DS_DIST_ALL_NONE (cost=3.75..3540640.36 rows=899781 width=77)
Hash Cond: (("outer".channel_key)::text = ("inner".channel_key)::text)
-> XN Seq Scan on channels scd (cost=0.00..1765819.40 rows=141265552 width=77)
Filter: (job_key = 1)
-> XN Hash (cost=3.75..3.75 rows=1 width=402)
-> XN Subquery Scan "IN_subquery" (cost=0.00..3.75 rows=1 width=402)
-> XN Unique (cost=0.00..3.74 rows=1 width=29)
-> XN Seq Scan on other_channel_list (cost=0.00..3.74 rows=1 width=29)
Filter: (job_key = 14)
Is it possible to get this to work? My ultimate goal is to turn this into a view so pre-defining my list of channel_keys won't work.
Edit to provide more context:
This is part of a larger query and the results of this get hash joined to some other data. If I hard-code the channel_keys then the input to the hash join is ~2 million rows. If I use the IN condition with the subquery (nothing else changes) then the input to the hash join is 400 million rows. The total query time goes from ~40 seconds to 15+ minutes.
Does this give you a better plan than the subquery version?
with other_channels as (
select distinct channel_key from other_channel_list where job_key = 14 order by 1
)
SELECT *
FROM channels scd
JOIN other_channels ocd on scd.channel_key = ocd.channel_key
WHERE scd.job_key = 1