We're building a translation editor, and one of the main use cases is finding similar translations in the database. The main entities are segment, translation_record, and user. A segment can be either a source or a target segment (text), with translation_record being the connecting entity.
For similarity we use pg_trgm.
We have these indices implemented:
CREATE INDEX IF NOT EXISTS segment_content_gin ON segments USING gin (content gin_trgm_ops);
CREATE INDEX IF NOT EXISTS segment_language_id_idx ON segments USING btree (language_id);
CREATE INDEX IF NOT EXISTS translation_record_language_combination_idx ON translation_records USING btree (language_combination);
This is the query we use (note the Ruby-style string interpolation):
SET pg_trgm.similarity_threshold TO 0.#{sim_score};
SELECT SIMILARITY(segments.content, '#{source_for_lookup}') AS similarity,
translation_records.id AS translation_record_id,
translation_records.source_segment_id AS source_segment_id,
segments.content AS source_segment_content,
translation_records.target_segment_id AS target_segment_id,
target_segments.content AS target_segment_content,
creators.username AS created_by_username,
updaters.username AS updated_by_username,
translation_records.created_at,
translation_records.updated_at,
translation_records.project_name,
translation_records.import_comment,
translation_records.style_id,
translation_records.domain_id,
segments.language_id AS source_language_id,
target_segments.language_id AS target_language_id
FROM segments
JOIN translation_records
ON segments.id = translation_records.source_segment_id
JOIN segments AS target_segments
ON translation_records.target_segment_id = target_segments.id
JOIN users AS creators
ON translation_records.created_by = creators.id
LEFT JOIN users AS updaters
ON translation_records.updated_by = updaters.id
WHERE segments.content % '#{source_for_lookup}'
AND translation_records.language_combination = '#{lang_lookup_combo}'
ORDER BY SIMILARITY(segments.content, '#{source_for_lookup}') DESC
LIMIT #{max_results};
The execution time on my dev laptop with 4.7M segments is around 400 ms. My question: can I further optimize this query by using the joins and WHERE clause differently, or by making any other changes?
EDIT: explain (buffers, analyze) with ordering by similarity
"Limit (cost=59749.56..59750.14 rows=5 width=356) (actual time=458.808..462.364 rows=2 loops=1)"
" Buffers: shared hit=15821 read=37693"
" I/O Timings: read=58.698"
" -> Gather Merge (cost=59749.56..59774.99 rows=218 width=356) (actual time=458.806..462.360 rows=2 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" Buffers: shared hit=15821 read=37693"
" I/O Timings: read=58.698"
" -> Sort (cost=58749.53..58749.81 rows=109 width=356) (actual time=434.602..434.606 rows=1 loops=3)"
" Sort Key: (similarity(segments.content, 'Coop Himmelb(l)au, Vienna, Austria'::text)) DESC"
" Sort Method: quicksort Memory: 25kB"
" Buffers: shared hit=15821 read=37693"
" I/O Timings: read=58.698"
" Worker 0: Sort Method: quicksort Memory: 25kB"
" Worker 1: Sort Method: quicksort Memory: 25kB"
" -> Hash Left Join (cost=4326.38..58747.72 rows=109 width=356) (actual time=433.628..434.588 rows=1 loops=3)"
" Hash Cond: (translation_records.updated_by = updaters.id)"
" Buffers: shared hit=15805 read=37693"
" I/O Timings: read=58.698"
" -> Nested Loop (cost=4309.86..58730.64 rows=109 width=324) (actual time=433.603..434.562 rows=1 loops=3)"
" Buffers: shared hit=15803 read=37693"
" I/O Timings: read=58.698"
" -> Nested Loop (cost=4309.70..58727.69 rows=109 width=296) (actual time=433.593..434.551 rows=1 loops=3)"
" Buffers: shared hit=15798 read=37693"
" I/O Timings: read=58.698"
" -> Hash Join (cost=4309.27..58658.80 rows=109 width=174) (actual time=433.578..434.535 rows=1 loops=3)"
" Hash Cond: (translation_records.source_segment_id = segments.id)"
" Buffers: shared hit=15789 read=37693"
" I/O Timings: read=58.698"
" -> Parallel Seq Scan on translation_records (cost=0.00..51497.78 rows=1086382 width=52) (actual time=0.024..145.197 rows=869773 loops=3)"
" Filter: (language_combination = '2_1'::text)"
" Buffers: shared hit=225 read=37693"
" I/O Timings: read=58.698"
" -> Hash (cost=4303.61..4303.61 rows=453 width=126) (actual time=229.792..229.793 rows=2 loops=3)"
" Buckets: 1024 Batches: 1 Memory Usage: 9kB"
" Buffers: shared hit=15558"
" -> Bitmap Heap Scan on segments (cost=2575.51..4303.61 rows=453 width=126) (actual time=225.687..229.789 rows=2 loops=3)"
" Recheck Cond: (content % 'Coop Himmelb(l)au, Vienna, Austria'::text)"
" Rows Removed by Index Recheck: 63"
" Heap Blocks: exact=60"
" Buffers: shared hit=15558"
" -> Bitmap Index Scan on segment_content_gin (cost=0.00..2575.40 rows=453 width=0) (actual time=225.653..225.653 rows=65 loops=3)"
" Index Cond: (content % 'Coop Himmelb(l)au, Vienna, Austria'::text)"
" Buffers: shared hit=15378"
" -> Index Scan using segments_pkey on segments target_segments (cost=0.43..0.63 rows=1 width=126) (actual time=0.019..0.019 rows=1 loops=2)"
" Index Cond: (id = translation_records.target_segment_id)"
" Buffers: shared hit=9"
" -> Memoize (cost=0.16..0.18 rows=1 width=36) (actual time=0.012..0.013 rows=1 loops=2)"
" Cache Key: translation_records.created_by"
" Cache Mode: logical"
" Hits: 0 Misses: 1 Evictions: 0 Overflows: 0 Memory Usage: 1kB"
" Buffers: shared hit=5"
" Worker 0: Hits: 0 Misses: 1 Evictions: 0 Overflows: 0 Memory Usage: 1kB"
" -> Index Scan using users_pkey on users creators (cost=0.15..0.17 rows=1 width=36) (actual time=0.010..0.010 rows=1 loops=2)"
" Index Cond: (id = translation_records.created_by)"
" Buffers: shared hit=5"
" -> Hash (cost=12.90..12.90 rows=290 width=36) (actual time=0.014..0.014 rows=12 loops=2)"
" Buckets: 1024 Batches: 1 Memory Usage: 9kB"
" Buffers: shared hit=2"
" -> Seq Scan on users updaters (cost=0.00..12.90 rows=290 width=36) (actual time=0.010..0.011 rows=12 loops=2)"
" Buffers: shared hit=2"
"Planning:"
" Buffers: shared hit=28"
"Planning Time: 5.739 ms"
"Execution Time: 462.490 ms"
END EDIT
EDIT: explain (buffers, analyze) without ordering
When I remove the ORDER BY clause the query actually slows down. Also, I went with a GIN index because of the % operator.
"Limit (cost=4310.00..5796.68 rows=5 width=356) (actual time=777.107..780.931 rows=2 loops=1)"
" Buffers: shared hit=5519 read=37597"
" I/O Timings: read=55.820"
" -> Nested Loop Left Join (cost=4310.00..81914.70 rows=261 width=356) (actual time=777.105..780.929 rows=2 loops=1)"
" Buffers: shared hit=5519 read=37597"
" I/O Timings: read=55.820"
" -> Nested Loop (cost=4309.85..81870.96 rows=261 width=324) (actual time=777.085..780.900 rows=2 loops=1)"
" Buffers: shared hit=5519 read=37597"
" I/O Timings: read=55.820"
" -> Nested Loop (cost=4309.70..81827.91 rows=261 width=296) (actual time=777.080..780.892 rows=2 loops=1)"
" Buffers: shared hit=5515 read=37597"
" I/O Timings: read=55.820"
" -> Hash Join (cost=4309.27..81662.97 rows=261 width=174) (actual time=777.062..780.869 rows=2 loops=1)"
" Hash Cond: (translation_records.source_segment_id = segments.id)"
" Buffers: shared hit=5507 read=37597"
" I/O Timings: read=55.820"
" -> Seq Scan on translation_records (cost=0.00..70509.48 rows=2607318 width=52) (actual time=0.019..387.974 rows=2609320 loops=1)"
" Filter: (language_combination = '2_1'::text)"
" Buffers: shared hit=321 read=37597"
" I/O Timings: read=55.820"
" -> Hash (cost=4303.61..4303.61 rows=453 width=126) (actual time=229.363..229.364 rows=2 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 9kB"
" Buffers: shared hit=5186"
" -> Bitmap Heap Scan on segments (cost=2575.51..4303.61 rows=453 width=126) (actual time=225.850..229.358 rows=2 loops=1)"
" Recheck Cond: (content % 'Coop Himmelb(l)au, Vienna, Austria'::text)"
" Rows Removed by Index Recheck: 63"
" Heap Blocks: exact=60"
" Buffers: shared hit=5186"
" -> Bitmap Index Scan on segment_content_gin (cost=0.00..2575.40 rows=453 width=0) (actual time=225.817..225.817 rows=65 loops=1)"
" Index Cond: (content % 'Coop Himmelb(l)au, Vienna, Austria'::text)"
" Buffers: shared hit=5126"
" -> Index Scan using segments_pkey on segments target_segments (cost=0.43..0.63 rows=1 width=126) (actual time=0.008..0.008 rows=1 loops=2)"
" Index Cond: (id = translation_records.target_segment_id)"
" Buffers: shared hit=8"
" -> Index Scan using users_pkey on users creators (cost=0.15..0.17 rows=1 width=36) (actual time=0.002..0.002 rows=1 loops=2)"
" Index Cond: (id = translation_records.created_by)"
" Buffers: shared hit=4"
" -> Index Scan using users_pkey on users updaters (cost=0.15..0.17 rows=1 width=36) (actual time=0.000..0.000 rows=0 loops=2)"
" Index Cond: (id = translation_records.updated_by)"
"Planning:"
" Buffers: shared hit=28"
"Planning Time: 4.569 ms"
"Execution Time: 781.066 ms"
EDIT 2
Based on the first answer and all the comments I've created two additional indices: btree on translation_records.source_segment_id and on translation_records.target_segment_id. I also switched to a GiST index on segments.content and used the <-> operator for ordering. The above query actually slowed down to 4.5 seconds.
It should be noted that the searched text sits at row 1908 in the DB. Also, the similarity threshold is 0.45. The same query takes around 4.6 seconds for the last row in the DB. The ordering operator doesn't have any effect.
With the GIN index it went back to 407 ms, regardless of the ordering operator. It takes about 6 seconds for the last segment in the DB.
What I had overlooked before is that the similarity threshold has a huge impact on this. Changing it to 0.55 drops the time from 6 seconds to 1.2 seconds for the last row in the DB on my dev laptop.
END EDIT 2
Best, Seba
Because you are ordering by similarity, a GiST index should be used.
Then, in the ORDER BY, use the distance operator (<->) instead of the similarity() function, as the former makes use of the GiST index.
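A minimal sketch of that change (the GiST index name is illustrative; the interpolated values are kept from the question):

DROP INDEX IF EXISTS segment_content_gin;
CREATE INDEX segment_content_gist ON segments USING gist (content gist_trgm_ops);

-- <-> returns trigram distance (1 - similarity), so ascending order
-- yields the most similar rows first and can walk the GiST index:
SELECT SIMILARITY(segments.content, '#{source_for_lookup}') AS similarity,
       segments.id
FROM segments
WHERE segments.content % '#{source_for_lookup}'
ORDER BY segments.content <-> '#{source_for_lookup}'
LIMIT #{max_results};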
I am running PostgreSQL 13.4 and have a large table (~800M rows) for which I want to find the average and standard deviation of a column. I am running this query on two different servers running the same version of PostgreSQL with the same schema (verified by the diff tool in pgAdmin). The indexes on all the tables are identical.
The query I am running is as follows:
SELECT AVG("api_axle"."aoa") AS "mean",
STDDEV_POP("api_axle"."aoa") AS "std" FROM "api_axle"
INNER JOIN "api_train" ON ("api_axle"."train_id" = "api_train"."id")
INNER JOIN "api_direction" ON ("api_train"."direction_id" = "api_direction"."id")
INNER JOIN "api_track" ON ("api_direction"."track_id" = "api_track"."id")
INNER JOIN "api_site" ON ("api_track"."site_id" = "api_site"."id")
WHERE ("api_train"."time" >= '2022-06-12T19:43:32.164970+00:00'::timestamptz
AND ("api_train"."direction_id" = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid
OR "api_direction"."track_id" = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid
OR "api_track"."site_id" = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid
OR "api_site"."railroad_id" = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid)
AND NOT ("api_axle"."aoa" IS NULL) AND "api_axle"."bogie_id" IS NULL)
On the slow server, execution takes around 3 minutes; on the fast server, under 100 ms. Both servers have comparable hardware, although the disk on the slow one is around 2.5x slower than on the fast one.
I would appreciate any insight on what could be causing this discrepancy in query plan and performance.
EDIT:
Result of EXPLAIN (ANALYZE, BUFFERS):
Fast Server:
"Finalize Aggregate (cost=7527555.19..7527555.20 rows=1 width=64) (actual time=313.413..317.169 rows=1 loops=1)"
" Buffers: shared hit=1607 read=4627 written=651"
" -> Gather (cost=7527554.95..7527555.16 rows=2 width=64) (actual time=312.562..317.140 rows=3 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" Buffers: shared hit=1607 read=4627 written=651"
" -> Partial Aggregate (cost=7526554.95..7526554.96 rows=1 width=64) (actual time=293.727..293.762 rows=1 loops=3)"
" Buffers: shared hit=1607 read=4627 written=651"
" -> Nested Loop (cost=408.44..7526548.57 rows=1276 width=4) (actual time=201.987..289.682 rows=3212 loops=3)"
" Buffers: shared hit=1607 read=4627 written=651"
" -> Hash Join (cost=82.87..9095.86 rows=78 width=16) (actual time=201.709..264.799 rows=222 loops=3)"
" Hash Cond: (api_track.site_id = api_site.id)"
" Join Filter: ((api_train.direction_id = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid) OR (api_direction.track_id = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid) OR (api_track.site_id = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid) OR (api_site.railroad_id = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid))"
" Rows Removed by Join Filter: 8726"
" Buffers: shared hit=165 read=235 written=19"
" -> Hash Join (cost=81.10..9090.13 rows=1493 width=64) (actual time=0.794..52.878 rows=8948 loops=3)"
" Hash Cond: (api_direction.track_id = api_track.id)"
" Buffers: shared hit=99 read=235 written=19"
" -> Hash Join (cost=79.13..9083.85 rows=1493 width=48) (actual time=0.660..32.094 rows=8948 loops=3)"
" Hash Cond: (api_train.direction_id = api_direction.id)"
" Buffers: shared hit=96 read=235 written=19"
" -> Parallel Bitmap Heap Scan on api_train (cost=76.20..9076.81 rows=1493 width=32) (actual time=0.433..11.273 rows=8948 loops=3)"
" Recheck Cond: (""time"" >= '2022-06-12 19:43:32.16497+00'::timestamp with time zone)"
" Heap Blocks: exact=96"
" Buffers: shared hit=93 read=235 written=19"
" -> Bitmap Index Scan on api_train_time_7204a1a7 (cost=0.00..75.30 rows=3583 width=0) (actual time=1.187..1.188 rows=26877 loops=1)"
" Index Cond: (""time"" >= '2022-06-12 19:43:32.16497+00'::timestamp with time zone)"
" Buffers: shared hit=21 read=49"
" -> Hash (cost=1.86..1.86 rows=86 width=32) (actual time=0.219..0.223 rows=88 loops=3)"
" Buckets: 1024 Batches: 1 Memory Usage: 14kB"
" Buffers: shared hit=3"
" -> Seq Scan on api_direction (cost=0.00..1.86 rows=86 width=32) (actual time=0.010..0.111 rows=88 loops=3)"
" Buffers: shared hit=3"
" -> Hash (cost=1.43..1.43 rows=43 width=32) (actual time=0.124..0.128 rows=44 loops=3)"
" Buckets: 1024 Batches: 1 Memory Usage: 11kB"
" Buffers: shared hit=3"
" -> Seq Scan on api_track (cost=0.00..1.43 rows=43 width=32) (actual time=0.018..0.069 rows=44 loops=3)"
" Buffers: shared hit=3"
" -> Hash (cost=1.34..1.34 rows=34 width=32) (actual time=200.592..200.596 rows=35 loops=3)"
" Buckets: 1024 Batches: 1 Memory Usage: 11kB"
" Buffers: shared hit=3"
" -> Seq Scan on api_site (cost=0.00..1.34 rows=34 width=32) (actual time=200.494..200.538 rows=35 loops=3)"
" Buffers: shared hit=3"
" -> Bitmap Heap Scan on api_axle (cost=325.57..96365.91 rows=1169 width=20) (actual time=0.027..0.076 rows=14 loops=665)"
" Recheck Cond: (train_id = api_train.id)"
" Filter: ((aoa IS NOT NULL) AND (bogie_id IS NULL))"
" Rows Removed by Filter: 233"
" Heap Blocks: exact=1142"
" Buffers: shared hit=1442 read=4392 written=632"
" -> Bitmap Index Scan on api_axle_train_id_8f2bba76 (cost=0.00..325.28 rows=25743 width=0) (actual time=0.018..0.018 rows=248 loops=665)"
" Index Cond: (train_id = api_train.id)"
" Buffers: shared hit=1408 read=1254 written=177"
"Planning:"
" Buffers: shared hit=501 read=9"
"Planning Time: 9.733 ms"
"JIT:"
" Functions: 119"
" Options: Inlining true, Optimization true, Expressions true, Deforming true"
" Timing: Generation 7.455 ms, Inlining 123.240 ms, Optimization 283.648 ms, Emission 193.846 ms, Total 608.189 ms"
"Execution Time: 369.168 ms"
Slow Server:
"Finalize Aggregate (cost=15629658.70..15629658.71 rows=1 width=64) (actual time=193760.549..193863.020 rows=1 loops=1)"
" Buffers: shared hit=991 read=12278213 dirtied=6 written=5"
" -> Gather (cost=15629658.46..15629658.67 rows=2 width=64) (actual time=193753.058..193862.932 rows=3 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" Buffers: shared hit=991 read=12278213 dirtied=6 written=5"
" -> Partial Aggregate (cost=15628658.46..15628658.47 rows=1 width=64) (actual time=193727.174..193727.188 rows=1 loops=3)"
" Buffers: shared hit=991 read=12278213 dirtied=6 written=5"
" -> Hash Join (cost=15039.93..15628644.49 rows=2793 width=4) (actual time=14030.963..193705.905 rows=3216 loops=3)"
" Hash Cond: (api_track.site_id = api_site.id)"
" Join Filter: ((api_train.direction_id = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid) OR (api_direction.track_id = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid) OR (api_track.site_id = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid) OR (api_site.railroad_id = '8a8b5df2-3b6a-4d95-8680-2df461d36d7b'::uuid))"
" Rows Removed by Join Filter: 116510"
" Buffers: shared hit=991 read=12278213 dirtied=6 written=5"
" -> Hash Join (cost=15038.17..15628500.30 rows=53735 width=52) (actual time=13700.147..193347.236 rows=119726 loops=3)"
" Hash Cond: (api_direction.track_id = api_track.id)"
" Buffers: shared hit=928 read=12278213 dirtied=6 written=5"
" -> Hash Join (cost=15036.20..15628343.37 rows=53735 width=36) (actual time=13700.089..193314.073 rows=119726 loops=3)"
" Hash Cond: (api_train.direction_id = api_direction.id)"
" Buffers: shared hit=925 read=12278213 dirtied=6 written=5"
" -> Parallel Hash Join (cost=15033.27..15628192.43 rows=53735 width=20) (actual time=13700.031..193279.908 rows=119726 loops=3)"
" Hash Cond: (api_axle.train_id = api_train.id)"
" Buffers: shared hit=922 read=12278213 dirtied=6 written=5"
" -> Parallel Seq Scan on api_axle (cost=0.00..15574239.21 rows=14826618 width=20) (actual time=2.209..190831.636 rows=12222952 loops=3)"
" Filter: ((aoa IS NOT NULL) AND (bogie_id IS NULL))"
" Rows Removed by Filter: 251852992"
" Buffers: shared hit=911 read=12277904 dirtied=6 written=5"
" -> Parallel Hash (cost=14990.80..14990.80 rows=3397 width=32) (actual time=4.012..4.015 rows=8650 loops=3)"
" Buckets: 32768 (originally 8192) Batches: 1 (originally 1) Memory Usage: 2080kB"
" Buffers: shared hit=11 read=309"
" -> Parallel Bitmap Heap Scan on api_train (cost=147.61..14990.80 rows=3397 width=32) (actual time=1.218..5.323 rows=25949 loops=1)"
" Recheck Cond: (""time"" >= '2022-06-12 19:43:32.16497+00'::timestamp with time zone)"
" Heap Blocks: exact=251"
" Buffers: shared hit=11 read=309"
" -> Bitmap Index Scan on api_train_time_7204a1a7 (cost=0.00..145.57 rows=8152 width=0) (actual time=1.171..1.172 rows=25990 loops=1)"
" Index Cond: (""time"" >= '2022-06-12 19:43:32.16497+00'::timestamp with time zone)"
" Buffers: shared hit=5 read=64"
" -> Hash (cost=1.86..1.86 rows=86 width=32) (actual time=0.050..0.052 rows=88 loops=3)"
" Buckets: 1024 Batches: 1 Memory Usage: 14kB"
" Buffers: shared hit=3"
" -> Seq Scan on api_direction (cost=0.00..1.86 rows=86 width=32) (actual time=0.018..0.028 rows=88 loops=3)"
" Buffers: shared hit=3"
" -> Hash (cost=1.43..1.43 rows=43 width=32) (actual time=0.046..0.047 rows=44 loops=3)"
" Buckets: 1024 Batches: 1 Memory Usage: 11kB"
" Buffers: shared hit=3"
" -> Seq Scan on api_track (cost=0.00..1.43 rows=43 width=32) (actual time=0.030..0.035 rows=44 loops=3)"
" Buffers: shared hit=3"
" -> Hash (cost=1.34..1.34 rows=34 width=32) (actual time=324.272..324.273 rows=35 loops=3)"
" Buckets: 1024 Batches: 1 Memory Usage: 11kB"
" Buffers: shared hit=3"
" -> Seq Scan on api_site (cost=0.00..1.34 rows=34 width=32) (actual time=324.238..324.249 rows=35 loops=3)"
" Buffers: shared hit=3"
"Planning:"
" Buffers: shared hit=41 read=8"
"Planning Time: 6.060 ms"
"JIT:"
" Functions: 128"
" Options: Inlining true, Optimization true, Expressions true, Deforming true"
" Timing: Generation 11.373 ms, Inlining 119.493 ms, Optimization 521.358 ms, Emission 330.504 ms, Total 982.728 ms"
"Execution Time: 193867.165 ms"
api_train table definition:
Table "public.api_train"
Column | Type | Collation | Nullable | Default
--------------+--------------------------+-----------+----------+---------
id | uuid | | not null |
time | timestamp with time zone | | not null |
direction_id | uuid | | not null |
error_id | integer | | |
Indexes:
"api_train_pkey" PRIMARY KEY, btree (id)
"api_train_direction_id_49569dab" btree (direction_id)
"api_train_error_id_6312c8c6" btree (error_id)
"api_train_time_7204a1a7" btree ("time")
"unique_site_track_direction_time" UNIQUE CONSTRAINT, btree (direction_id, "time")
Foreign-key constraints:
"api_train_direction_id_49569dab_fk_api_direction_id" FOREIGN KEY (direction_id) REFERENCES api_direction(id) DEFERRABLE INITIALLY DEFERRED
"api_train_error_id_6312c8c6_fk_api_bg6rejectioncode_code" FOREIGN KEY (error_id) REFERENCES api_bg6rejectioncode(code) DEFERRABLE INITIALLY DEFERRED
Referenced by:
TABLE "api_axle" CONSTRAINT "api_axle_train_id_8f2bba76_fk_api_train_id" FOREIGN KEY (train_id) REFERENCES api_train(id) DEFERRABLE INITIALLY DEFERRED
TABLE "api_bogie" CONSTRAINT "api_bogie_train_id_089c4f60_fk_api_train_id" FOREIGN KEY (train_id) REFERENCES api_train(id) DEFERRABLE INITIALLY DEFERRED
TABLE "api_trainmodule" CONSTRAINT "api_trainmodule_train_id_9711466e_fk_api_train_id" FOREIGN KEY (train_id) REFERENCES api_train(id) DEFERRABLE INITIALLY DEFERRED
EDIT:
The solution that worked for me in the end was to update the statistics target for the affected columns like so:
ALTER TABLE api_axle
ALTER COLUMN aoa SET STATISTICS 500;
ANALYZE VERBOSE api_axle;
Having better statistics was enough to help the query planner choose the superior plan that it was using on the fast server.
At least part of the problem is that in the first (fast) plan, the index scan on api_axle is overestimated by a factor of 100. That probably makes the good plan look worse than it actually is (compare the total cost estimates).
You could improve the estimate by collecting more detailed statistics:
ALTER TABLE api_axle ALTER COLUMN aoa SET STATISTICS 1000;
ANALYZE api_axle;
If that isn't enough, try to create an index that makes the first type of plan seem even more attractive, in the hope that the optimizer will choose it:
CREATE INDEX ON api_axle (train_id) INCLUDE (aoa)
WHERE aoa IS NOT NULL AND bogie_id IS NULL;
Then VACUUM api_axle and try again. I am hoping for a fast index-only scan.
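After creating the index, a quick smoke test to see whether the planner picks it up (a sketch; run it on the slow server and look for an Index Only Scan):

VACUUM (ANALYZE) api_axle;
EXPLAIN (ANALYZE, BUFFERS)
SELECT AVG(aoa), STDDEV_POP(aoa)
FROM api_axle
WHERE aoa IS NOT NULL AND bogie_id IS NULL;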
I am a newbie to database optimisation. My table has around 29 million rows, and the query below takes 13 seconds. What can I do to improve performance?
The "Properties" column is an int array. I created a GIN index (ix_properties_gin) on F."Properties".
SELECT
    F."Id",
    F."Name",
    F."Url",
    F."CountryModel",
    F."Properties",
    F."PageRank",
    F."IsVerify",
    count(*) AS Counter
FROM
    public."Firms" F,
    LATERAL unnest(F."Properties") AS P
WHERE
    F."CountryId" = 1
    AND P = ANY (ARRAY[126,128])
    AND "Properties" && ARRAY[126,128]
    AND F."Deleted" = FALSE
GROUP BY
    F."Id"
ORDER BY
    F."IsVerify" DESC,
    Counter DESC,
    F."PageRank" DESC
OFFSET 0 ROWS FETCH FIRST 100 ROWS ONLY;
That's my query plan (EXPLAIN ANALYZE):
"Limit (cost=801718.65..801718.70 rows=20 width=368) (actual time=12671.277..12674.826 rows=20 loops=1)"
" -> Sort (cost=801718.65..802180.37 rows=184689 width=368) (actual time=12671.276..12674.824 rows=20 loops=1)"
" Sort Key: f.""IsVerify"" DESC, (count(*)) DESC, f.""PageRank"" DESC"
" Sort Method: top-N heapsort Memory: 47kB"
" -> GroupAggregate (cost=763260.63..796804.14 rows=184689 width=368) (actual time=12284.752..12592.010 rows=201352 loops=1)"
" Group Key: f.""Id"""
" -> Nested Loop (cost=763260.63..793110.36 rows=369378 width=360) (actual time=12284.734..12488.106 rows=205124 loops=1)"
" -> Gather Merge (cost=763260.62..784770.69 rows=184689 width=360) (actual time=12284.716..12389.961 rows=201352 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" -> Sort (cost=762260.59..762452.98 rows=76954 width=360) (actual time=12258.175..12309.931 rows=67117 loops=3)"
" Sort Key: f.""Id"""
" Sort Method: external merge Disk: 35432kB"
" Worker 0: Sort Method: external merge Disk: 35536kB"
" Worker 1: Sort Method: external merge Disk: 35416kB"
" -> Parallel Bitmap Heap Scan on ""Firms"" f (cost=1731.34..743387.12 rows=76954 width=360) (actual time=57.500..12167.222 rows=67117 loops=3)"
" Recheck Cond: (""Properties"" && '{126,128}'::integer[])"
" Rows Removed by Index Recheck: 356198"
" Filter: ((NOT ""Deleted"") AND (""CountryId"" = 1))"
" Heap Blocks: exact=17412 lossy=47209"
" -> Bitmap Index Scan on ix_properties_gin (cost=0.00..1685.17 rows=184689 width=0) (actual time=61.628..61.628 rows=201354 loops=1)"
" Index Cond: (""Properties"" && '{126,128}'::integer[])"
" -> Memoize (cost=0.01..0.14 rows=2 width=0) (actual time=0.000..0.000 rows=1 loops=201352)"
" Cache Key: f.""Properties"""
" Hits: 179814 Misses: 21538 Evictions: 0 Overflows: 0 Memory Usage: 3076kB"
" -> Function Scan on unnest p (cost=0.00..0.13 rows=2 width=0) (actual time=0.001..0.001 rows=1 loops=21538)"
" Filter: (p = ANY ('{126,128}'::integer[]))"
" Rows Removed by Filter: 6"
"Planning Time: 2.542 ms"
"Execution Time: 12675.382 ms"
That's the EXPLAIN (ANALYZE, BUFFERS) result:
"Limit (cost=793826.15..793826.20 rows=20 width=100) (actual time=12879.468..12882.414 rows=20 loops=1)"
" Buffers: shared hit=108 read=194121 written=1, temp read=3685 written=3697"
" -> Sort (cost=793826.15..794287.87 rows=184689 width=100) (actual time=12879.468..12882.412 rows=20 loops=1)"
" Sort Key: f.""IsVerify"" DESC, (count(*)) DESC, f.""PageRank"" DESC"
" Sort Method: top-N heapsort Memory: 29kB"
" Buffers: shared hit=108 read=194121 written=1, temp read=3685 written=3697"
" -> GroupAggregate (cost=755368.13..788911.64 rows=184689 width=100) (actual time=12623.980..12845.122 rows=201352 loops=1)"
" Group Key: f.""Id"""
" Buffers: shared hit=108 read=194121 written=1, temp read=3685 written=3697"
" -> Nested Loop (cost=755368.13..785217.86 rows=369378 width=92) (actual time=12623.971..12785.946 rows=205124 loops=1)"
" Buffers: shared hit=108 read=194121 written=1, temp read=3685 written=3697"
" -> Gather Merge (cost=755368.12..776878.19 rows=184689 width=120) (actual time=12623.945..12680.899 rows=201352 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" Buffers: shared hit=108 read=194121 written=1, temp read=3685 written=3697"
" -> Sort (cost=754368.09..754560.48 rows=76954 width=120) (actual time=12613.425..12624.658 rows=67117 loops=3)"
" Sort Key: f.""Id"""
" Sort Method: external merge Disk: 9848kB"
" Buffers: shared hit=108 read=194121 written=1, temp read=3685 written=3697"
" Worker 0: Sort Method: external merge Disk: 9824kB"
" Worker 1: Sort Method: external merge Disk: 9808kB"
" -> Parallel Bitmap Heap Scan on ""Firms"" f (cost=1731.34..743387.12 rows=76954 width=120) (actual time=42.098..12567.883 rows=67117 loops=3)"
" Recheck Cond: (""Properties"" && '{126,128}'::integer[])"
" Rows Removed by Index Recheck: 356198"
" Filter: ((NOT ""Deleted"") AND (""CountryId"" = 1))"
" Heap Blocks: exact=17323 lossy=47429"
" Buffers: shared hit=97 read=194118 written=1"
" -> Bitmap Index Scan on ix_properties_gin (cost=0.00..1685.17 rows=184689 width=0) (actual time=41.862..41.862 rows=201354 loops=1)"
" Index Cond: (""Properties"" && '{126,128}'::integer[])"
" Buffers: shared hit=4 read=74"
" -> Memoize (cost=0.01..0.14 rows=2 width=0) (actual time=0.000..0.000 rows=1 loops=201352)"
" Cache Key: f.""Properties"""
" Hits: 179814 Misses: 21538 Evictions: 0 Overflows: 0 Memory Usage: 3076kB"
" -> Function Scan on unnest p (cost=0.00..0.13 rows=2 width=0) (actual time=0.001..0.001 rows=1 loops=21538)"
" Filter: (p = ANY ('{126,128}'::integer[]))"
" Rows Removed by Filter: 6"
"Planning:"
" Buffers: shared hit=32 read=6 dirtied=1"
"Planning Time: 4.533 ms"
"Execution Time: 12883.604 ms"
You should increase work_mem to get rid of the lossy pages in the bitmap. I don't think this will make a big difference, because I suspect most of your time goes to reading pages from disk, and converting lossy pages to exact pages doesn't change how many pages get read (unless TOAST is involved, which I suspect it is not; how large does the "Properties" array get?). But I might be wrong, so try it and see. Also, if you turn on track_io_timing and collect your plans with EXPLAIN (ANALYZE, BUFFERS), we could immediately see whether I/O read time is the problem.
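For example (256MB is only an illustration; track_io_timing can be set per session only by a superuser, otherwise change it in postgresql.conf):

SET work_mem = '256MB';    -- big enough that the bitmap stays exact
SET track_io_timing = on;  -- adds "I/O Timings" lines to the plan output
-- then re-run the full query under EXPLAIN (ANALYZE, BUFFERS); the cut-down
-- probe below exercises just the bitmap scan on the GIN index:
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM public."Firms" WHERE "Properties" && ARRAY[126,128];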
Beyond that, this looks very hard to optimize with traditional methods. You can usually optimize ORDER BY...LIMIT by using an index to read rows already in order, but since the 2nd column in your ordering is computed dynamically, this is unlikely here. Are values within "Properties" unique? So can 126 and 128 each exist and be counted at most once per row, or can they exist and be counted multiple times?
The easiest way to optimize this might be on the app or business end. Do we really need to run this query at all, and why? What if we queried only rows where "IsVerify" is true, rather than sorting by it? If that returns only 95 rows, is it really necessary to go back and fill in 5 more with "IsVerify" false? Etc.
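If the business rules allow it, a hedged sketch of that idea (first pass verified-only; a second query with NOT "IsVerify" would top up to 100 rows if needed):

SELECT F."Id", F."Name", F."PageRank", count(*) AS Counter
FROM public."Firms" F, LATERAL unnest(F."Properties") AS P
WHERE F."CountryId" = 1
  AND P = ANY (ARRAY[126,128])
  AND F."Properties" && ARRAY[126,128]
  AND F."Deleted" = FALSE
  AND F."IsVerify"              -- filter on it instead of sorting by it
GROUP BY F."Id"
ORDER BY Counter DESC, F."PageRank" DESC
FETCH FIRST 100 ROWS ONLY;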
I have a table with 2M rows and run a query on 5 columns, all of them indexed. Still, the query execution time is high.
Query:
SELECT cmp_domain AS domain, slug, cmp_name AS company_name, prod_categories, prod_sub_categories, cmp_web_traff_rank
FROM prospects_v5.commercepedia
WHERE country = 'United States of America'
  AND 'Shopify' = ANY (technologies)
  AND is_live = true
  OR 'General Merchandise' = ANY (prod_categories)
ORDER BY cmp_web_traff_rank
LIMIT 10
OFFSET 30000;
Below is the explain plan:
" -> Gather Merge (cost=394508.12..401111.22 rows=56594 width=109) (actual time=14538.165..14557.052 rows=30010 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" -> Sort (cost=393508.10..393578.84 rows=28297 width=109) (actual time=14520.435..14523.376 rows=10175 loops=3)"
" Sort Key: cmp_web_traff_rank"
" Sort Method: external merge Disk: 3896kB"
" Worker 0: Sort Method: external merge Disk: 4056kB"
" Worker 1: Sort Method: external merge Disk: 4096kB"
" -> Parallel Seq Scan on commercepedia (cost=0.00..391415.77 rows=28297 width=109) (actual time=6.726..14439.953 rows=32042 loops=3)"
" Filter: (((country = 'United States of America'::text) AND ('Shopify'::text = ANY (technologies)) AND is_live) OR ('General Merchandise'::text = ANY (prod_categories)))"
" Rows Removed by Filter: 459792"
"Planning Time: 0.326 ms"
"Execution Time: 14559.593 ms"
I have created btree indexes on country, is_live and cmp_web_traff_rank, and GIN indexes on technologies and prod_categories.
When I use AND conditions for all columns, this is the explain plan:
" -> Gather Merge (cost=269444.76..269511.27 rows=570 width=109) (actual time=10780.530..10785.326 rows=1672 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" -> Sort (cost=268444.74..268445.45 rows=285 width=109) (actual time=10762.765..10762.862 rows=557 loops=3)"
" Sort Key: cmp_web_traff_rank"
" Sort Method: quicksort Memory: 125kB"
" Worker 0: Sort Method: quicksort Memory: 133kB"
" Worker 1: Sort Method: quicksort Memory: 124kB"
" -> Parallel Bitmap Heap Scan on commercepedia (cost=19489.58..268433.12 rows=285 width=109) (actual time=318.652..10759.284 rows=557 loops=3)"
" Recheck Cond: (country = 'United States of America'::text)"
" Rows Removed by Index Recheck: 18486"
" Filter: (is_live AND ('Shopify'::text = ANY (technologies)) AND ('General Merchandise'::text = ANY (prod_categories)))"
" Rows Removed by Filter: 80120"
" Heap Blocks: exact=18391 lossy=10838"
" -> BitmapAnd (cost=19489.58..19489.58 rows=107598 width=0) (actual time=259.181..259.183 rows=0 loops=1)"
" -> Bitmap Index Scan on idx_is_live (cost=0.00..4944.53 rows=267214 width=0) (actual time=52.584..52.584 rows=271711 loops=1)"
" Index Cond: (is_live = true)"
" -> Bitmap Index Scan on idx_country (cost=0.00..14544.45 rows=594137 width=0) (actual time=199.594..199.594 rows=593938 loops=1)"
" Index Cond: (country = 'United States of America'::text)"
"Planning Time: 0.243 ms"
"Execution Time: 10790.385 ms"
Is there any way I can improve the query performance further?
Here are some low-hanging fruit:
Try increasing work_mem to eliminate the lossy bitmap heap scan. You can inspect and set it with:
SHOW work_mem;
SET work_mem = '128MB';  -- example value; size it to your workload
If applicable, drop the index on is_live and recreate the other indexes as partial indexes conditioned on is_live (see the sketch after the next suggestion).
The plan does not use your GIN indexes right now. Try operators that can use a GIN index, for example rewrite
'Shopify' = ANY (technologies)  ->  technologies @> '{Shopify}'
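Hedged sketches of both suggestions (the new index name and column choice are assumptions; adjust to your schema):

-- Partial index: is_live alone is too unselective to be worth its own index,
-- but it works well as a partial-index predicate:
DROP INDEX idx_is_live;
CREATE INDEX idx_country_live ON prospects_v5.commercepedia (country) WHERE is_live;

-- Array containment instead of = ANY(), so the GIN index on technologies is usable:
SELECT cmp_domain AS domain, slug, cmp_name AS company_name, cmp_web_traff_rank
FROM prospects_v5.commercepedia
WHERE country = 'United States of America'
  AND technologies @> '{Shopify}'
  AND is_live
ORDER BY cmp_web_traff_rank
LIMIT 10;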
I am quite new to optimizing the speed of a SELECT, but the one below is time-consuming. I would be grateful for suggestions to improve performance.
SELECT DISTINCT p.id "pub_id",
p.submission_year,
pip.level,
mv_s.integer_value "total_citations",
1 "count_pub"
FROM publication p
JOIN organisation_association "oa" ON (oa.publication_id = p.id
AND oa.organisation_id IN (249189578,
249189824))
JOIN bfi_2017 "pip" ON (p.uuid = pip.uuid
AND pip.bfi_score > 0
AND pip.bfi_score IS NOT NULL)
LEFT JOIN metric_value mv_s ON (mv_s.name = 'citations'
AND EXISTS
(SELECT *
FROM publication_metrics pm_s
JOIN metrics m_s ON (m_s.id = pm_s.metrics_id
AND m_s.source_id = 210247389
AND pm_s.publication_id = p.id
AND mv_s.metrics_id = m_s.id)))
WHERE p.peer_review = 'true'
AND (p.type_classification_id IN (57360320,
57360322,
57360324,
57360326,
57360350))
AND p.submission_year = 2017
Execution plan:
"Unique (cost=532129954.32..532286422.32 rows=4084080 width=24) (actual time=1549616.424..1549616.582 rows=699 loops=1)"
" Buffers: shared read=27411, temp read=1774656 written=2496"
" -> Sort (cost=532129954.32..532169071.32 rows=15646800 width=24) (actual time=1549616.422..1549616.445 rows=712 loops=1)"
" Sort Key: p.id, pip.level, mv_s.integer_value"
" Sort Method: quicksort Memory: 80kB"
" Buffers: shared read=27411, temp read=1774656 written=2496"
" -> Nested Loop Left Join (cost=393.40..529618444.45 rows=15646800 width=24) (actual time=1832.122..1549614.196 rows=712 loops=1)"
" Join Filter: (SubPlan 1)"
" Rows Removed by Join Filter: 607313310"
" Buffers: shared read=27411, temp read=1774656 written=2496"
" -> Nested Loop (cost=393.40..8704.01 rows=37 width=16) (actual time=5.470..125.773 rows=712 loops=1)"
" Buffers: shared hit=20313 read=4585"
" -> Hash Join (cost=392.97..7886.65 rows=72 width=16) (actual time=5.160..77.182 rows=3417 loops=1)"
" Hash Cond: ((p.uuid)::text = (pip.uuid)::text)"
" Buffers: shared hit=2 read=3670"
" -> Bitmap Heap Scan on publication p (cost=160.30..7643.44 rows=2618 width=49) (actual time=2.335..67.546 rows=4527 loops=1)"
" Recheck Cond: (submission_year = 2017)"
" Filter: (peer_review AND (type_classification_id = ANY ('{57360320,57360322,57360324,57360326,57360350}'::bigint[])))"
" Rows Removed by Filter: 3975"
" Heap Blocks: exact=3556"
" Buffers: shared hit=2 read=3581"
" -> Bitmap Index Scan on idx_in2ix3rvuzxxf76bsipgn4l4sy (cost=0.00..159.64 rows=8430 width=0) (actual time=1.784..1.784 rows=8502 loops=1)"
" Index Cond: (submission_year = 2017)"
" Buffers: shared read=27"
" -> Hash (cost=181.61..181.61 rows=4085 width=41) (actual time=2.787..2.787 rows=4085 loops=1)"
" Buckets: 4096 Batches: 1 Memory Usage: 324kB"
" Buffers: shared read=89"
" -> Seq Scan on bfi_2017 pip (cost=0.00..181.61 rows=4085 width=41) (actual time=0.029..2.034 rows=4085 loops=1)"
" Filter: ((bfi_score IS NOT NULL) AND (bfi_score > '0'::double precision))"
" Rows Removed by Filter: 3324"
" Buffers: shared read=89"
" -> Index Only Scan using org_ass_publication_idx on organisation_association oa (cost=0.43..11.34 rows=1 width=8) (actual time=0.011..0.012 rows=0 loops=3417)"
" Index Cond: ((publication_id = p.id) AND (organisation_id = ANY ('{249189578,249189824}'::bigint[])))"
" Heap Fetches: 712"
" Buffers: shared hit=20311 read=915"
" -> Materialize (cost=0.00..53679.95 rows=845773 width=12) (actual time=0.012..93.456 rows=852969 loops=712)"
" Buffers: shared read=20873, temp read=1774656 written=2496"
" -> Seq Scan on metric_value mv_s (cost=0.00..45321.09 rows=845773 width=12) (actual time=0.043..470.590 rows=852969 loops=1)"
" Filter: ((name)::text = 'citations'::text)"
" Rows Removed by Filter: 1102878"
" Buffers: shared read=20873"
" SubPlan 1"
" -> Nested Loop (cost=0.85..16.91 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=607313928)"
" Buffers: shared read=1953"
" -> Index Scan using idx_w4wbsbxcqvjmqu64ubjlmqywdy on publication_metrics pm_s (cost=0.43..8.45 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=607313928)"
" Index Cond: (metrics_id = mv_s.metrics_id)"
" Filter: (publication_id = p.id)"
" Rows Removed by Filter: 1"
" -> Index Scan using metrics_pkey on metrics m_s (cost=0.43..8.45 rows=1 width=8) (actual time=0.027..0.027 rows=0 loops=3108)"
" Index Cond: (id = mv_s.metrics_id)"
" Filter: (source_id = 210247389)"
" Rows Removed by Filter: 1"
" Buffers: shared hit=10496 read=1953"
"Planning Time: 1.833 ms"
"Execution Time: 1549621.523 ms"
We are using PostgreSQL 9.5.2
We have 11 tables with an average of around 10K records in each table.
One of the tables contains a text column whose maximum content size is 12K characters.
When we exclude the text column from the select statement, the result comes back in around 5 seconds; when we include it, it takes around 55 seconds. If we select any other column from the same table it works fine, but as soon as we include the text column, performance goes for a toss.
All tables are inner joined.
Can you please suggest on how to solve this?
The explain output shows 378 ms, but in reality it takes around 1 minute to get the data.
When we exclude the text column from the "ic" table, we get the result in 4-5 seconds.
"Nested Loop Left Join (cost=4.04..156.40 rows=10 width=616) (actual time=3.092..377.128 rows=24118 loops=1)"
" -> Nested Loop Left Join (cost=3.90..59.92 rows=7 width=603) (actual time=2.834..110.842 rows=14325 loops=1)"
" -> Nested Loop Left Join (cost=3.76..58.56 rows=7 width=604) (actual time=2.832..101.481 rows=12340 loops=1)"
" -> Nested Loop (cost=3.62..57.19 rows=7 width=590) (actual time=2.830..90.614 rows=8436 loops=1)"
" Join Filter: (i."Id" = ic."ImId")"
" -> Nested Loop (cost=3.33..51.42 rows=7 width=210) (actual time=2.807..65.782 rows=8436 loops=1)"
" -> Nested Loop (cost=3.19..50.21 rows=7 width=187) (actual time=2.424..54.596 rows=8436 loops=1)"
" -> Nested Loop (cost=2.77..46.16 rows=7 width=175) (actual time=1.944..32.056 rows=8436 loops=1)"
" -> Nested Loop (cost=2.35..23.66 rows=5 width=87) (actual time=1.750..1.877 rows=4 loops=1)"
" -> Hash Join (cost=2.22..22.84 rows=5 width=55) (actual time=1.492..1.605 rows=4 loops=1)"
" Hash Cond: (i."ImtypId" = it."Id")"
" -> Nested Loop (cost=0.84..21.29 rows=34 width=51) (actual time=1.408..1.507 rows=30 loops=1)"
" -> Nested Loop (cost=0.56..9.68 rows=34 width=35) (actual time=1.038..1.053 rows=30 loops=1)"
" -> Index Only Scan using ev_query on "table_Ev" e (cost=0.28..4.29 rows=1 width=31) (actual time=0.523..0.523 rows=1 loops=1)"
" Index Cond: ("Id" = 1301)"
" Heap Fetches: 0"
" -> Index Only Scan using asmitm_query on "table_AsmItm" ai (cost=0.28..5.07 rows=31 width=8) (actual time=0.499..0.508 rows=30 loops=1)"
" Index Cond: (("AsmId" = e."AsmId") AND ("IsActive" = true))"
" Filter: "IsActive""
" Heap Fetches: 0"
" -> Index Only Scan using itm_query on "table_Itm" i (cost=0.28..0.33 rows=1 width=16) (actual time=0.014..0.014 rows=1 loops=30)"
" Index Cond: ("Id" = ai."ImId")"
" Heap Fetches: 0"
" -> Hash (cost=1.33..1.33 rows=4 width=12) (actual time=0.026..0.026 rows=4 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 9kB"
" -> Seq Scan on "ItmTyp" it (cost=0.00..1.33 rows=4 width=12) (actual time=0.013..0.018 rows=4 loops=1)"
" Filter: ("ParentId" = 12)"
" Rows Removed by Filter: 22"
" -> Index Only Scan using jur_query on "table_Jur" j (cost=0.14..0.15 rows=1 width=36) (actual time=0.065..0.066 rows=1 loops=4)"
" Index Cond: ("Id" = i."JurId")"
" Heap Fetches: 4"
" -> Index Scan using pwsres_evid_ImId_canid_query on "table_PwsRes" p (cost=0.42..3.78 rows=72 width=92) (actual time=0.056..6.562 rows=2109 loops=4)"
" Index Cond: (("EvId" = 1301) AND ("ImId" = i."Id"))"
" -> Index Only Scan using user_query on "table_User" u (cost=0.42..0.57 rows=1 width=16) (actual time=0.002..0.002 rows=1 loops=8436)"
" Index Cond: ("Id" = p."CanId")"
" Heap Fetches: 0"
" -> Index Only Scan using ins_query on "table_Ins" ins (cost=0.14..0.16 rows=1 width=31) (actual time=0.001..0.001 rows=1 loops=8436)"
" Index Cond: ("Id" = u."InsId")"
" Heap Fetches: 0"
" -> Index Scan using "IX_ItmCont_ImId" on "table_ItmCont" ic (cost=0.29..0.81 rows=1 width=392) (actual time=0.002..0.002 rows=1 loops=8436)"
" Index Cond: ("ImId" = p."ImId")"
" Filter: ("ContTyp" = 'CP'::text)"
" Rows Removed by Filter: 1"
" -> Index Scan using "IX_FreDetail_FreId" on "table_FreDetail" f (cost=0.14..0.18 rows=2 width=22) (actual time=0.000..0.001 rows=1 loops=8436)"
" Index Cond: ("FreId" = p."FreId")"
" -> Index Scan using "IX_DurDetail_DurId" on "table_DurDetail" d (cost=0.14..0.17 rows=2 width=7) (actual time=0.000..0.000 rows=0 loops=12340)"
" Index Cond: ("DurId" = p."DurId")"
" -> Index Scan using "IX_DruConsRouteDetail_DruConsRouId" on "table_DruConsRouDetail" dr (cost=0.14..0.18 rows=2 width=21) (actual time=0.001..0.001 rows=1 loops=14325)"
" Index Cond: ("DruConsRouteId" = p."RouteId")"
" SubPlan 1"
" -> Index Only Scan using asm_query on "table_Asm" (cost=0.14..8.16 rows=1 width=26) (actual time=0.001..0.001 rows=1 loops=24118)"
" Index Cond: ("Id" = e."AsmId")"
" Heap Fetches: 24118"
" SubPlan 2"
" -> Seq Scan on "ItmTyp" ity (cost=0.00..1.33 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=24118)"
" Filter: ("Id" = it."ParentId")"
" Rows Removed by Filter: 25"
"Planning time: 47.056 ms"
"Execution time: 378.229 ms"
If the explain analyze output is taking 378ms, that is how long the query is taking and there's probably not a lot of room for improvement there. If it's taking 1 minute to transfer and load the data, you need to work on that end.
If you're trying to view very wide rows in psql or pgAdmin, it can take some time to calculate the row widths or render the HTML, but that has nothing to do with query performance.
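One hedged way to separate the two costs in psql ("table_ItmCont" is the wide table from the plan):

\timing on
EXPLAIN ANALYZE SELECT * FROM "table_ItmCont";  -- server-side execution only; rows are discarded
SELECT * FROM "table_ItmCont";                  -- adds network transfer and client-side rendering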