Indexes on join tables - PostgreSQL

When searching on Google for join table indexes, I got this question.
Now, I believe that the accepted answer is giving some false information, or I do not understand how everything works.
Given the following tables (running on PostgreSQL 9.4):
CREATE TABLE "albums" ("album_id" serial PRIMARY KEY, "album_name" text);
CREATE TABLE "artists" ("artist_id" serial PRIMARY KEY, "artist_name" text);
CREATE TABLE "albums_artists" ("album_id" integer REFERENCES "albums", "artist_id" integer REFERENCES "artists");
I was trying to replicate the scenario from the question mentioned above: first creating an index on both columns of the albums_artists table, and then one index for each column (after dropping the index on both columns).
I would have been expecting very different results when using the EXPLAIN command for a normal, traditional select like the following one:
SELECT "artists".* FROM "test"."artists"
INNER JOIN "test"."albums_artists" ON ("albums_artists"."artist_id" = "artists"."artist_id")
WHERE ("albums_artists"."album_id" = 1)
However, when actually running explain on it, I get exactly the same result for each of the cases (with one index on each column vs. one index on both columns).
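For reference, the two setups being compared would look something like this (the index names here are illustrative, not necessarily the exact ones I used):
-- setup 1: a single index covering both columns
CREATE INDEX albums_artists_both_index ON albums_artists (album_id, artist_id);
-- setup 2: one index per column (with the two-column index dropped)
CREATE INDEX albums_artists_album_id_index ON albums_artists (album_id);
CREATE INDEX albums_artists_artist_id_index ON albums_artists (artist_id);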
I've been reading the PostgreSQL documentation on indexing, and it doesn't explain the results that I am getting:
Hash Join (cost=15.05..42.07 rows=11 width=36) (actual time=0.024..0.025 rows=1 loops=1)
Hash Cond: (artists.artist_id = albums_artists.artist_id)
-> Seq Scan on artists (cost=0.00..22.30 rows=1230 width=36) (actual time=0.006..0.006 rows=1 loops=1)
-> Hash (cost=14.91..14.91 rows=11 width=4) (actual time=0.009..0.009 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Bitmap Heap Scan on albums_artists (cost=4.24..14.91 rows=11 width=4) (actual time=0.008..0.009 rows=1 loops=1)
Recheck Cond: (album_id = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on albums_artists_album_id_index (cost=0.00..4.24 rows=11 width=0) (actual time=0.005..0.005 rows=1 loops=1)
Index Cond: (album_id = 1)
I would expect to not get an Index Scan in the last step when using an index composed of 2 different columns (since I am only using one of them in the WHERE clause).
I was about to open a bug against an ORM library that adds a single index on both columns for join tables, but now I am not so sure. Can anyone help me understand why the behavior is similar in the two cases, and what the difference would actually be, if there is any?

add a NOT NULL constraint on the key columns (a tuple with NULLs would make no sense here)
add a PRIMARY KEY (forcing a UNIQUE index on the two key fields)
as support for FK lookups: add a compound index on the PK fields in reversed order
after creating/adding PKs and indexes, you may want to ANALYZE the table (only key columns have statistics)
CREATE TABLE albums_artists
( album_id integer NOT NULL REFERENCES albums (album_id)
, artist_id integer NOT NULL REFERENCES artists (artist_id)
, PRIMARY KEY (album_id, artist_id)
);
CREATE UNIQUE INDEX ON albums_artists (artist_id, album_id);
The reason behind the observed behaviour is that the planner/optimiser is information-based, driven by heuristics. Without any information about the fraction of rows that will actually be needed given the conditions, or the fraction of rows that actually matches (in the case of a JOIN), the planner makes a guess (for example: 10% for a range query). For a small query, a hash join will always be a winning scenario: it does imply fetching all tuples from both tables, but the join itself is very efficient.
For columns that are part of a key or index, statistics will be collected, so the planner can make more realistic estimates about the number of rows involved. And that will often result in an indexed plan, since it may need fewer pages to be fetched.
Foreign keys are a very special case, since the planner knows that all the values from the referring table will be present in the referred table (that is: 100%, assuming NOT NULL).
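To see what information the planner actually has, you can inspect the collected statistics (a sketch; pg_stats is the standard statistics view):
ANALYZE albums_artists;
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'albums_artists';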

Related

Postgres JSONB Retrieval is very slow

A bit of a loss here. First and foremost, I'm not a DBA, nor do I really have any experience with Postgres with the exception of what I'm doing now.
Postgres seems to choke when you want to return jsonb documents in general for anything more than a couple of hundred rows. When you try to return thousands, query performance becomes terrible. If you go even further and attempt to return multiple jsonb documents following various table joins, forget it.
Here is my scenario with some code:
I have 3 tables - all tables have jsonb models, all complex models, and 2 of them are sizeable (8 to 12 KB uncompressed). In this particular operation I need to unnest a jsonb array of elements to then work through - this gives me roughly 12k records.
Each record then contains an ID that I use to join another important table - I need to retrieve the jsonb doc from this table. From there, I need to join that table on to another (much smaller) table and also pull the doc from there based on another key.
The output is therefore several columns + 3 jsonb documents ranging from <1 KB to around 12 KB uncompressed.
Query data retrieval is effectively pointless - I've yet to see the query return data. As soon as I strip away the json doc columns, naturally the query speeds up to seconds or less. 1 jsonb document bumps the retrieval to 40 seconds in my case, adding a second takes us to 2 minutes, and adding the third takes much longer.
What am I doing wrong? Is there any way to retrieve the jsonb documents in a performant way?
SELECT x.id,
a.doc1,
b.doc2,
c.doc3
FROM ( SELECT id,
(elements.elem ->> 'a'::text)::integer AS a,
(elements.elem ->> 'b'::text)::integer AS b,
(elements.elem ->> 'c'::text)::integer AS c,
(elements.elem ->> 'd'::text)::integer AS d,
(elements.elem ->> 'e'::text)::integer AS e
FROM tab
CROSS JOIN LATERAL jsonb_array_elements(tab.doc -> 'arrayList'::text) WITH ORDINALITY elements(elem, ordinality)) x
LEFT JOIN table2 a ON x.id = a.id
LEFT JOIN table3 b ON a.other_id = b.id
LEFT JOIN table4 c ON b.other_id = c.id;
The tables themselves are fairly standard:
CREATE TABLE a (
  id integer PRIMARY KEY,   -- actual types assumed
  other_id integer,         -- foreign key into the next table
  doc jsonb
);
Nothing special about these tables, they are ids and jsonb documents
A note - we are using Postgres for a few reasons, we do need the relational aspects of PG but at the same time we need to document storage and retrieval ability for later in our workflow.
Apologies if I've not provided enough data here, I can try to add some more based on any comments
EDIT: added explain:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=465.79..96225.93 rows=11300 width=1843)
Hash Cond: (pr.table_3_id = br.id)
-> Hash Left Join (cost=451.25..95756.86 rows=11300 width=1149)
Hash Cond: (((p.doc ->> 'secondary_id'::text))::integer = pr.id)
-> Nested Loop Left Join (cost=0.44..95272.14 rows=11300 width=1029)
-> Nested Loop (cost=0.01..239.13 rows=11300 width=40)
-> Seq Scan on table_3 (cost=0.00..13.13 rows=113 width=710)
-> Function Scan on jsonb_array_elements elements (cost=0.01..1.00 rows=100 width=32)
-> Index Scan using table_1_pkey on table_1 p (cost=0.43..8.41 rows=1 width=993)
Index Cond: (((elements.elem ->> 'field_id'::text))::integer = id)
-> Hash (cost=325.36..325.36 rows=10036 width=124)
-> Seq Scan on table_2 pr (cost=0.00..325.36 rows=10036 width=124)
-> Hash (cost=13.13..13.13 rows=113 width=710)
-> Seq Scan on table_3 br (cost=0.00..13.13 rows=113 width=710)
(14 rows)
EDIT2: Sorry, been mega busy - I will try to go into more detail - first, the full explain plan (I didn't know about the additional parameters) - I'll leave in the actual table names (I wasn't sure if I was allowed to):
Hash Left Join (cost=465.79..96225.93 rows=11300 width=1726) (actual time=4.669..278.781 rows=12522 loops=1)
Hash Cond: (pr.brand_id = br.id)
Buffers: shared hit=64813
-> Hash Left Join (cost=451.25..95756.86 rows=11300 width=1032) (actual time=4.537..265.749 rows=12522 loops=1)
Hash Cond: (((p.doc ->> 'productId'::text))::integer = pr.id)
Buffers: shared hit=64801
-> Nested Loop Left Join (cost=0.44..95272.14 rows=11300 width=912) (actual time=0.240..39.480 rows=12522 loops=1)
Buffers: shared hit=49964
-> Nested Loop (cost=0.01..239.13 rows=11300 width=40) (actual time=0.230..8.177 rows=12522 loops=1)
Buffers: shared hit=163
-> Seq Scan on brand (cost=0.00..13.13 rows=113 width=710) (actual time=0.003..0.038 rows=113 loops=1)
Buffers: shared hit=12
-> Function Scan on jsonb_array_elements elements (cost=0.01..1.00 rows=100 width=32) (actual time=0.045..0.057 rows=111 loops=113)
Buffers: shared hit=151
-> Index Scan using product_variant_pkey on product_variant p (cost=0.43..8.41 rows=1 width=876) (actual time=0.002..0.002 rows=1 loops=12522)
Index Cond: (((elements.elem ->> 'productVariantId'::text))::integer = id)
Buffers: shared hit=49801
-> Hash (cost=325.36..325.36 rows=10036 width=124) (actual time=4.174..4.174 rows=10036 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 1684kB
Buffers: shared hit=225
-> Seq Scan on product pr (cost=0.00..325.36 rows=10036 width=124) (actual time=0.003..1.836 rows=10036 loops=1)
Buffers: shared hit=225
-> Hash (cost=13.13..13.13 rows=113 width=710) (actual time=0.114..0.114 rows=113 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 90kB
Buffers: shared hit=12
-> Seq Scan on brand br (cost=0.00..13.13 rows=113 width=710) (actual time=0.003..0.043 rows=113 loops=1)
Buffers: shared hit=12
Planning Time: 0.731 ms
Execution Time: 279.952 ms
(29 rows)
Your query is hard to follow for a few reasons:
Your tables are named tab, table2, table3, table4.
Your subquery parses JSON for every single row in the table, projects out some values, and then the outer query never uses those values. The only one that appears to be relevant is id.
Outer joins must be executed in order while inner joins can be freely re-arranged for performance. Without knowing the purpose of this query, it's impossible for me to determine if an outer join is appropriate or not.
The table names and column names in the execution plan do not match the query, so I'm not convinced this plan is accurate.
You did not supply a schema.
That said, I'll do my best.
Things that stand out performance-wise
No where clause
Since there is no where clause, your query will run jsonb_array_elements against every single row of tab, which is what is happening. Aside from extracting the data out of JSON and storing it into a separate column, I can't imagine much that could be done to optimize it.
Insufficient indexes to power the joins
The query plan suggests there might be a meaningful cost to the joins. Except for table1, each join is driven by a sequential scan of the tables, which means reading every row of both tables. I suspect adding indexes on each table will help. It appears you are joining on id columns, so a simple primary key constraint will improve both your data integrity and query performance.
alter table tab add primary key (id);
alter table table2 add primary key (id);
alter table table3 add primary key (id);
alter table table4 add primary key (id);
Type conversions
This part of the execution plan shows a double type conversion in your first join:
Index Cond: (((elements.elem ->> 'field_id'::text))::integer = id)
This predicate means that the id value from tab is being converted to text, then the text converted to an integer so it can match against table2.id. These conversions can be expensive in compute time and, in some cases, can prevent index usage. It's hard to advise on what to do because I don't know what the actual types are.
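One option (a sketch, assuming PostgreSQL 12+ and the anonymized names from the plan above) is to materialize the join key out of the document, so the per-row JSON extraction and the casts disappear from the join:
-- hypothetical: store the extracted key once, then index it
ALTER TABLE table_1
  ADD COLUMN secondary_id integer
  GENERATED ALWAYS AS (((doc ->> 'secondary_id'))::integer) STORED;
CREATE INDEX ON table_1 (secondary_id);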

Why doesn't PostgreSQL use all columns in a multi-column index?

I am using the extension
CREATE EXTENSION btree_gin;
I have an index that looks like this...
create index boundaries2 on rets USING GIN(source, isonlastsync, status, (geoinfo::jsonb->'boundaries'), ctcvalidto, searchablePrice, ctcSortOrder);
before I started messing with it, the index looked like this, with the same results that I'm about to share, so it seems minor variations in the index definition don't make a difference:
create index boundaries on rets USING GIN((geoinfo::jsonb->'boundaries'), source, status, isonlastsync, ctcvalidto, searchablePrice, ctcSortOrder);
I give pgsql 11 this query:
explain analyze select id from rets where ((geoinfo::jsonb->'boundaries' ?| array['High School: Torrey Pines']) AND source='SDMLS'
AND searchablePrice>=800000 AND searchablePrice<=1200000 AND YrBlt>=2000 AND EstSF>=2300
AND Beds>=3 AND FB>=2 AND ctcSortOrder>'2019-07-05 16:02:54 UTC' AND Status IN ('ACTIVE','BACK ON MARKET')
AND ctcvalidto='9999-12-31 23:59:59 UTC' AND isonlastsync='true') order by LstDate desc, ctcSortOrder desc LIMIT 3000;
with result...
Limit (cost=120.06..120.06 rows=1 width=23) (actual time=472.849..472.850 rows=1 loops=1)
-> Sort (cost=120.06..120.06 rows=1 width=23) (actual time=472.847..472.848 rows=1 loops=1)
Sort Key: lstdate DESC, ctcsortorder DESC
Sort Method: quicksort Memory: 25kB
-> Bitmap Heap Scan on rets (cost=116.00..120.05 rows=1 width=23) (actual time=472.748..472.841 rows=1 loops=1)
Recheck Cond: ((source = 'SDMLS'::text) AND (((geoinfo)::jsonb -> 'boundaries'::text) ?| '{"High School: Torrey Pines"}'::text[]) AND (ctcvalidto = '9999-12-31 23:59:59+00'::timestamp with time zone) AND (searchableprice >= 800000) AND (searchableprice <= 1200000) AND (ctcsortorder > '2019-07-05 16:02:54+00'::timestamp with time zone))
Rows Removed by Index Recheck: 93
Filter: (isonlastsync AND (yrblt >= 2000) AND (estsf >= 2300) AND (beds >= 3) AND (fb >= 2) AND (status = ANY ('{ACTIVE,"BACK ON MARKET"}'::text[])))
Rows Removed by Filter: 10
Heap Blocks: exact=102
-> Bitmap Index Scan on boundaries2 (cost=0.00..116.00 rows=1 width=0) (actual time=471.762..471.762 rows=104 loops=1)
Index Cond: ((source = 'SDMLS'::text) AND (((geoinfo)::jsonb -> 'boundaries'::text) ?| '{"High School: Torrey Pines"}'::text[]) AND (ctcvalidto = '9999-12-31 23:59:59+00'::timestamp with time zone) AND (searchableprice >= 800000) AND (searchableprice <= 1200000) AND (ctcsortorder > '2019-07-05 16:02:54+00'::timestamp with time zone))
Planning Time: 0.333 ms
Execution Time: 474.311 ms
(14 rows)
The Question
Why are the columns status and isonlastsync not used by the Bitmap Index Scan on boundaries2?
It can do that if it predicts that filtering out those rows afterwards will be faster. This is usually the case if the cardinality of the columns is very low and you will fetch a large enough portion of all rows; this is true for a boolean like isonlastsync and usually true for status columns with just a few distinct values.
Rows Removed by Filter: 10 is very little to filter out, either because your table does not hold a large number of rows or because most of them satisfy the conditions you specified for those two columns. You might try generating more data in that table or selecting rows with a rare status.
I suggest using partial indexes (with a WHERE condition), at least for the boolean value, and removing those two columns to make the index a bit more lightweight.
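A sketch of that suggestion, using the names from the question (assuming those two values are what your searches always filter on):
CREATE INDEX boundaries3 ON rets USING GIN
  (source, (geoinfo::jsonb->'boundaries'), ctcvalidto, searchablePrice, ctcSortOrder)
  WHERE isonlastsync AND status IN ('ACTIVE', 'BACK ON MARKET');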
I cannot tell you why, but I can help you optimize the query.
You should not use a multi-column GIN index, but a GIN index on only the jsonb expression and a b-tree index on the other columns.
The order of columns matters: put the ones used in an equality condition first, with the most selective at the beginning. Next, put the column with the most selective inequality or IN condition. For the following columns, the order doesn't matter, as they will only act as filters in the index scan.
Make sure that the indexes are cached in RAM.
I'd expect the query to be faster that way.
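A sketch of that split, with names taken from the question (equality columns first, then the most selective range column):
-- GIN index for the jsonb operator only
CREATE INDEX rets_boundaries_gin ON rets USING GIN ((geoinfo::jsonb->'boundaries'));
-- b-tree index for the scalar conditions
CREATE INDEX rets_scalar_btree ON rets (source, ctcvalidto, searchablePrice, ctcSortOrder);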
I think you're asking yourself the wrong question. As Lukasz answered already, PostgreSQL may find it inefficient to check all columns in the index. The problem here is that your index is too big on disk.
Probably, by trying to make this SQL faster, you added as many columns as possible to the index, but it is backfiring.
The trick is to realize how much data PostgreSQL has to read to find your records. If your index contains too much data, it will have to read a lot. Also, be aware that low-cardinality columns don't play well with BTree and common index types; generally you want to avoid indexing them.
To have an index that is as small as possible and quick to look up, you have to find the column with the most cardinality, or better, the column that will return the fewest rows for your query. My guess is "ctcSortOrder". This will be the first column of your index.
Now look at which column, after filtering by the 1st column, has the most cardinality or will filter out the most rows. I have no idea about your data, but "source" looks like a good candidate.
Try to avoid jsonb searches unless they are the primary source of the cardinality, and keep the index as a Btree. BTree is several times faster.
And like Lukasz suggested, look at partial indexes. For example, add "WHERE Status IN ('ACTIVE','BACK ON MARKET') AND isonlastsync='true'", as these two may be common to all your searches.
Bottom line is, having a simpler, smaller index is faster than having all columns indexed. And the order of the columns does matter a lot. Stick with BTree unless there is a good reason (lots of cardinality in non-btree compatible types).
If your table is huge (>10M rows) consider table partitioning, for example by ctcSortOrder. But I don't think this is your case.

GIN index not used for small table when 0 rows returned

In a Postgres 9.4 database, I created a GIN trigram index on a table called 'persons' that contains 1514 rows like the following:
CREATE INDEX persons_index_name_1 ON persons
USING gin (lower(name) gin_trgm_ops);
and a query that looks for similar names as follows:
select name, last_name from persons where lower(name) % 'thename'
So, I first issued a query with a name I knew beforehand that would have similar matches, so the explain analyze showed that the index I created was used in this query:
select name, last_name from persons where lower(name) % 'george'
And the results were the expected:
-> Bitmap Heap Scan on persons (cost=52.01..58.72 rows=2 width=26) (actual time=0.054..0.065 rows=1 loops=1)
Recheck Cond: (lower((name)::text) % 'george'::text)
Rows Removed by Index Recheck: 2
Heap Blocks: exact=1
-> Bitmap Index Scan on persons_index_name_1 (cost=0.00..52.01 rows=2 width=0) (actual time=0.032..0.032 rows=3 loops=1)
Index Cond: (lower((name)::text) % 'george'::text)
...
Execution time: 1.382 ms
So, out of curiosity, I wanted to see if the index was used when the thename parameter contained a name that didn't exist at all in the table:
select name, last_name from persons where lower(name) % 'noname'
But I saw that in this case the index was not used at all and the execution time was way higher:
-> Seq Scan on persons (cost=0.00..63.72 rows=2 width=26) (actual time=6.494..6.494 rows=0 loops=1)
Filter: (lower((name)::text) % 'noname'::text)
Rows Removed by Filter: 1514
...
Execution time: 7.387 ms
As a test, I tried the same with a GIST index and in both cases, the index was used and the execution time was like the first case above.
I went ahead and recreated the table but this time inserting 10014 rows; and I saw that in both cases above, the GIN index was used and the execution time was the best for those cases.
Why is a GIN index not used when the query above returns no results in a table with not that many rows (1514 in my case)?
Trigram indexes are case insensitive, test with:
select 'case' <-> 'CASE' AS ci1
, 'case' % 'CASE' AS ci2
, 'CASE' <-> 'CASE' AS c1
, 'CASE' % 'CASE' AS c2;
So you might as well just:
CREATE INDEX persons_index_name_1 ON persons USING gin (name gin_trgm_ops);
And:
select name, last_name from persons where name % 'thename';
As to your actual question, for small tables an index look-up might not pay. That's exactly what your added tests demonstrate. And establishing that nothing matches can be more expensive than finding some matches.
Aside from that, your cost settings and/or table statistics may not be at their respective optimum to let Postgres pick the most adequate query plans.
The expected cost numbers translate to much higher actual cost for the sequential scan than for the bitmap index scan. You may be overestimating the cost of index scans as compared to sequential scans. random_page_cost (and cpu_index_tuple_cost) may be set too high and effective_cache_size too low.
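Those settings can be changed per session to test their effect before touching the server configuration (the values below are illustrative, not recommendations):
SET random_page_cost = 1.1;        -- default 4.0 assumes mostly uncached, rotating storage
SET effective_cache_size = '4GB';  -- roughly shared_buffers plus the OS page cache
EXPLAIN ANALYZE select name, last_name from persons where lower(name) % 'noname';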
Keep PostgreSQL from sometimes choosing a bad query plan

How can I optimize this postgresql query?

Below is a postgres query that seems to be taking far longer than I would expect. The field_instances table is indexed on both form_instance_id and field_id, and the form_instances table is indexed on workflow_state. So I thought it would be a fast query, but it takes forever. Can anybody help me interpret the query plan and what kinds of indexes to add to speed it up? Thanks.
explain analyze
select form_id,form_instance_id,answer,field_id
from form_instances,field_instances
where workflow_state = 'DRqueued'
and form_instance_id = form_instances.id
and field_id = 'Book_EstimatedDueDate';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=8733.85..95692.90 rows=9277 width=29) (actual time=2550.000..15430.000 rows=11431 loops=1)
Hash Cond: (field_instances.form_instance_id = form_instances.id)
-> Bitmap Heap Scan on field_instances (cost=2681.11..89071.72 rows=47567 width=25) (actual time=850.000..13690.000 rows=51726 loops=1)
Recheck Cond: ((field_id)::text = 'Book_EstimatedDueDate'::text)
-> Bitmap Index Scan on index_field_instances_on_field_id (cost=0.00..2669.22 rows=47567 width=0) (actual time=830.000..830.000 rows=51729 loops=1)
Index Cond: ((field_id)::text = 'Book_EstimatedDueDate'::text)
-> Hash (cost=5911.34..5911.34 rows=11312 width=8) (actual time=1590.000..1590.000 rows=11431 loops=1)
-> Bitmap Heap Scan on form_instances (cost=511.94..5911.34 rows=11312 width=8) (actual time=720.000..1570.000 rows=11431 loops=1)
Recheck Cond: ((workflow_state)::text = 'DRqueued'::text)
-> Bitmap Index Scan on index_form_instances_on_workflow_state (cost=0.00..509.11 rows=11312 width=0) (actual time=650.000..650.000 rows=11509 loops=1)
Index Cond: ((workflow_state)::text = 'DRqueued'::text)
Total runtime: 15430.000 ms
(12 rows)
When you say "the field_instances table is indexed on both form_instance_id and field_id", do you mean that there are separate indexes on form_instance_id and field_id on that table?
Try dropping the index on form_instance_id and put a concatenated index on (form_instance_id, field_id).
An index works by giving you a quick lookup that tells you where the rows are that match your index. It then has to fetch through those rows to do what you want. So you always want your index to be as specific as possible. If you put two indexes on the table, you'll have two different ways to do a lookup, but a query will usually only take advantage of one of them. If you put a concatenated index on the table, you'll be able to look up on the first field in the index, the first two fields, etc efficiently. (So a concatenated index on (a, b) gives you fast lookups on a, even faster lookups on both a and b, but doesn't help you look things up on b)
Right now it is figuring out all possible rows in form_instances that have the right state. It separately figures out all of the field_instances that have the right field id. It then does a hash join. For this it makes a lookup hash from one result set, and scans the other for matches.
With my suggestion it should figure out all possible form_instances of interest. It will then go to the index, and figure out all of the field_instances that match on both the form instance and field id, and then it will find exactly the results of interest. Because the index is more specific, the database will have fewer rows of data to deal with to process your query.
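Expressed as SQL, the suggestion would be something like this (the name of the existing single-column index is an assumption):
DROP INDEX index_field_instances_on_form_instance_id;  -- assumed name
CREATE INDEX index_field_instances_on_form_instance_id_and_field_id
  ON field_instances (form_instance_id, field_id);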
http://explain.depesz.com is a fantastic online tool that helps you identify the hot spots visually. I pasted your results into the tool and got this analysis: http://explain.depesz.com/s/VIk
It's hard to tell anything specifically without seeing your tables and indexes, however.
I'd need to know the data you have in your table; however, just from looking at the SQL and column names, I would recommend:
do you really need an index on workflow_state? Assuming its values can't be very unique, it might not improve SELECTs, but it will slow down INSERTs and UPDATEs...
try making the field_id check the first condition in your WHERE clause

Postgresql index on xpath expression gives no speed up

We are trying to create OEBS-analog functionality in PostgreSQL. Let's say we have a form constructor and need to store form results in the database (e.g. email bodies). In Oracle you could use a table with ~150 columns (and some mapping stored elsewhere) to store each field in a separate column. But in contrast to Oracle, we would like to store the whole form in a PostgreSQL xml field.
The example of the tree is
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<object_id>2</object_id>
<pack_form_id>23</pack_form_id>
<prod_form_id>34</prod_form_id>
</row>
We would like to search through this field.
Test table contains 400k rows and the following select executes in 90 seconds:
select *
from params
where (xpath('//prod_form_id/text()'::text, xmlvalue))[1]::text::int=34
So I created this index:
create index prod_form_idx
ON params using btree(
((xpath('//prod_form_id/text()'::text, xmlvalue))[1]::text::int)
);
And it made no difference - still 90 seconds execution time. The EXPLAIN plan shows this:
Bitmap Heap Scan on params (cost=40.29..6366.44 rows=2063 width=292)
Recheck Cond: ((((xpath('//prod_form_id/text()'::text, xmlvalue, '{}'::text[]))[1])::text)::integer = 34)
-> Bitmap Index Scan on prod_form_idx (cost=0.00..39.78 rows=2063 width=0)
Index Cond: ((((xpath('//prod_form_id/text()'::text, xmlvalue, '{}'::text[]))[1])::text)::integer = 34)
I am not a great plan interpreter, so I suppose this means the index is being used. The question is: where's all the speed? And what can I do to optimize this kind of query?
Well, at least the index is used. You get a bitmap index scan instead of a normal index scan though, which means the xpath() function will be called lots of times.
Let's do a little check :
CREATE TABLE foo ( id serial primary key, x xml, h hstore );
insert into foo (x,h) select XMLPARSE( CONTENT '<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<object_id>2</object_id>
<pack_form_id>' || n || '</pack_form_id>
<prod_form_id>34</prod_form_id>
</row>' ),
('object_id=>2,prod_form_id=>34,pack_form_id=>'||n)::hstore
FROM generate_series( 1,100000 ) n;
test=> EXPLAIN ANALYZE SELECT count(*) FROM foo;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Aggregate (cost=4821.00..4821.01 rows=1 width=0) (actual time=24.694..24.694 rows=1 loops=1)
-> Seq Scan on foo (cost=0.00..4571.00 rows=100000 width=0) (actual time=0.006..13.996 rows=100000 loops=1)
Total runtime: 24.730 ms
test=> explain analyze select * from foo where (h->'pack_form_id')='123';
QUERY PLAN
----------------------------------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..5571.00 rows=500 width=68) (actual time=0.075..48.763 rows=1 loops=1)
Filter: ((h -> 'pack_form_id'::text) = '123'::text)
Total runtime: 36.808 ms
test=> explain analyze select * from foo where ((xpath('//pack_form_id/text()'::text, x))[1]::text) = '123';
QUERY PLAN
------------------------------------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..5071.00 rows=500 width=68) (actual time=4.271..3368.838 rows=1 loops=1)
Filter: (((xpath('//pack_form_id/text()'::text, x, '{}'::text[]))[1])::text = '123'::text)
Total runtime: 3368.865 ms
As we can see,
scanning the whole table with count(*) takes 25 ms
extracting one key/value from a hstore adds a small extra cost, about 0.12 µs/row
extracting one key/value from a xml using xpath adds a huge cost, about 33 µs/row
Conclusions :
xml is slow (but everyone knows that)
if you want to put a flexible key/value store in a column, use hstore
Also since your xml data is pretty big it will be toasted (compressed and stored out of the main table). This makes the rows in the main table much smaller, hence more rows per page, which reduces the efficiency of bitmap scans since all rows on a page have to be rechecked.
You can fix this though. For some reason the xpath() function (which is very slow, since it handles xml) has the same cost (1 unit) as, say, the integer operator "+"...
update pg_proc set procost=1000 where proname='xpath';
You may need to tweak the cost value. When given the right info, the planner knows xpath is slow and will avoid a bitmap index scan, using an index scan instead, which doesn't need rechecking the condition for all rows on a page.
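Rather than updating pg_proc directly, the same thing can be done with supported DDL (the signature here matches the three-argument form shown in the plans above):
ALTER FUNCTION xpath(text, xml, text[]) COST 1000;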
Note that this does not solve the row estimates problem. Since you can't ANALYZE the inside of the xml (or hstore) you get default estimates for the number of rows (here, 500). So, the planner may be completely wrong and choose a catastrophic plan if some joins are involved. The only solution to this is to use proper columns.