Postgres: order of index

Does the order of an index (ASC or DESC) impact a SELECT with ORDER BY in the opposite direction?
For example, suppose we have a table:
CREATE TABLE public."Comments"
(
id integer NOT NULL DEFAULT nextval('"Comments_id_seq"'::regclass),
child_count integer DEFAULT 0,
comment text COLLATE pg_catalog."default",
CONSTRAINT "Comments_pkey" PRIMARY KEY (id)
)
And index on child_count column:
CREATE INDEX child_count
ON public."Comments" USING btree
(child_count)
TABLESPACE pg_default;
(I.e., in ASC order by default.)
Then we run the statement:
SELECT id, child_count, comment
FROM public."Comments"
ORDER BY child_count DESC
OFFSET 0
LIMIT 100
Do we need to invert index direction?

You can adjust the ordering of a B-tree index by including the options ASC, DESC, NULLS FIRST, and/or NULLS LAST when creating the index; for example:
CREATE INDEX test2_info_nulls_low ON test2 (info NULLS FIRST);
CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);
via https://www.postgresql.org/docs/current/static/indexes-ordering.html
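Applied to the example above, a descending index on child_count might look like this (a sketch; the index name is illustrative):
-- Matches ORDER BY child_count DESC directly
CREATE INDEX child_count_desc
ON public."Comments" USING btree
(child_count DESC NULLS LAST)
TABLESPACE pg_default;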

Related

Simple POSTGRESQL SELECT query too slow

I have a table that stores logs from an Electronic Invoicing System web service; this is my SQL structure:
CREATE TABLE public.eis_transactions
(
id bigint NOT NULL DEFAULT nextval('eis_transactions_id_seq'::regclass),
operation_type character varying COLLATE pg_catalog."default",
sale_id integer,
delivery_note_id integer,
sale_credit_note_id integer,
debit_note_id integer,
cdc text COLLATE pg_catalog."default",
transaction_id text COLLATE pg_catalog."default",
response_code character varying COLLATE pg_catalog."default",
response_description text COLLATE pg_catalog."default",
xml text COLLATE pg_catalog."default",
response_xml text COLLATE pg_catalog."default",
response_datetime timestamp without time zone,
created timestamp without time zone,
modified timestamp without time zone,
user_id integer,
async boolean DEFAULT false,
url character varying COLLATE pg_catalog."default",
final_xml text COLLATE pg_catalog."default",
CONSTRAINT eis_transactions_pkey PRIMARY KEY (id),
CONSTRAINT eis_transactions_debit_note_id_fkey FOREIGN KEY (debit_note_id)
REFERENCES public.debit_notes (id) MATCH SIMPLE
ON UPDATE RESTRICT
ON DELETE RESTRICT,
CONSTRAINT eis_transactions_delivery_note_id_fkey FOREIGN KEY (delivery_note_id)
REFERENCES public.delivery_notes (id) MATCH SIMPLE
ON UPDATE RESTRICT
ON DELETE RESTRICT,
CONSTRAINT eis_transactions_sale_credit_note_id_fkey FOREIGN KEY (sale_credit_note_id)
REFERENCES public.sale_credit_notes (id) MATCH SIMPLE
ON UPDATE RESTRICT
ON DELETE RESTRICT,
CONSTRAINT eis_transactions_sale_id_fkey FOREIGN KEY (sale_id)
REFERENCES public.sales (id) MATCH SIMPLE
ON UPDATE RESTRICT
ON DELETE RESTRICT,
CONSTRAINT eis_transactions_user_id_fkey FOREIGN KEY (user_id)
REFERENCES public.users (id) MATCH SIMPLE
ON UPDATE RESTRICT
ON DELETE RESTRICT
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public.eis_transactions
OWNER to postgres;
-- Index: eis_transactions_id_idx
-- DROP INDEX public.eis_transactions_id_idx;
CREATE INDEX eis_transactions_id_idx
ON public.eis_transactions USING btree
(id ASC NULLS LAST)
TABLESPACE pg_default;
-- Index: eis_transactions_id_idx1
-- DROP INDEX public.eis_transactions_id_idx1;
CREATE INDEX eis_transactions_id_idx1
ON public.eis_transactions USING btree
(id ASC NULLS FIRST)
TABLESPACE pg_default;
-- Index: eis_transactions_id_idx2
-- DROP INDEX public.eis_transactions_id_idx2;
CREATE INDEX eis_transactions_id_idx2
ON public.eis_transactions USING btree
(id DESC NULLS FIRST)
TABLESPACE pg_default;
-- Index: eis_transactions_sale_id_delivery_note_id_sale_credit_note__idx
-- DROP INDEX public.eis_transactions_sale_id_delivery_note_id_sale_credit_note__idx;
CREATE INDEX eis_transactions_sale_id_delivery_note_id_sale_credit_note__idx
ON public.eis_transactions USING btree
(sale_id ASC NULLS LAST, delivery_note_id ASC NULLS LAST, sale_credit_note_id ASC NULLS LAST, debit_note_id ASC NULLS LAST, user_id ASC NULLS LAST)
TABLESPACE pg_default;
It contains ~800 rows; this is the query:
SELECT * FROM eis_transactions LIMIT 1000;
It takes more than 60 seconds to complete the query.
And this is the EXPLAIN ANALYZE result I got:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM eis_transactions LIMIT 100;
Limit  (cost=0.00..15.94 rows=100 width=1108) (actual time=0.013..0.121 rows=100 loops=1)
  Buffers: shared read=15
  ->  Seq Scan on eis_transactions  (cost=0.00..128.03 rows=803 width=1108) (actual time=0.012..0.106 rows=100 loops=1)
        Buffers: shared read=15
Total runtime: 0.180 ms
But doing a SELECT * FROM eis_transactions (with or without LIMIT) takes more than 60 seconds, while I have other tables with more than 1,000 rows that don't take nearly as long as this particular table.
What could be wrong?
Thank you!

Syntax error on upsert in PostgreSQL while using an INSERT INTO with ON CONFLICT [duplicate]

I'm getting the following error when doing the following type of insert:
Query:
INSERT INTO accounts (type, person_id) VALUES ('PersonAccount', 1) ON
CONFLICT (type, person_id) WHERE type = 'PersonAccount' DO UPDATE SET
updated_at = EXCLUDED.updated_at RETURNING *
Error:
SQL execution failed (Reason: ERROR: there is no unique or exclusion
constraint matching the ON CONFLICT specification)
I also have a unique INDEX:
CREATE UNIQUE INDEX uniq_person_accounts ON accounts USING btree (type,
person_id) WHERE ((type)::text = 'PersonAccount'::text);
The thing is that sometimes it works, but not every time. I randomly get
that exception, which is really strange. It seems that it can't access that
INDEX or it doesn't know it exists.
Any suggestion?
I'm using PostgreSQL 9.5.5.
Example while executing the code that tries to find or create an account:
INSERT INTO accounts (type, person_id, created_at, updated_at) VALUES ('PersonAccount', 69559, '2017-02-03 12:09:27.259', '2017-02-03 12:09:27.259') ON CONFLICT (type, person_id) WHERE type = 'PersonAccount' DO UPDATE SET updated_at = EXCLUDED.updated_at RETURNING *
SQL execution failed (Reason: ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification)
In this case, I'm sure that the account does not exist. Furthermore, it never outputs the error when the person already has an account. The problem is that, in some cases, it also works if there is no account yet. The query is exactly the same.
Per the docs,
All table_name unique indexes that, without regard to order, contain exactly the
conflict_target-specified columns/expressions are inferred (chosen) as arbiter
indexes. If an index_predicate is specified, it must, as a further requirement
for inference, satisfy arbiter indexes.
The docs go on to say,
[index_predicate are u]sed to allow inference of partial unique indexes
In an understated way, the docs are saying that when using a partial index and
upserting with ON CONFLICT, the index_predicate must be specified. It is not
inferred for you. I learned this
here, and the following example demonstrates this.
CREATE TABLE test.accounts (
id int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
type text,
person_id int);
CREATE UNIQUE INDEX accounts_note_idx ON test.accounts (type, person_id) WHERE ((type)::text = 'PersonAccount'::text);
INSERT INTO test.accounts (type, person_id) VALUES ('PersonAccount', 10);
so that we have:
unutbu=# select * from test.accounts;
+----+---------------+-----------+
| id |     type      | person_id |
+----+---------------+-----------+
|  1 | PersonAccount |        10 |
+----+---------------+-----------+
(1 row)
Without index_predicate we get an error:
INSERT INTO test.accounts (type, person_id) VALUES ('PersonAccount', 10) ON CONFLICT (type, person_id) DO NOTHING;
-- ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
But if instead you include the index_predicate, WHERE ((type)::text = 'PersonAccount'::text):
INSERT INTO test.accounts (type, person_id) VALUES ('PersonAccount', 10)
ON CONFLICT (type, person_id)
WHERE ((type)::text = 'PersonAccount'::text) DO NOTHING;
then there is no error and DO NOTHING is honored.
A simple solution to this error
First of all, let's see the cause of the error with a simple example. Here is a table mapping products to categories.
create table if not exists product_categories (
product_id uuid references products(product_id) not null,
category_id uuid references categories(category_id) not null,
whitelist boolean default false
);
If we use this query:
INSERT INTO product_categories (product_id, category_id, whitelist)
VALUES ('123...', '456...', TRUE)
ON CONFLICT (product_id, category_id)
DO UPDATE SET whitelist=EXCLUDED.whitelist;
This will give you the error "there is no unique or exclusion constraint matching the ON CONFLICT specification" because there is no unique constraint on product_id and category_id. There could be multiple rows having the same combination of product and category id (so there can never be a conflict on them).
Solution:
Add a unique constraint covering both product_id and category_id, like this:
create table if not exists product_categories (
product_id uuid references products(product_id) not null,
category_id uuid references categories(category_id) not null,
whitelist boolean default false,
primary key(product_id, category_id) -- This will solve the problem
-- unique(product_id, category_id) -- OR this if you already have a primary key
);
Now you can use ON CONFLICT (product_id, category_id) for both columns without any error.
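With the composite primary key in place, the same kind of upsert now has an arbiter index to infer (a sketch; the UUID values are illustrative and assume matching rows exist in products and categories):
-- Re-running this statement updates the existing row's whitelist flag
-- instead of raising the ON CONFLICT inference error
INSERT INTO product_categories (product_id, category_id, whitelist)
VALUES ('11111111-1111-1111-1111-111111111111', '22222222-2222-2222-2222-222222222222', TRUE)
ON CONFLICT (product_id, category_id)
DO UPDATE SET whitelist = EXCLUDED.whitelist;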
In short: whatever column(s) you use with ON CONFLICT must be covered by a unique constraint or unique index.
The easy way to fix it is to declare the conflicting column(s) UNIQUE.
I did not have a chance to play with UPSERT, but I think your case is covered in the docs:
Note that this means a non-partial unique index (a unique index
without a predicate) will be inferred (and thus used by ON CONFLICT)
if such an index satisfying every other criteria is available. If an
attempt at inference is unsuccessful, an error is raised.
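If the partial index is not essential to your data model, another option is a non-partial unique index, which ON CONFLICT (type, person_id) can then infer without a predicate. A sketch, assuming (type, person_id) is in fact unique across all account types, with an illustrative index name:
-- A unique index without a predicate can be inferred directly
CREATE UNIQUE INDEX accounts_type_person_idx ON accounts (type, person_id);
INSERT INTO accounts (type, person_id) VALUES ('PersonAccount', 1)
ON CONFLICT (type, person_id) DO NOTHING;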
I solved the same issue by creating one UNIQUE INDEX covering ALL the columns you want to include in the ON CONFLICT clause, rather than one UNIQUE INDEX per column.
CREATE TABLE table_name (
    element_id UUID NOT NULL DEFAULT gen_random_uuid(),  -- built in from PostgreSQL 13; older versions need the pgcrypto extension
    timestamp TIMESTAMP NOT NULL DEFAULT now(),
    col1 UUID NOT NULL,
    col2 TEXT NOT NULL,
    col3 TEXT NOT NULL,
    CONSTRAINT "primary" PRIMARY KEY (element_id),
    UNIQUE (col1, col2, col3)
);
This allows queries like:
INSERT INTO table_name (timestamp, col1, col2, col3)
VALUES (now(), 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 'string', 'string')
ON CONFLICT (col1, col2, col3)
DO UPDATE SET timestamp = EXCLUDED.timestamp, col1 = EXCLUDED.col1, col2 = EXCLUDED.col2, col3 = EXCLUDED.col3;

PostgreSQL query does not use index

Table definition is as follows:
CREATE TABLE public.the_table
(
id integer NOT NULL DEFAULT nextval('the_table_id_seq'::regclass),
report_timestamp timestamp without time zone NOT NULL,
value_id integer NOT NULL,
text_value character varying(255),
numeric_value double precision,
bool_value boolean,
dt_value timestamp with time zone,
exported boolean NOT NULL DEFAULT false,
CONSTRAINT the_table_fkey_valdef FOREIGN KEY (value_id)
REFERENCES public.value_defs (value_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE RESTRICT
)
WITH (
OIDS=FALSE
);
ALTER TABLE public.the_table
OWNER TO postgres;
Indices:
CREATE INDEX the_table_idx_id ON public.the_table USING brin (id);
CREATE INDEX the_table_idx_timestamp ON public.the_table USING btree (report_timestamp);
CREATE INDEX the_table_idx_tsvid ON public.the_table USING brin (report_timestamp, value_id);
CREATE INDEX the_table_idx_valueid ON public.the_table USING btree (value_id);
The query is:
SELECT * FROM the_table r WHERE r.value_id = 1064 ORDER BY r.report_timestamp desc LIMIT 1;
While running the query PostgreSQL does not use the_table_idx_valueid index.
Why?
If anything, this index will help:
CREATE INDEX ON the_table (value_id, report_timestamp);
Depending on the selectivity of the condition and the number of rows in the table, PostgreSQL may correctly deduce that a sequential scan plus a sort is faster than an index scan.
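To see which plan the optimizer actually picks once the index exists, you could re-run the query under EXPLAIN (a sketch; the resulting plan depends on your data and statistics):
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM the_table r
WHERE r.value_id = 1064
ORDER BY r.report_timestamp DESC
LIMIT 1;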

postgresql simple select is slow

I have a table:
CREATE TABLE my_table
(
id integer NOT NULL DEFAULT nextval('seq_my_table_id'::regclass),
fk_id1 integer NOT NULL,
fk_id2 smallint NOT NULL,
name character varying(255) NOT NULL,
description text,
currency_name character varying(3) NOT NULL,
created timestamp with time zone NOT NULL DEFAULT now(),
updated timestamp with time zone NOT NULL DEFAULT now(),
CONSTRAINT "PK_my_table_id" PRIMARY KEY (id ),
CONSTRAINT "FK_my_table_fk_id1" FOREIGN KEY (fk_id1)
REFERENCES my_table2 (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY DEFERRED,
CONSTRAINT "FK_my_table_fk_id2" FOREIGN KEY (fk_id2)
REFERENCES my_table3 (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY DEFERRED
)
WITH (
OIDS=FALSE,
autovacuum_enabled=true,
autovacuum_vacuum_threshold=50,
autovacuum_vacuum_scale_factor=0.2,
autovacuum_analyze_threshold=50,
autovacuum_analyze_scale_factor=0.1,
autovacuum_vacuum_cost_delay=20,
autovacuum_vacuum_cost_limit=200,
autovacuum_freeze_min_age=50000000,
autovacuum_freeze_max_age=200000000,
autovacuum_freeze_table_age=150000000
);
ALTER TABLE my_table
OWNER TO postgres;
CREATE INDEX my_table_fk_id1
ON my_table
USING btree
(fk_id1 );
CREATE INDEX my_table_fk_id2
ON my_table
USING btree
(fk_id2 );
Table record counts:
select count(id) from my_table; --24061
select count(id) from my_table2; --24061
select count(id) from my_table3; --123
Execution time:
select * from my_table -- ~17sec
VACUUM/ANALYZE - no effect
description - length ~4000 chars in each row
postgresql.conf - standard settings
Version: 9.1
Selecting all fields except description reduces execution time to ~1.5 sec.
How can I increase select speed when description is included?
Update:
--explain analyze select * from my_table
"Seq Scan on my_table (cost=0.00..3425.79 rows=24079 width=1015) (actual time=0.019..17.238 rows=24079 loops=1)"
"Total runtime: 18.649 ms"
The question is how to make this fast. The issue is not on the server, since it takes only ~18 ms there. The simple solution is to select fewer columns so that there is less to transfer over the network. My guess is that you have long descriptions on some rows. Leave that column off your select and try again.
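If the application only needs a preview of the long text, another option is to truncate description server-side so less data crosses the network (a sketch; the 200-character cutoff and alias are illustrative):
-- Every column except the full description, plus a short preview of it
SELECT id, fk_id1, fk_id2, name, currency_name, created, updated,
       left(description, 200) AS description_preview
FROM my_table;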

Really long-running query when using ORDER BY

I have a major issue with one of my queries:
SELECT tpostime, gispoint
FROM mytable
WHERE idterminal = 233463
ORDER BY idpos DESC
When idterminal does not exist in 'mytable', this query runs forever and I eventually get a timeout (a 'canceling statement due to user request' message, to be specific), but when I remove the ORDER BY clause, everything seems fine. Now I'm wondering: idpos is the primary key of 'mytable', therefore it's indexed, so ordering by it should be fast, I guess.
And what's important: 'mytable' weighs 3 GB.
Table and index definitions:
CREATE TABLE mytable (
idpos serial NOT NULL,
tpostime timestamp(0) without time zone,
idterminal integer DEFAULT 0,
gispoint geometry,
idtracks integer,
CONSTRAINT mytable_pkey PRIMARY KEY (idpos),
CONSTRAINT qwe FOREIGN KEY (idtracks) REFERENCES qwe (idtracks)
MATCH SIMPLE ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT abc FOREIGN KEY (idterminal) REFERENCES abc (idterminal)
MATCH SIMPLE ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT enforce_geotype_gispoint
CHECK (geometrytype(gispoint)= 'POINT'::text OR gispoint IS NULL),
CONSTRAINT enforce_srid_gispoint CHECK (srid(gispoint) = 4326)
) WITH OIDS;
CREATE INDEX idx_idterminal ON mytable USING btree (idterminal);
CREATE INDEX idx_idtracks ON mytable USING btree (idtracks);
CREATE INDEX idx_idtracks_idterminal ON mytable USING btree (idtracks, idterminal);
It looks to me like the planner estimates the selectivity of idterminal to be low enough that it chooses a full scan of mytable_pkey rather than paying the cost of sorting all the rows with idterminal = 233463.
I suggest:
CREATE INDEX idx_idterminal2 ON mytable USING btree (idterminal, idpos);
and perhaps:
DROP INDEX idx_idterminal;
You don't mention if this is a production database or not - if it is of course you will need to test the impact of the change first elsewhere.
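If it does turn out to be production, one option (a sketch) is to build the suggested index without blocking writes; note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block:
-- Same index as suggested above, built without holding a long write lock on mytable
CREATE INDEX CONCURRENTLY idx_idterminal2 ON mytable USING btree (idterminal, idpos);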
If you prefer not to change the schema, you might like to try to trick the optimizer into the path you know is best with something like the following (not tested), for 8.4 and above:
SELECT *
FROM ( SELECT tpostime, gispoint, idpos, row_number() over (order by 1)
       FROM mytable
       WHERE idterminal = 233463 ) AS sub
ORDER BY idpos DESC;
or perhaps just:
SELECT *
FROM ( SELECT tpostime, gispoint, idpos
       FROM mytable
       WHERE idterminal = 233463
       GROUP BY tpostime, gispoint, idpos ) AS sub
ORDER BY idpos DESC;
or even:
SELECT tpostime, gispoint
FROM mytable
WHERE idterminal = 233463
ORDER BY idpos*2 DESC
Do you have an index on idterminal? Try adding a composite index with both columns, (idpos, idterminal). If you look at the explain plan, what is probably happening is that it orders by idpos first, then scans to find idterminal.