Query too slow for just 4 tables with 50000 rows each - postgresql
I've been struggling for hours and I can't find out why this query takes so long (> 60 minutes). All 4 tables have fewer than 50,000 records.
Also if I remove any table (gel6, gf6 or ger6) the query takes less than 500 ms to execute. What am I doing wrong?
Explain plan:
https://explain.depesz.com/s/ldm2
SELECT COUNT(*)
FROM agroapp.ganado g
INNER JOIN (SELECT gel5.ganado_id, gel5.estado_leche
            FROM agroapp.ganado_estado_leche gel5
            INNER JOIN (SELECT MAX(gel3.ganado_estado_leche_id) ganado_estado_leche_id
                        FROM agroapp.ganado_estado_leche gel3
                        INNER JOIN (SELECT gel.ganado_id, MAX(gel.created) created
                                    FROM agroapp.ganado_estado_leche gel
                                    GROUP BY gel.ganado_id) gel2 ON (gel2.ganado_id = gel3.ganado_id AND gel2.created = gel3.created)
                        GROUP BY gel3.ganado_id) gel4 ON gel4.ganado_estado_leche_id = gel5.ganado_estado_leche_id
           ) gel6 ON gel6.ganado_id = g.ganado_id
INNER JOIN (SELECT gf5.ganado_id, gf5.fundo_id
            FROM agroapp.ganado_fundo gf5
            INNER JOIN (SELECT MAX(gf3.ganado_fundo_id) ganado_fundo_id
                        FROM agroapp.ganado_fundo gf3
                        INNER JOIN (SELECT gf.ganado_id, MAX(gf.created) created
                                    FROM agroapp.ganado_fundo gf
                                    GROUP BY gf.ganado_id) gf2 ON (gf2.ganado_id = gf3.ganado_id AND gf2.created = gf3.created)
                        GROUP BY gf3.ganado_id) gf4 ON gf4.ganado_fundo_id = gf5.ganado_fundo_id
           ) gf6 ON gf6.ganado_id = g.ganado_id
INNER JOIN (SELECT ger5.ganado_id, ger5.estado_reproductivo
            FROM agroapp.ganado_estado_reproductivo ger5
            INNER JOIN (SELECT MAX(ger3.ganado_estado_reproductivo_id) ganado_estado_reproductivo_id
                        FROM agroapp.ganado_estado_reproductivo ger3
                        INNER JOIN (SELECT ger.ganado_id, MAX(ger.created) created
                                    FROM agroapp.ganado_estado_reproductivo ger
                                    GROUP BY ger.ganado_id) ger2 ON (ger2.ganado_id = ger3.ganado_id AND ger2.created = ger3.created)
                        GROUP BY ger3.ganado_id) ger4 ON ger4.ganado_estado_reproductivo_id = ger5.ganado_estado_reproductivo_id
           ) ger6 ON ger6.ganado_id = g.ganado_id
WHERE g.organizacion_id = 21
Tables
CREATE TABLE agroapp.ganado_estado_leche
(
ganado_estado_leche_id serial NOT NULL,
organizacion_id integer NOT NULL,
isactive character(1) NOT NULL DEFAULT 'Y'::bpchar,
created timestamp without time zone NOT NULL DEFAULT now(),
createdby numeric(10,0) NOT NULL,
updated timestamp without time zone NOT NULL DEFAULT now(),
updatedby numeric(10,0) NOT NULL,
estado_leche character varying(80) NOT NULL,
ganado_id integer NOT NULL,
fecha_manejo timestamp without time zone NOT NULL,
CONSTRAINT ganado_estado_leche_pk PRIMARY KEY (ganado_estado_leche_id),
CONSTRAINT ganado_fk FOREIGN KEY (ganado_id)
REFERENCES agroapp.ganado (ganado_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
CREATE TABLE agroapp.ganado_fundo
(
ganado_fundo_id serial NOT NULL,
organizacion_id integer NOT NULL,
isactive character(1) NOT NULL DEFAULT 'Y'::bpchar,
created timestamp without time zone NOT NULL DEFAULT now(),
createdby numeric(10,0) NOT NULL,
updated timestamp without time zone NOT NULL DEFAULT now(),
updatedby numeric(10,0) NOT NULL,
fundo_id integer NOT NULL,
ganado_id integer NOT NULL,
CONSTRAINT ganado_fundo_pk PRIMARY KEY (ganado_fundo_id),
CONSTRAINT ganado_fk FOREIGN KEY (ganado_id)
REFERENCES agroapp.ganado (ganado_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
CREATE TABLE agroapp.ganado_estado_reproductivo
(
ganado_estado_reproductivo_id serial NOT NULL,
organizacion_id integer NOT NULL,
isactive character(1) NOT NULL DEFAULT 'Y'::bpchar,
created timestamp without time zone NOT NULL DEFAULT now(),
createdby numeric(10,0) NOT NULL,
updated timestamp without time zone NOT NULL DEFAULT now(),
updatedby numeric(10,0) NOT NULL,
estado_reproductivo character varying(80) NOT NULL,
ganado_id integer NOT NULL,
fecha_manejo timestamp without time zone NOT NULL,
CONSTRAINT ganado_estado_reproductivo_pk PRIMARY KEY (ganado_estado_reproductivo_id),
CONSTRAINT ganado_fk FOREIGN KEY (ganado_id)
REFERENCES agroapp.ganado (ganado_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
CREATE TABLE agroapp.ganado
(
ganado_id serial NOT NULL,
organizacion_id integer NOT NULL,
isactive character(1) NOT NULL DEFAULT 'Y'::bpchar,
created timestamp without time zone NOT NULL DEFAULT now(),
createdby numeric(10,0) NOT NULL,
updated timestamp without time zone NOT NULL DEFAULT now(),
updatedby numeric(10,0) NOT NULL,
fecha_nacimiento timestamp without time zone NOT NULL,
tipo_ganado character varying(80) NOT NULL,
diio_id integer NOT NULL,
fundo_id integer NOT NULL,
raza_id integer NOT NULL,
estado_reproductivo character varying(80) NOT NULL,
estado_leche character varying(80),
CONSTRAINT ganado_pk PRIMARY KEY (ganado_id),
CONSTRAINT diio_fk FOREIGN KEY (diio_id)
REFERENCES agroapp.diio (diio_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT fundo_fk FOREIGN KEY (fundo_id)
REFERENCES agroapp.fundo (fundo_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT raza_fk FOREIGN KEY (raza_id)
REFERENCES agroapp.raza (raza_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
Table design
This looks very much like a boolean column (yes / no):
isactive character(1) NOT NULL DEFAULT 'Y'::bpchar
If so, replace with:
isactive bool NOT NULL DEFAULT TRUE
If you might deal with multiple time zones in any way, use timestamptz instead of timestamp here:
created timestamp without time zone NOT NULL DEFAULT now(),
The default now() returns timestamptz. The assignment cast to timestamp then stores the local time according to the time zone setting of the current session. That is, the stored value changes with the time zone of the session, which is a sneaky point of failure. See:
- Ignoring time zones altogether in Rails and PostgreSQL
And:
createdby numeric(10,0) NOT NULL
et al. look like they should really be just integer. (Or maybe bigint if you really think you might burn through more than 2147483647 numbers ...)
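The type changes suggested above could be applied along these lines. This is only a sketch for one of the tables: it assumes every existing isactive value is 'Y' or 'N', that all createdby / updatedby values fit into integer, and (a pure assumption) that the stored timestamps were written in a session running with time zone UTC. Each ALTER COLUMN ... TYPE rewrites the table and takes a heavy lock, so try it on a copy first.

```sql
BEGIN;

ALTER TABLE agroapp.ganado_estado_leche
    -- The old default 'Y'::bpchar cannot be cast automatically, so drop it first.
    ALTER COLUMN isactive DROP DEFAULT,
    ALTER COLUMN isactive TYPE boolean USING (isactive = 'Y'),
    ALTER COLUMN isactive SET DEFAULT true,
    -- numeric -> integer uses the built-in assignment cast;
    -- it raises an error if a value is out of range.
    ALTER COLUMN createdby TYPE integer,
    ALTER COLUMN updatedby TYPE integer,
    -- ASSUMPTION: existing values represent UTC wall-clock time.
    ALTER COLUMN created TYPE timestamptz USING created AT TIME ZONE 'UTC',
    ALTER COLUMN updated TYPE timestamptz USING updated AT TIME ZONE 'UTC';

COMMIT;
```

Queries that compare isactive = 'Y' would have to be changed to test the boolean directly after such a migration.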
Query
Looking at the first subquery:
SELECT gel5.ganado_id, gel5.estado_leche
FROM   agroapp.ganado_estado_leche gel5
INNER  JOIN (
   SELECT MAX(gel3.ganado_estado_leche_id) ganado_estado_leche_id
   FROM   agroapp.ganado_estado_leche gel3
   INNER  JOIN (
      SELECT gel.ganado_id, MAX(gel.created) created
      FROM   agroapp.ganado_estado_leche gel
      GROUP  BY gel.ganado_id
      ) gel2 ON (gel2.ganado_id = gel3.ganado_id AND gel2.created = gel3.created)
   GROUP  BY gel3.ganado_id
   ) gel4 ON gel4.ganado_estado_leche_id = gel5.ganado_estado_leche_id
The innermost subquery gets the max. created per ganado_id, the next one the max ganado_estado_leche_id of those rows. And finally you join back and retrieve all ganado_id that appear in combination with the identified max ganado_estado_leche_id per partition. I have a hard time making sense of this, but it can be simplified to:
SELECT gel2.ganado_id
FROM   agroapp.ganado_estado_leche gel2
JOIN  (
   SELECT DISTINCT ON (ganado_id) ganado_estado_leche_id
   FROM   agroapp.ganado_estado_leche
   ORDER  BY ganado_id, created DESC NULLS LAST, ganado_estado_leche_id DESC NULLS LAST
   ) gel1 USING (ganado_estado_leche_id)
See:
Select first row in each GROUP BY group?
Looks like an incorrect query to me. Same with the rest of the query: the joins multiply rows in an odd fashion. Not sure what you are trying to count, but I doubt the query counts just that. You did not provide enough information to make sense of it.
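If the intent is "the latest row per ganado_id" (an assumption, since the question does not say), DISTINCT ON can return the wanted columns directly, and a matching multicolumn index may help the planner. A sketch for one of the three tables; the index name is made up, and whether the index is actually used depends on the data and the planner:

```sql
-- Hypothetical index name; the column order mirrors the ORDER BY below.
CREATE INDEX ganado_estado_leche_latest_idx
    ON agroapp.ganado_estado_leche (ganado_id, created DESC, ganado_estado_leche_id DESC);

-- Latest row per ganado_id; ties on created are broken by the highest id.
-- created and ganado_estado_leche_id are NOT NULL in the posted DDL,
-- so NULLS LAST is not required in this variant.
SELECT DISTINCT ON (ganado_id)
       ganado_id, estado_leche
FROM   agroapp.ganado_estado_leche
ORDER  BY ganado_id, created DESC, ganado_estado_leche_id DESC;
```

The same pattern would apply to ganado_fundo and ganado_estado_reproductivo with their respective id columns.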
Related
A view that shows the name of the server, the id of the instance and the number of active sessions (a session is active if the end timestamp is null)
CREATE TABLE instances(
    ser_name VARCHAR(20) NOT NULL,
    id INTEGER NOT NULL,
    ser_ip VARCHAR(16) NOT NULL,
    status VARCHAR(10) NOT NULL,
    creation_ts TIMESTAMP,
    CONSTRAINT instance_id PRIMARY KEY(id)
);
CREATE TABLE characters(
    nickname VARCHAR(15) NOT NULL,
    type VARCHAR(10) NOT NULL,
    c_level INTEGER NOT NULL,
    game_data VARCHAR(40) NOT NULL,
    start_ts TIMESTAMP,
    end_ts TIMESTAMP NULL,
    player_ip VARCHAR(16) NOT NULL,
    instance_id INTEGER NOT NULL,
    player_username VARCHAR(15),
    CONSTRAINT chara_nick PRIMARY KEY(nickname)
);
ALTER TABLE instances ADD CONSTRAINT ins_ser_name FOREIGN KEY(ser_name) REFERENCES servers(name);
ALTER TABLE instances ADD CONSTRAINT ins_ser_ip FOREIGN KEY(ser_ip) REFERENCES servers(ip);
ALTER TABLE characters ADD CONSTRAINT chara_inst_id FOREIGN KEY(instance_id) REFERENCES instances(id);
ALTER TABLE characters ADD CONSTRAINT chara_player_username FOREIGN KEY(player_username) REFERENCES players(username);
insert into instances values
    ('serverA','1','138.201.233.18','active','2020-10-20'),
    ('serverB','2','138.201.233.19','active','2020-10-20'),
    ('serverE','3','138.201.233.14','active','2020-10-20');
insert into characters values
    ('characterA','typeA','1','Game data of characterA','2020-07-18 02:12:12','2020-07-18 02:32:30','192.188.11.1','1','nabin123'),
    ('characterB','typeB','3','Game data of characterB','2020-07-19 02:10:12',null,'192.180.12.1','2','rabin123'),
    ('characterC','typeC','1','Game data of characterC','2020-07-18 02:12:12',null,'192.189.10.1','3','sabin123'),
    ('characterD','typeA','1','Game data of characterD','2020-07-18 02:12:12','2020-07-18 02:32:30','192.178.11.1','2','nabin123'),
    ('characterE','typeB','3','Game data of characterE','2020-07-19 02:10:12',null,'192.190.12.1','1','rabin123'),
    ('characterF','typeC','1','Game data of characterF','2020-07-18 02:12:12',null,'192.188.10.1','3','sabin123'),
    ('characterG','typeD','1','Game data of characterG','2020-07-18 02:12:12',null,'192.188.13.1','1','nabin123'),
    ('characterH','typeD','3','Game data of characterH','2020-07-19 02:10:12',null,'192.180.17.1','2','bipin123'),
    ('characterI','typeD','1','Game data of characterI','2020-07-18 02:12:12','2020-07-18 02:32:30','192.189.18.1','3','dhiraj123'),
    ('characterJ','typeD','3','Game data of characterJ','2020-07-18 02:12:12',null,'192.178.19.1','2','prabin123'),
    ('characterK','typeB','4','Game data of characterK','2020-07-19 02:10:12','2020-07-19 02:11:30','192.190.20.1','1','rabin123'),
    ('characterL','typeC','2','Game data of characterL','2020-07-18 02:12:12',null,'192.192.11.1','3','sabin123'),
    ('characterM','typeC','3','Game data of characterM','2020-07-18 02:12:12',null,'192.192.11.1','2','sabin123');
I need a view that shows the name of the server, the id of the instance and the number of active sessions (a session is active if the end timestamp is null). Is my code wrong, or is it something else? I am just starting to learn, so I am hoping for helpful answers.
My view:
create view active_sessions as
select i.ser_name, i.id, count(end_ts) as active
from instances i, characters c
where i.id=c.instance_id and c.end_ts = null
group by i.ser_name, i.id;
This does not do what you want:
where i.id = c.instance_id and c.end_ts = null
Nothing is equal to null. You need IS NULL to check a value against null.
Also, count(end_ts) will always produce 0, as we already know that end_ts is null, and count() does not count null values.
Finally, I would highly recommend using a standard join (with the ON keyword) rather than an implicit join (with a comma in the FROM clause): this old syntax from decades ago should not be used in new code.
I think that a left join is closer to what you want (it would also take into account instances that have no character at all). So:
create view active_sessions as
select i.ser_name, i.id, count(c.nickname) as active
from instances i
left join characters c on i.id = c.instance_id and c.end_ts is null
group by i.ser_name, i.id;
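The two points about null can be checked in isolation. A small sketch that needs no tables; the column aliases and the inline VALUES list are made up for illustration:

```sql
-- Comparing with = yields NULL (unknown), which a WHERE clause treats as "not true".
SELECT NULL = NULL  AS equals_null,   -- NULL
       NULL IS NULL AS is_null;       -- true

-- count(expression) skips NULL values, count(*) counts rows.
SELECT count(end_ts) AS counted_values,   -- 0
       count(*)      AS counted_rows      -- 2
FROM  (VALUES (NULL::timestamp), (NULL)) AS t(end_ts);
```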
Postgres remove duplicates (multiple columns) in order to add unique constraint
I have a table:
CREATE TABLE public.assignment (
    id integer NOT NULL,
    dining_table_id integer NOT NULL,
    guest_group_id integer NOT NULL,
    start_timestamp timestamp without time zone DEFAULT '1999-01-01 00:00:00'::timestamp without time zone NOT NULL,
    end_timestamp timestamp without time zone DEFAULT '1999-01-02 00:00:00'::timestamp without time zone NOT NULL,
    assignment_related_id text
);
When I add a unique constraint:
ALTER TABLE assignment ADD CONSTRAINT unique_assignment UNIQUE (dining_table_id, guest_group_id, start_timestamp, end_timestamp);
I get:
ERROR: could not create unique index "unique_assignment"
DETAIL: Key (dining_table_id, guest_group_id, start_timestamp, end_timestamp)=(1433, 101476, 2019-07-16 18:30:00, 2019-07-16 20:30:00) is duplicated.
So how can I delete all duplicates, i.e. rows that have the same values in the columns concerned?
DELETE FROM assignment
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (partition BY dining_table_id, guest_group_id, start_timestamp, end_timestamp ORDER BY id) AS rnum
        FROM assignment
    ) t
    WHERE t.rnum > 1
);
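Before running the DELETE, it may be worth listing which key combinations are duplicated and how many copies exist. The DELETE above keeps the row with the lowest id in each group, so this is just a preview query; the alias copies is made up:

```sql
-- Preview: every key combination that occurs more than once.
SELECT dining_table_id, guest_group_id, start_timestamp, end_timestamp,
       count(*) AS copies
FROM   assignment
GROUP  BY dining_table_id, guest_group_id, start_timestamp, end_timestamp
HAVING count(*) > 1
ORDER  BY copies DESC;
```

Once this returns no rows, the ALTER TABLE ... ADD CONSTRAINT unique_assignment statement from the question should succeed, provided no new duplicates are inserted in between.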
Query execution time increased dramatically without type cast
The query in this state takes more than 5 minutes to execute. If I remove either of the ::DATE conversions (see comments in the code), the execution time drops below 500 ms.
For example, if I change gf.created::DATE to gf.created, performance improves dramatically. The same happens if I change gtg.created::DATE to gtg.created.
Why is there such a huge difference when using both ::DATE conversions if each one performs well on its own?
SELECT gtg6.tipo_ganado, COUNT(gtg6.tipo_ganado) animales
FROM agroapp.ganado g
INNER JOIN (SELECT gf5.ganado_id, gf5.fundo_id
            FROM agroapp.ganado_fundo gf5
            INNER JOIN (SELECT MAX(gf3.ganado_fundo_id) ganado_fundo_id
                        FROM agroapp.ganado_fundo gf3
                        INNER JOIN (SELECT gf.ganado_id, MAX(gf.created) created
                                    FROM agroapp.ganado_fundo gf
                                    WHERE gf.isactive = 'Y'
                                    -- HERE CHANGING gf.created::DATE TO gf.created
                                    AND gf.created::DATE <= '20181030'::DATE
                                    GROUP BY gf.ganado_id) gf2 ON (gf2.ganado_id = gf3.ganado_id AND gf2.created = gf3.created)
                        WHERE gf3.isactive = 'Y'
                        GROUP BY gf3.ganado_id) gf4 ON gf4.ganado_fundo_id = gf5.ganado_fundo_id
           ) gf6 ON gf6.ganado_id = g.ganado_id
INNER JOIN (SELECT gtg5.ganado_id, gtg5.tipo_ganado
            FROM agroapp.ganado_tipo_ganado gtg5
            INNER JOIN (SELECT MAX(gtg3.ganado_tipo_ganado_id) ganado_tipo_ganado_id
                        FROM agroapp.ganado_tipo_ganado gtg3
                        INNER JOIN (SELECT gtg.ganado_id, MAX(gtg.created) created
                                    FROM agroapp.ganado_tipo_ganado gtg
                                    WHERE gtg.isactive = 'Y'
                                    -- OR HERE CHANGING gtg.created::DATE TO gtg.created
                                    AND gtg.created::DATE <= '20181030'::DATE
                                    GROUP BY gtg.ganado_id) gtg2 ON (gtg2.ganado_id = gtg3.ganado_id AND gtg2.created = gtg3.created)
                        WHERE gtg3.isactive = 'Y'
                        GROUP BY gtg3.ganado_id) gtg4 ON gtg4.ganado_tipo_ganado_id = gtg5.ganado_tipo_ganado_id
           ) gtg6 ON gtg6.ganado_id = g.ganado_id
WHERE g.organizacion_id = 21
GROUP BY gtg6.tipo_ganado
ORDER BY gtg6.tipo_ganado;
Table definitions
All 3 tables have around 50000 rows:
CREATE TABLE agroapp.ganado_fundo
(
    ganado_fundo_id serial NOT NULL,
    organizacion_id integer NOT NULL,
    isactive character(1) NOT NULL DEFAULT 'Y'::bpchar,
    created timestamp without time zone NOT NULL DEFAULT now(),
    createdby numeric(10,0) NOT NULL,
    updated timestamp without time zone NOT NULL DEFAULT now(),
    updatedby numeric(10,0) NOT NULL,
    fundo_id integer NOT NULL,
    ganado_id integer NOT NULL,
    CONSTRAINT ganado_fundo_pk PRIMARY KEY (ganado_fundo_id),
    CONSTRAINT ganado_fk FOREIGN KEY (ganado_id)
        REFERENCES agroapp.ganado (ganado_id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
)
CREATE TABLE agroapp.ganado_tipo_ganado
(
    ganado_tipo_ganado_id serial NOT NULL,
    organizacion_id integer NOT NULL,
    isactive character(1) NOT NULL DEFAULT 'Y'::bpchar,
    created timestamp without time zone NOT NULL DEFAULT now(),
    createdby numeric(10,0) NOT NULL,
    updated timestamp without time zone NOT NULL DEFAULT now(),
    updatedby numeric(10,0) NOT NULL,
    tipo_ganado character varying(80) NOT NULL,
    ganado_id integer NOT NULL,
    CONSTRAINT ganado_tipo_ganado_pk PRIMARY KEY (ganado_tipo_ganado_id),
    CONSTRAINT ganado_fk FOREIGN KEY (ganado_id)
        REFERENCES agroapp.ganado (ganado_id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
)
CREATE TABLE agroapp.ganado
(
    ganado_id serial NOT NULL,
    organizacion_id integer NOT NULL,
    isactive character(1) NOT NULL DEFAULT 'Y'::bpchar,
    created timestamp without time zone NOT NULL DEFAULT now(),
    createdby numeric(10,0) NOT NULL,
    updated timestamp without time zone NOT NULL DEFAULT now(),
    updatedby numeric(10,0) NOT NULL,
    fecha_nacimiento timestamp without time zone NOT NULL,
    tipo_ganado character varying(80) NOT NULL,
    diio_id integer NOT NULL,
    fundo_id integer NOT NULL,
    raza_id integer NOT NULL,
    estado_reproductivo character varying(80) NOT NULL,
    estado_leche character varying(80),
    CONSTRAINT ganado_pk PRIMARY KEY (ganado_id),
    CONSTRAINT diio_fk FOREIGN KEY (diio_id)
        REFERENCES agroapp.diio (diio_id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION,
    CONSTRAINT fundo_fk FOREIGN KEY (fundo_id)
        REFERENCES agroapp.fundo (fundo_id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION,
    CONSTRAINT raza_fk FOREIGN KEY (raza_id)
        REFERENCES agroapp.raza (raza_id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
)
Most probably because the forced cast voids the option to use an index on the column agroapp.ganado_fundo.created.
Guessing (for lack of information) that gf.created is of type timestamp with time zone (or timestamp), replace
AND gf.created::DATE <= '20181030'::DATE
with:
AND gf.created < '2018-10-31'::timestamp -- match the data type of the column!
to achieve the same result, but with index support.
If you operate with timestamptz, be aware of the implications for the date: it depends on the current time zone. Details:
Ignoring time zones altogether in Rails and PostgreSQL
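The posted table definitions do not show any index on created, so the rewritten predicate only gets index support if such an index exists. A sketch of both pieces for the innermost subquery; the index name is hypothetical, and whether the planner picks the index depends on selectivity:

```sql
-- Hypothetical index; not part of the DDL in the question.
CREATE INDEX ganado_fundo_created_idx ON agroapp.ganado_fundo (created);

-- Same rows as "gf.created::DATE <= '20181030'::DATE",
-- but the bare column can be matched against the index.
SELECT gf.ganado_id, MAX(gf.created) AS created
FROM   agroapp.ganado_fundo gf
WHERE  gf.isactive = 'Y'
AND    gf.created < '2018-10-31'::timestamp
GROUP  BY gf.ganado_id;
```

The same change would apply to gtg.created in agroapp.ganado_tipo_ganado.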
How to group by an attribute and order by date
I have two tables:
Medics
CREATE TABLE "medic" (
    "id" BIGINT NOT NULL,
    "name" CHARACTER VARYING(255) NOT NULL,
    PRIMARY KEY ("id")
);
Comments
CREATE TABLE IF NOT EXISTS "comment" (
    "id" BIGINT NOT NULL,
    "medic_id" BIGINT NOT NULL,
    "comment" CHARACTER VARYING(1024) NOT NULL,
    "created_at" TIMESTAMP WITHOUT TIME ZONE NOT NULL DEFAULT now(),
    CONSTRAINT pk_comment PRIMARY KEY (id),
    CONSTRAINT fk_comment_medic FOREIGN KEY (medic_id)
        REFERENCES medic(id)
        ON UPDATE NO ACTION ON DELETE NO ACTION
);
Now I want to get medic_id, name and comments_count, all ordered by created_at.
Here's what I've tried so far:
SELECT m.id, m.name, COUNT(c.id)
FROM COMMENT AS c
JOIN medic AS m ON m.id = c.medic_id
GROUP BY m.id, m.name, c.created_at
ORDER BY c.created_at DESC
But obviously this can't work, because it makes no sense to group by date, although I have to do it when I want to order by date.
Another approach was to work with window functions, particularly rank() over (partition by m.id order by c.created_at desc). But in this case I lose the ordering over all records.
Here's some SQLFiddle. I am using Postgres 9.3.
I'm guessing you want to order by the most recent comment date:
SELECT m.id, m.name, COUNT(c.id)
FROM COMMENT c
JOIN medic m ON m.id = c.medic_id
GROUP BY m.id, m.name
ORDER BY MAX(c.created_at) DESC;
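The inner join above leaves out medics that have no comments at all. If those should appear too (an assumption, the question does not say), a LEFT JOIN variant could look like this sketch; the alias comments_count is taken from the wording of the question:

```sql
SELECT m.id, m.name, count(c.id) AS comments_count
FROM   medic m
LEFT   JOIN "comment" c ON c.medic_id = m.id
GROUP  BY m.id, m.name
-- Medics without comments have max(created_at) = NULL; sort them last.
ORDER  BY max(c.created_at) DESC NULLS LAST;
```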
Alter command is taking long time, but not executing in PostgreSQL
The below ALTER command is taking long time, but not executing.
alter table DETAILS alter column row_id type numeric(20);
DDL is as follows:
CREATE TABLE Details (
    row_id numeric(15,0) NOT NULL,
    intfid character varying(20) NOT NULL,
    seqno numeric(15,0) NOT NULL,
    record_id numeric(15,0) NOT NULL,
    lstmoddate timestamp without time zone NOT NULL,
    rcvddate timestamp without time zone NOT NULL DEFAULT current_date,
    record_type character varying(60),
    xmldata bytea,
    CONSTRAINT mrd_pk PRIMARY KEY (rcvddate, intfid, seqno, record_id)
)