In my application we are using PostgreSQL; the summary table now has one million records.
When I run the following query, it takes 80,927 ms:
SELECT COUNT(*) AS count
FROM summary_views
GROUP BY question_id,category_type_id
Is there a more efficient way to do this?
COUNT(*) in PostgreSQL tends to be slow. It's a consequence of MVCC: every row has to be visited to check its visibility. One of the workarounds is a row-counting trigger backed by a helper table:
create table table_count(
    table_count_id text primary key,
    rows int default 0
);

CREATE OR REPLACE FUNCTION table_count_update()
RETURNS trigger AS
$BODY$
begin
    if tg_op = 'INSERT' then
        update table_count set rows = rows + 1
        where table_count_id = TG_TABLE_NAME;
    elsif tg_op = 'DELETE' then
        update table_count set rows = rows - 1
        where table_count_id = TG_TABLE_NAME;
    end if;
    return null;
end;
$BODY$
LANGUAGE plpgsql VOLATILE;
The next step is to add the trigger declaration for each table you'd like to use it with. For example, for table tab_name:
begin;
insert into table_count values
('tab_name',(select count(*) from tab_name));
create trigger tab_name_table_count after insert or delete
on tab_name for each row execute procedure table_count_update();
commit;
It is important to run this in a transaction block so the actual count and the helper table stay in sync even if rows are inserted or deleted between the initial count and the trigger creation; the transaction guarantees this. From now on, to get the current count instantly, just invoke:
select rows from table_count where table_count_id = 'tab_name';
Edit: To match your GROUP BY clause, you'll need a more sophisticated trigger function and count table.
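A minimal sketch of such a grouped counter for your case, assuming summary_views has integer question_id and category_type_id columns (the helper table group_count and the function group_count_update are illustrative names, and INSERT ... ON CONFLICT requires PostgreSQL 9.5+):
create table group_count(
    question_id int,
    category_type_id int,
    rows int not null default 0,
    primary key (question_id, category_type_id)
);

CREATE OR REPLACE FUNCTION group_count_update()
RETURNS trigger AS
$BODY$
begin
    if tg_op = 'INSERT' then
        -- create the group on first sight, otherwise bump its counter
        insert into group_count (question_id, category_type_id, rows)
        values (NEW.question_id, NEW.category_type_id, 1)
        on conflict (question_id, category_type_id)
        do update set rows = group_count.rows + 1;
    elsif tg_op = 'DELETE' then
        update group_count set rows = rows - 1
        where question_id = OLD.question_id
          and category_type_id = OLD.category_type_id;
    end if;
    return null;
end;
$BODY$
LANGUAGE plpgsql VOLATILE;

CREATE TRIGGER summary_views_group_count AFTER INSERT OR DELETE
ON summary_views FOR EACH ROW EXECUTE PROCEDURE group_count_update();

-- the slow GROUP BY then becomes a plain read:
select question_id, category_type_id, rows from group_count;
As with the single-table counter, populate group_count from the existing data inside the same transaction that creates the trigger.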
Related
A trigger works on the first part of a function but not the second.
I'm trying to set up a trigger that does two things:
Update a field - geom - whenever the fields lat or lon are updated, using those two fields.
Update a field - country - from the geom field by referencing another table.
I've tried different syntaxes of using NEW, OLD, BEFORE and AFTER conditions, but whatever I do, I can only get the first part to work.
Here's the code:
CREATE OR REPLACE FUNCTION update_geometries()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
update schema.table a set geom = st_setsrid(st_point(a.lon, a.lat), 4326);
update schema.table a set country = b.name
from reference.admin_layers_0 b where st_intersects(a.geom,b.geom)
and a.pk = new.pk;
RETURN NEW;
END;
$$;
CREATE TRIGGER
geom_update
AFTER INSERT OR UPDATE of lat,lon on
schema.table
FOR EACH STATEMENT EXECUTE PROCEDURE update_geometries();
There is no NEW in a statement-level trigger (well, there is, but it is always NULL).
You can either keep the statement-level trigger and update the entire table (i.e. remove the and a.pk = new.pk), or, if only part of the rows are updated, change it to a row-level trigger and update only the affected row:
CREATE OR REPLACE FUNCTION update_geometries()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
    NEW.geom = st_setsrid(st_point(NEW.lon, NEW.lat), 4326);
    SELECT b.name
    INTO NEW.country
    FROM reference.admin_layers_0 b
    WHERE st_intersects(NEW.geom, b.geom);
    RETURN NEW;
END;
$$;

CREATE TRIGGER geom_update
BEFORE INSERT OR UPDATE OF lat, lon ON schema.table
FOR EACH ROW EXECUTE PROCEDURE update_geometries();
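For completeness, a sketch of the first option: keep the statement-level trigger and drop the per-row filter, updating the whole table on every statement (this replaces the row-level version above; schema.table and the column names are the question's placeholders, and it will be expensive on large tables):
CREATE OR REPLACE FUNCTION update_geometries()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
    -- no NEW at statement level: recompute geom for every row
    UPDATE schema.table a
    SET geom = st_setsrid(st_point(a.lon, a.lat), 4326);

    -- then refresh country for every row from the reference layer
    UPDATE schema.table a
    SET country = b.name
    FROM reference.admin_layers_0 b
    WHERE st_intersects(a.geom, b.geom);

    RETURN NULL;  -- the return value of an AFTER statement-level trigger is ignored
END;
$$;

CREATE TRIGGER geom_update
AFTER INSERT OR UPDATE OF lat, lon ON schema.table
FOR EACH STATEMENT EXECUTE PROCEDURE update_geometries();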
I want to create a trigger function which copies certain columns of a recently updated row and deletes the old data. After that I want to insert the copied columns into exactly the same table, in the same row (overwrite). I need the data to be INSERTed because this function will be embedded in an existing program with predefined triggers.
That's what I have so far:
CREATE OR REPLACE FUNCTION update_table()
RETURNS TRIGGER AS
$func$
BEGIN
WITH tmp AS (DELETE FROM table
WHERE table.id = NEW.id
RETURNING id, geom )
INSERT INTO table (id, geom) SELECT * FROM tmp;
END;
$func$ language plpgsql;
CREATE TRIGGER T_update
AFTER UPDATE OF geom ON table
EXECUTE PROCEDURE update_table();
But I get the error message:
ERROR: cannot perform DELETE RETURNING on relation "table"
HINT: You need an unconditional ON DELETE DO INSTEAD rule with a RETURNING clause.
Why should I use a rule here?
I'm using PostgreSQL 9.6
UPDATE:
A little bit of clarification: my table has two columns (id, geom). After I update geom, I want to make a copy of this (new) row and insert it into the same table, overwriting the updated row. (I'm not interested in any value from before the update.) I know this is odd, but I need the row to be inserted again because the program I embed this function in listens for an INSERT statement and cannot be changed by me.
Right after you update a row, its old values will no longer be available. So, if you simply want to preserve the old row in case of an update you need to create a BEFORE UPDATE trigger, so that you can still access the OLD values and create a new row, e.g.
CREATE TABLE t (id int, geom geometry(point,4326));
CREATE OR REPLACE FUNCTION update_table() RETURNS TRIGGER AS $$
BEGIN
    INSERT INTO t (id, geom) VALUES (OLD.id, OLD.geom);
    RETURN NEW;
END; $$ LANGUAGE plpgsql;
CREATE TRIGGER t_update
BEFORE UPDATE OF geom ON t FOR EACH ROW EXECUTE PROCEDURE update_table();
INSERT INTO t VALUES (1,'SRID=4326;POINT(1 1)');
If you update the record 1 ..
UPDATE t SET geom = 'SRID=4326;POINT(2 2)', id = 2 WHERE id = 1;
UPDATE t SET geom = 'SRID=4326;POINT(3 3)', id = 3 WHERE id = 2;
.. you get a new record in the same table, as you wished:
SELECT id, ST_AsText(geom) FROM t;
id | st_astext
----+------------
1 | POINT(1 1)
2 | POINT(2 2)
3 | POINT(3 3)
Demo: db<>fiddle
Unrelated note: consider upgrading your PostgreSQL version! 9.6 will reach EOL in November, 2021.
First, thanks to @JimJones for the answer. I'd like to post his answer modified for this purpose. This code "overwrites" the updated row by inserting a copy of itself and then deleting the old duplicate. That way I can trigger on INSERT.
CREATE TABLE t (Unique_id SERIAL,id int, geom geometry(point,4326));
CREATE OR REPLACE FUNCTION update_table() RETURNS TRIGGER AS $$
BEGIN
    INSERT INTO t (id, geom) VALUES (NEW.id, NEW.geom);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_update
BEFORE UPDATE OF geom ON t FOR EACH ROW EXECUTE PROCEDURE update_table();

CREATE OR REPLACE FUNCTION delete_table() RETURNS TRIGGER AS $$
BEGIN
    DELETE FROM t a
    USING t b
    WHERE a.Unique_id < b.Unique_id
      AND a.geom = b.geom;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_delete
AFTER UPDATE OF geom ON t FOR EACH ROW EXECUTE PROCEDURE delete_table();
INSERT INTO t VALUES (1,1,'SRID=4326;POINT(1 1)');
UPDATE t SET geom = 'SRID=4326;POINT(2 2)' WHERE id = 1;
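Querying the table after that update should show only the re-inserted copy, the original row having been removed as a duplicate (expected output, assuming the serial unique_id starts at 1):
SELECT unique_id, id, ST_AsText(geom) FROM t;

 unique_id | id | st_astext
-----------+----+------------
         2 |  1 | POINT(2 2)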
Here is what I'm trying to do:
ALTER TABLE publishroomcontacts ADD COLUMN IF NOT EXISTS contactorder integer NOT NULL default 1;
CREATE OR REPLACE FUNCTION publishroomcontactorder() RETURNS trigger AS $publishroomcontacts$
BEGIN
IF (TG_OP = 'INSERT') THEN
with newcontactorder as (SELECT contactorder FROM publishroomcontacts WHERE publishroomid = NEW.publishroomid ORDER BY contactorder limit 1)
NEW.contactorder = (newcontactorder + 1);
END IF;
RETURN NEW;
END;
$publishroomcontacts$ LANGUAGE plpgsql;
CREATE TRIGGER publishroomcontacts BEFORE INSERT OR UPDATE ON publishroomcontacts
FOR EACH ROW EXECUTE PROCEDURE publishroomcontactorder();
I've been looking at a lot of examples and they all look like this, though most of them are a couple of years old. Has this changed, or why doesn't NEW work? And do I have to do the insert in the function, or does Postgres do the insert with the returned NEW object after the function is done?
I'm not sure what you're trying to do, but your syntax is wrong here:
with newcontactorder as (SELECT contactorder FROM publishroomcontacts WHERE publishroomid = NEW.publishroomid ORDER BY contactorder limit 1)
NEW.contactorder = (newcontactorder + 1);
Do not use a CTE if there is no SELECT that comes afterwards. If you want to increment the contactorder column for a particular publishroomid whenever a new row is added, and this is your sequence (auto-increment) mechanism, then you should replace it with:
NEW.contactorder = COALESCE((
    SELECT max(contactorder) + 1
    FROM publishroomcontacts
    WHERE publishroomid = NEW.publishroomid
), 1);
Note the changes:
there's no CTE, just an assignment with a SELECT query
the MAX() aggregate function is used instead of ORDER BY + LIMIT
it is wrapped in COALESCE(x,1) so the first contact for a room is inserted properly; COALESCE returns 1 if the query returns NULL
Your trigger should look like this
CREATE OR REPLACE FUNCTION publishroomcontactorder() RETURNS trigger AS $publishroomcontacts$
BEGIN
    IF (TG_OP = 'INSERT') THEN
        NEW.contactorder = COALESCE((
            SELECT max(contactorder) + 1
            FROM publishroomcontacts
            WHERE publishroomid = NEW.publishroomid
        ), 1);
    END IF;
    RETURN NEW;
END;
$publishroomcontacts$ LANGUAGE plpgsql;
Postgres will insert the row itself; you don't have to do anything, because RETURN NEW hands the (possibly modified) row back to the INSERT.
This solution does not take care of concurrent inserts, which makes it unsafe in a multi-user environment! You can work around this by performing an UPSERT!
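A rough sketch of one such protection (not a full UPSERT; the constraint name is made up for illustration): a unique constraint on (publishroomid, contactorder) turns a silent duplicate into a unique_violation error that the application can catch and retry:
ALTER TABLE publishroomcontacts
    ADD CONSTRAINT publishroomcontacts_room_order_uq
    UNIQUE (publishroomid, contactorder);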
WITH is not an assignment in PL/pgSQL.
PL/pgSQL interprets the line as an SQL statement, but that is invalid SQL because the WITH clause is followed by NEW.contactorder rather than a SELECT or another CTE.
Hence the error; it has nothing to do with NEW as such.
You probably want something like
SELECT contactorder INTO newcontactorder
FROM publishroomcontacts
WHERE publishroomid = NEW.publishroomid
ORDER BY contactorder DESC -- you want the biggest one, right?
LIMIT 1;
You'll have to declare newcontactorder in the DECLARE section.
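Put together, a minimal sketch of the whole function with the DECLARE section (a guess at the full shape, reusing the question's table and function names):
CREATE OR REPLACE FUNCTION publishroomcontactorder() RETURNS trigger AS $publishroomcontacts$
DECLARE
    newcontactorder integer;
BEGIN
    IF (TG_OP = 'INSERT') THEN
        SELECT contactorder INTO newcontactorder
        FROM publishroomcontacts
        WHERE publishroomid = NEW.publishroomid
        ORDER BY contactorder DESC
        LIMIT 1;

        -- newcontactorder stays NULL for the first contact of a room
        NEW.contactorder = COALESCE(newcontactorder, 0) + 1;
    END IF;
    RETURN NEW;
END;
$publishroomcontacts$ LANGUAGE plpgsql;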
Warning: If there are two concurrent inserts, they might end up with the same newcontactorder.
I have a test table with three columns (file, qty, qty_total). I will input multiple rows, for example: insert into test_table (file, qty) VALUES ('A', 5);. What I want is for a trigger to take the value from qty and add it to qty_total, because the value will later get updated, as this example demonstrates: update test_table set qty = 10 where file = 'A'; so the qty_total is now 15. Thanks
Managed to solve this myself. I created a trigger function:
CREATE FUNCTION public.qty_total()
    RETURNS trigger
    LANGUAGE plpgsql
    COST 100.0
    VOLATILE NOT LEAKPROOF
AS $BODY$
BEGIN
    IF TG_OP = 'UPDATE' THEN
        NEW."total" := (OLD.total + NEW.col2);
        RETURN NEW;
    ELSE
        NEW."total" := NEW.col2;
        RETURN NEW;
    END IF;
END;
$BODY$;

ALTER FUNCTION public.qty_total()
    OWNER TO postgres;
This is called by a trigger:
CREATE TRIGGER qty_trigger
    BEFORE INSERT OR UPDATE
    ON public.test
    FOR EACH ROW
    EXECUTE PROCEDURE qty_total();
Now when I insert a new record with a value, the value is copied to the total; when it is updated, the value is added to the total and I have my new qty_total. This may not have the best error catching in it, but since I am passing the data from PHP, I am happy to make sure the errors are caught and removed.
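For reference, a minimal usage sketch. The function as posted references columns col2 and total, so the demo table below uses those names (a guess at the actual schema; rename them to qty/qty_total if you adapt the function to the original table):
CREATE TABLE public.test (file text, col2 numeric, total numeric);

-- BEFORE INSERT: total is set to col2
INSERT INTO public.test (file, col2) VALUES ('A', 5);

-- BEFORE UPDATE: the new col2 is added to the old total (5 + 10 = 15)
UPDATE public.test SET col2 = 10 WHERE file = 'A';

SELECT file, col2, total FROM public.test;
--  file | col2 | total
-- ------+------+-------
--  A    |   10 |    15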
Everybody using MySQL knows:
SELECT SQL_CALC_FOUND_ROWS ..... FROM table WHERE ... LIMIT 5, 10;
and right after that, run this:
SELECT FOUND_ROWS();
How do I do this in PostgreSQL? So far, I have found only ways where I have to send the query twice...
No, there is not (at least not as of July 2007). I'm afraid you'll have to resort to:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT id, username, title, date FROM posts ORDER BY date DESC LIMIT 20;
SELECT count(*) AS total FROM posts;
END;
The isolation level needs to be SERIALIZABLE to ensure that the query does not see concurrent updates between the SELECT statements.
Another option you have, though, is to use a trigger to count rows as they're INSERTed or DELETEd. Suppose you have the following table:
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
poster TEXT,
title TEXT,
time TIMESTAMPTZ DEFAULT now()
);
INSERT INTO posts (poster, title) VALUES ('Alice', 'Post 1');
INSERT INTO posts (poster, title) VALUES ('Bob', 'Post 2');
INSERT INTO posts (poster, title) VALUES ('Charlie', 'Post 3');
Then, perform the following to create a table called post_count that contains a running count of the number of rows in posts:
-- Don't let any new posts be added while we're setting up the counter.
BEGIN;
LOCK TABLE posts;
-- Create and initialize our post_count table.
SELECT count(*) INTO TABLE post_count FROM posts;
-- Create the trigger function.
CREATE FUNCTION post_added_or_removed() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'DELETE' THEN
UPDATE post_count SET count = count - 1;
ELSIF TG_OP = 'INSERT' THEN
UPDATE post_count SET count = count + 1;
END IF;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
-- Call the trigger function any time a row is inserted.
CREATE TRIGGER post_added_or_removed_tgr
AFTER INSERT OR DELETE
ON posts
FOR EACH ROW
EXECUTE PROCEDURE post_added_or_removed();
COMMIT;
Note that this maintains a running count of all of the rows in posts. To keep a running count of certain rows, you'll have to tweak it:
SELECT count(*) INTO TABLE post_count FROM posts WHERE poster <> 'Bob';
CREATE FUNCTION post_added_or_removed() RETURNS TRIGGER AS $$
BEGIN
-- The IF statements are nested because OR does not short circuit.
IF TG_OP = 'DELETE' THEN
IF OLD.poster <> 'Bob' THEN
UPDATE post_count SET count = count - 1;
END IF;
ELSIF TG_OP = 'INSERT' THEN
IF NEW.poster <> 'Bob' THEN
UPDATE post_count SET count = count + 1;
END IF;
END IF;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
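Reading the running count is then a plain lookup (the column is named count because that is what count(*) produced in the SELECT INTO above):
SELECT count FROM post_count;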
There is a simple way, but keep in mind that the following count(*) window function is applied to all rows that match the WHERE clause, before LIMIT/OFFSET (which may be costly):
SELECT
    id,
    count(*) OVER () AS cnt
FROM objects
WHERE id > 2
OFFSET 50
LIMIT 5;
No, PostgreSQL doesn't count all matching rows when you only need 10 results. You need a separate COUNT to count all results.
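In practice that means sending two queries that share the same WHERE clause, for example against the objects table used in the window-function answer above:
-- the page of results
SELECT id FROM objects WHERE id > 2 OFFSET 50 LIMIT 5;

-- the total number of matching rows, as a separate query
SELECT count(*) FROM objects WHERE id > 2;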