Is there a similar function in postgresql for mysql's SQL_CALC_FOUND_ROWS? - postgresql

Everybody using MySQL knows:
SELECT SQL_CALC_FOUND_ROWS ..... FROM table WHERE ... LIMIT 5, 10;
and right after runs this:
SELECT FOUND_ROWS();
How do I do this in PostgreSQL? So far, I have only found ways where I have to send the query twice...

No, there is not (at least not as of July 2007). I'm afraid you'll have to resort to:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT id, username, title, date FROM posts ORDER BY date DESC LIMIT 20;
SELECT count(*) AS total FROM posts;
END;
The isolation level needs to be SERIALIZABLE to ensure that the second SELECT does not see concurrent updates committed between the two statements.
Another option you have, though, is to use a trigger to count rows as they're INSERTed or DELETEd. Suppose you have the following table:
CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    poster TEXT,
    title TEXT,
    time TIMESTAMPTZ DEFAULT now()
);
INSERT INTO posts (poster, title) VALUES ('Alice', 'Post 1');
INSERT INTO posts (poster, title) VALUES ('Bob', 'Post 2');
INSERT INTO posts (poster, title) VALUES ('Charlie', 'Post 3');
Then, perform the following to create a table called post_count that contains a running count of the number of rows in posts:
-- Don't let any new posts be added while we're setting up the counter.
BEGIN;
LOCK TABLE posts;
-- Create and initialize our post_count table.
SELECT count(*) INTO TABLE post_count FROM posts;
-- Create the trigger function.
CREATE FUNCTION post_added_or_removed() RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        UPDATE post_count SET count = count - 1;
    ELSIF TG_OP = 'INSERT' THEN
        UPDATE post_count SET count = count + 1;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
-- Call the trigger function any time a row is inserted or deleted.
CREATE TRIGGER post_added_or_removed_tgr
    AFTER INSERT OR DELETE
    ON posts
    FOR EACH ROW
    EXECUTE PROCEDURE post_added_or_removed();
COMMIT;
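From then on, the total is available with a cheap single-row read (a minimal sketch; count is the column created by the SELECT ... INTO above):
SELECT count FROM post_count;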
Note that this maintains a running count of all of the rows in posts. To keep a running count of certain rows, you'll have to tweak it:
SELECT count(*) INTO TABLE post_count FROM posts WHERE poster <> 'Bob';
CREATE FUNCTION post_added_or_removed() RETURNS TRIGGER AS $$
BEGIN
    -- The IF statements are nested because OR does not short circuit.
    IF TG_OP = 'DELETE' THEN
        IF OLD.poster <> 'Bob' THEN
            UPDATE post_count SET count = count - 1;
        END IF;
    ELSIF TG_OP = 'INSERT' THEN
        IF NEW.poster <> 'Bob' THEN
            UPDATE post_count SET count = count + 1;
        END IF;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

There is a simple way, but keep in mind that the following COUNT(*) OVER () window function is applied to all rows that pass the WHERE clause, before LIMIT/OFFSET (which may be costly):
SELECT
    id,
    count(*) OVER () AS cnt
FROM objects
WHERE id > 2
LIMIT 5
OFFSET 50;

No, PostgreSQL doesn't count all matching rows when you only ask for 10 results. You need a separate COUNT to count all results.
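In practice that means two statements per page, for example (table and column names here are only illustrative):
-- fetch one page of results
SELECT id, title FROM posts ORDER BY date DESC LIMIT 10 OFFSET 20;
-- fetch the total separately
SELECT count(*) AS total FROM posts;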

Related

How can I write a trigger that gets the last inserted row into the table?

I want to populate a field is_continued_post if some conditions are true about the previously inserted row in the table (it's the same user, and its inserted_at is less than N minutes before the new row's inserted_at).
When a new comment is inserted into the database, I want to get the last comment (with the same post_id) that was inserted, then check that the old row's user_id is the same as the new row's user_id, and that the old row was inserted less than 2 minutes before the new row. If this is true, I want to flip a boolean on the new row to true before inserting it.
Is this possible with Postgresql triggers? Or is there a better way to do this?
This is what I've come up with so far:
CREATE OR REPLACE FUNCTION update_message_cont()
RETURNS trigger AS $$
BEGIN
    old := (SELECT m0.user_id, m0.inserted_at FROM messages AS m0 WHERE (m0.post_id = NEW.post_id) ORDER BY m0.inserted_at DESC LIMIT 1);
    NEW.is_continued := CASE
        WHEN old IS NULL THEN FALSE
        WHEN old.user_id = NEW.user_id AND ((NEW.inserted_at - old.inserted_at) < 120) THEN TRUE
    END;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Yes, that is possible, but only if you have a column in the table that allows you to identify the last inserted row. The order of insertion is not reflected in the table as such.
So introduce a column
inserted_at timestamp with time zone DEFAULT clock_timestamp() NOT NULL
An index on (post_id, inserted_at) will make the query fast.
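For example (the index name is only illustrative):
CREATE INDEX messages_post_id_inserted_at_idx ON messages (post_id, inserted_at);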
The whole trigger could look like:
CREATE FUNCTION update_message_cont() RETURNS trigger AS
$$BEGIN
    SELECT user_id IS NOT DISTINCT FROM NEW.user_id INTO NEW.is_continued
    FROM messages
    WHERE post_id = NEW.post_id
      AND inserted_at > NEW.inserted_at - INTERVAL '120 seconds'
    ORDER BY inserted_at DESC
    LIMIT 1;
    -- if no previous row was found:
    IF NEW.is_continued IS NULL THEN
        NEW.is_continued := FALSE;
    END IF;
    RETURN NEW;
END;$$ LANGUAGE plpgsql;
CREATE TRIGGER update_message_cont BEFORE INSERT ON messages
FOR EACH ROW EXECUTE PROCEDURE update_message_cont();
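For a quick sanity check, assuming messages has at least the post_id, user_id and is_continued columns from the question plus the inserted_at column introduced above, something like this should show the expected flags:
-- all values are illustrative
INSERT INTO messages (post_id, user_id) VALUES (1, 42);  -- first message in the post: is_continued = false
INSERT INTO messages (post_id, user_id) VALUES (1, 42);  -- same user within 2 minutes: is_continued = true
INSERT INTO messages (post_id, user_id) VALUES (1, 99);  -- different user: is_continued = false
SELECT user_id, is_continued FROM messages ORDER BY inserted_at;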

Update table every 1000 rows

I am trying to do an update on a specific record every 1000 rows using Postgres. I am looking for a better way to do that. My function is described below:
CREATE OR REPLACE FUNCTION update_row()
RETURNS void AS
$BODY$
declare
    myUID integer;
    nRow integer;
    maxUid integer;
BEGIN
    nRow := 1000;
    select max(uid_atm_inp) from tab into maxUid where field1 = '1240200';
    loop
        if (nRow > 1000 and nRow < maxUid) then
            select uid from tab into myUID where field1 = '1240200' and uid >= nRow limit 1;
            update tab
            set field = 'xxx'
            where field1 = '1240200' and uid = myUID;
            nRow := nRow + 1000;
        end if;
    end loop;
END; $BODY$
LANGUAGE plpgsql VOLATILE
How can I improve this procedure? I think there is something wrong. The loop does not end and takes too much time.
To perform this task in SQL, you could use the row_number window function and update only those rows where the number is divisible by 1000.
Your loop doesn't finish because there is no EXIT or RETURN in it.
I doubt you could ever rival the performance of a standard SQL update with a procedural loop. Instead of doing it a row at a time, just do it all as a single statement:
with t2 as (
    select
        uid, row_number() over (order by 1) as rn
    from tab
    where field1 = '1240200'
)
update tab t1
set field = 'xxx'
from t2
where
    t1.uid = t2.uid and
    mod(t2.rn, 1000) = 0
Per my comment, I am presupposing what you mean by "every 1000th row," since there is no indication of how to determine which tuple is which row number; that is easily changed by editing the "order by" criteria.
Adding a second where clause on the update (t1.field1 = '1240200') can't hurt but might not be necessary if the plan uses a nested loop; see the sketch below.
This might be notionally similar to what Laurenz has in mind.
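For illustration, the same statement with that extra predicate added would look roughly like this:
with t2 as (
    select uid, row_number() over (order by 1) as rn
    from tab
    where field1 = '1240200'
)
update tab t1
set field = 'xxx'
from t2
where
    t1.uid = t2.uid and
    t1.field1 = '1240200' and   -- the optional extra predicate
    mod(t2.rn, 1000) = 0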
I solved it this way:
declare
    myUID integer;
    nRow integer;
    rowNum integer;
    checkrow integer;
    myString varchar(272);
    cur_check_row cursor for
        select uid, row_number() over (order by 1) as rn, substr(fieldxx, 1, 244)
        from table where field1 = '1240200' and uid >= 1000 ORDER BY uid;
BEGIN
    open cur_check_row;
    loop
        fetch cur_check_row into myUID, rowNum, myString;
        EXIT WHEN NOT FOUND;
        select mod(rowNum, 1000) into checkrow;
        if checkrow = 0 then
            update table
            set fieldxx = myString || 'O'
            where uid in (myUID);
        end if;
    end loop;
    close cur_check_row;
end;

Postgresql same date record insert

I am using a PostgreSQL 9.5 database. A third-party software application is also using this database. I have a Features table. I created an Events table to record Features events.
Features
------------
id name lon lat
1 x 0 10
2 y 15 20
When I create a record in the Features table, my trigger inserts a record into the Events table.
Events
id name date feature_id
1 insert 09.04.2018 14:22:23.065125 1
When I update the Features name, lon and lat and save it, the software's execution results in 3 update records at the same time.
Events
id name date feature_id
1 insert 09.04.2018 14:22:23.065125 1
2 update 09.04.2018 18:15:41.099689 1
3 update 09.04.2018 18:15:41.099689 1
4 update 09.04.2018 18:15:41.099689 1
But these 3 updates all contain the same values.
How can I restrict this in my trigger?
My trigger function:
CREATE FUNCTION event_fn() RETURNS trigger AS $BODY$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO events (event_name, event_date, feature_id) VALUES ('insert', now(), NEW.id);
        RETURN NEW;
    END IF;
    IF TG_OP = 'UPDATE' THEN
        INSERT INTO events (event_name, event_date, feature_id) VALUES ('update', now(), NEW.id);
        RETURN NEW;
    END IF;
    IF TG_OP = 'DELETE' THEN
        INSERT INTO events (event_name, event_date, feature_id) VALUES ('delete', now(), OLD.id);
        RETURN OLD;
    END IF;
END;
$BODY$ LANGUAGE plpgsql;
The best solution would be to fix the software so that it performs a single update instead of several. However, if you cannot do that, you can add a trigger on the events table, e.g.:
create or replace function before_insert_on_events()
returns trigger language plpgsql as $$
begin
    if exists (
        select 1
        from events e
        where e.event_name = new.event_name
          and e.event_date = new.event_date
          and e.feature_id = new.feature_id)
    then
        new := null;
    end if;
    return new;
end $$;
create trigger before_insert_on_events
before insert on events
for each row
execute procedure before_insert_on_events();

How to optimize postgresql procedure

I have 61 million non-unique emails with statuses.
These emails need to be deduplicated, with logic based on status.
I wrote a stored procedure, but it runs too long.
How can I optimize the execution time of this procedure?
CREATE OR REPLACE FUNCTION public.load_oxy_emails() RETURNS boolean AS $$
DECLARE
    row record;
    rec record;
    new_id int;
BEGIN
    FOR row IN SELECT * FROM oxy_email ORDER BY id LOOP
        SELECT * INTO rec FROM oxy_emails_clean WHERE email = row.email;
        IF rec IS NOT NULL THEN
            IF row.status = 3 THEN
                UPDATE oxy_emails_clean SET status = 3 WHERE id = rec.id;
            END IF;
        ELSE
            INSERT INTO oxy_emails_clean(id, email, status) VALUES(nextval('oxy_emails_clean_id_seq'), row.email, row.status);
            SELECT currval('oxy_emails_clean_id_seq') INTO new_id;
            INSERT INTO oxy_emails_clean_websites_relation(oxy_emails_clean_id, website_id) VALUES(new_id, row.website_id);
        END IF;
    END LOOP;
    RETURN true;
END;
$$
LANGUAGE plpgsql;
How can I optimize the execution time of this procedure?
Don't do it with a loop.
Row-by-row processing (also known as "slow-by-slow") is almost always a lot slower than doing bulk changes, where a single statement processes a lot of rows in one go.
The change of the status can easily be done using a single statement:
update oxy_emails_clean oec
SET status = 3
from oxy_email oe
where oe.id = oec.id
and oe.status = 3;
The copying of the rows can be done using a chain of CTEs:
with to_copy as (
    select *
    from oxy_email
    where status <> 3 --<< all those that have a different status
), clean_inserted as (
    INSERT INTO oxy_emails_clean (id, email, status)
    select nextval('oxy_emails_clean_id_seq'), email, status
    from to_copy
    returning id, email
)
insert into oxy_emails_clean_websites_relation (oxy_emails_clean_id, website_id)
select ci.id, tc.website_id
from clean_inserted ci
    -- join back on email: the newly generated ids differ from the ids in oxy_email
    join to_copy tc on tc.email = ci.email;

postgres count from table efficient way

In my application we are using PostgreSQL, and it now has one million records in the summary table.
When I run the following query it takes 80,927 ms
SELECT COUNT(*) AS count
FROM summary_views
GROUP BY question_id, category_type_id
Is there any efficient way to do this?
COUNT(*) in PostgreSQL tends to be slow; it's a consequence of MVCC. One workaround is a row-counting trigger with a helper table:
create table table_count(
    table_count_id text primary key,
    rows int default 0
);

CREATE OR REPLACE FUNCTION table_count_update()
RETURNS trigger AS
$BODY$
begin
    if tg_op = 'INSERT' then
        update table_count set rows = rows + 1
        where table_count_id = TG_TABLE_NAME;
    elsif tg_op = 'DELETE' then
        update table_count set rows = rows - 1
        where table_count_id = TG_TABLE_NAME;
    end if;
    return null;
end;
$BODY$
LANGUAGE plpgsql VOLATILE;
The next step is to add a proper trigger declaration for each table you'd like to use it with. For example, for table tab_name:
begin;
insert into table_count values
('tab_name',(select count(*) from tab_name));
create trigger tab_name_table_count after insert or delete
on tab_name for each row execute procedure table_count_update();
commit;
It is important to run this in a transaction block to keep the actual count and the helper table in sync in case of a delete or insert between the initial count and the trigger creation; the transaction guarantees this. From now on, to get the current count instantly, just invoke:
select rows from table_count where table_count_id = 'tab_name';
Edit: Because of your GROUP BY clause, you'll need a more sophisticated trigger function and count table.
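A per-group variant of the same idea might look roughly like this; the summary_views column types are assumptions, and the ON CONFLICT clause requires PostgreSQL 9.5 or later:
begin;
lock table summary_views;

-- one counter row per (question_id, category_type_id) group
create table summary_views_count(
    question_id int,
    category_type_id int,
    rows int default 0,
    primary key (question_id, category_type_id)
);

-- initialize the counters from the current contents of summary_views
insert into summary_views_count (question_id, category_type_id, rows)
select question_id, category_type_id, count(*)
from summary_views
group by question_id, category_type_id;

create or replace function summary_views_count_update()
returns trigger as
$BODY$
begin
    if tg_op = 'INSERT' then
        insert into summary_views_count (question_id, category_type_id, rows)
        values (new.question_id, new.category_type_id, 1)
        on conflict (question_id, category_type_id)
        do update set rows = summary_views_count.rows + 1;
    elsif tg_op = 'DELETE' then
        update summary_views_count
        set rows = rows - 1
        where question_id = old.question_id
          and category_type_id = old.category_type_id;
    end if;
    return null;
end;
$BODY$
language plpgsql volatile;

create trigger summary_views_count_tgr after insert or delete
on summary_views for each row execute procedure summary_views_count_update();

commit;
The per-group totals can then be read directly from summary_views_count instead of running the slow GROUP BY count.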