I have a PostgreSQL function that counts the usage of "items" by users.
Counter values are saved in a table:
users_items
user_id - integer (fk)
item_id - integer (fk)
counter - integer
There is max. 1 counter per user per item (unique key).
Here is my function:
CREATE OR REPLACE FUNCTION increment_favorite_user_item (itemid integer, userid integer) RETURNS integer AS
$BODY$
DECLARE
new_count integer; -- Usage counter
BEGIN
IF NOT EXISTS(SELECT 1 FROM users_items WHERE user_id = userid AND item_id = itemid)
THEN
INSERT INTO users_items ("user_id", "item_id", "counter") VALUES (userid, itemid, 1); -- First usage - create new counter
new_count := 1;
ELSE
UPDATE users_items SET counter = counter + 1 WHERE (user_id = userid AND item_id = itemid); -- Increment counter
SELECT counter INTO new_count FROM users_items WHERE (user_id = userid AND item_id = itemid);
END IF;
RETURN new_count;
END;
$BODY$
LANGUAGE 'plpgsql'
VOLATILE;
It is used by an application, which may call it multiple times.
Everything works fine until we call the function twice in quick succession for the same user and item, when the item is new for that user (the record in the users_items table does not exist yet).
For the second function call, I get a unique violation: "Key (user_id, item_id)=(1, 7912) already exists".
It seems the "if not exists" check doesn't work properly: the second function call doesn't see the record inserted by the first one and tries to insert the same row, making the unique check fail.
What can I do to solve the problem?
Every function call runs in another transaction.
a) There is a race condition; b) you should LOCK the table if you want to ensure that the INSERT succeeds:
DECLARE rc int;
BEGIN
LOCK TABLE users IN SHARE ROW EXCLUSIVE MODE;
UPDATE users SET counter = counter + 1 WHERE user_id = $1;
GET DIAGNOSTICS rc = ROW_COUNT;
IF rc = 0 THEN
INSERT INTO users(user_id, counter) VALUES($1, 1);
END IF;
END;
Or, with more complex code but less locking:
DECLARE rc int;
BEGIN
-- fast path
UPDATE users SET counter = counter + 1 WHERE user_id = $1;
GET DIAGNOSTICS rc = ROW_COUNT;
IF rc = 0 THEN
LOCK TABLE users IN SHARE ROW EXCLUSIVE MODE;
UPDATE users SET counter = counter + 1 WHERE user_id = $1;
GET DIAGNOSTICS rc = ROW_COUNT;
IF rc = 0 THEN
INSERT INTO users(user_id, counter) VALUES($1, 1);
END IF;
END IF;
END;
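On PostgreSQL 9.5 or later, the check-then-insert race can also be avoided without a table lock by letting the unique key arbitrate. The following is only a sketch, assuming the users_items table from the question with its unique key on (user_id, item_id); the function name increment_user_item_upsert is hypothetical and chosen so it does not clash with the existing function.

```sql
-- Hypothetical name; assumes a unique key on users_items (user_id, item_id).
CREATE OR REPLACE FUNCTION increment_user_item_upsert(itemid integer, userid integer)
RETURNS integer AS
$BODY$
    -- Try to create the counter at 1; if the row already exists,
    -- increment it instead. Both cases happen in one atomic statement.
    INSERT INTO users_items AS ui (user_id, item_id, counter)
    VALUES (userid, itemid, 1)
    ON CONFLICT (user_id, item_id)
    DO UPDATE SET counter = ui.counter + 1
    RETURNING ui.counter;  -- the counter value after this call
$BODY$
LANGUAGE sql VOLATILE;
```

Two concurrent calls for the same new pair then serialize on the unique index: one inserts the row, and the other waits and takes the DO UPDATE branch instead of raising a unique violation.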
Related
I'm trying to keep track of a client's database with which we sync. I need to record records_added (INSERTs) and records_updated (UPDATEs) to our table.
I'm using an UPSERT to handle the sync, and a trigger to update a table keeping track of inserts/updates.
The issue is counting records that have been updated. I have 40+ columns to check; do I have to put all of these in my check logic? Is there a more elegant way?
Section of code in question:
select
case
when old.uuid = new.uuid
and (
old.another_field != new.another_field
or old.and_another_field != new.and_another_field
-- many more columns here << This is particularly painful
) then 1
else 0
end into update_count;
Reproducible example:
-- create tables
CREATE TABLE IF NOT EXISTS example (uuid serial primary key, another_field int, and_another_field int);
CREATE TABLE IF NOT EXISTS tracker_table (
records_added integer DEFAULT 0,
records_updated integer DEFAULT 0,
created_at date unique
);
-- create function
CREATE OR REPLACE FUNCTION update_records_inserted () RETURNS TRIGGER AS $body$
DECLARE update_count INT;
DECLARE insert_count INT;
BEGIN
-- ---------------- START OF BLOCK IN QUESTION -----------------
select
case
when old.uuid = new.uuid
and (
old.another_field != new.another_field
-- many more columns here
) then 1
else 0
end into update_count;
-- ------------------ END OF BLOCK IN QUESTION ------------------
-- count INSERTs
select
case
when old.uuid is null
and new.uuid is not null then 1
else 0
end into insert_count;
-- --log the counts
-- raise notice 'update %', update_count;
-- raise notice 'insert %', insert_count;
-- insert or update count to tracker table
insert into
tracker_table(
created_at,
records_added,
records_updated
)
VALUES
(CURRENT_DATE, insert_count, update_count) ON CONFLICT (created_at) DO
UPDATE
SET
records_added = tracker_table.records_added + insert_count,
records_updated = tracker_table.records_updated + update_count;
RETURN NEW;
END;
$body$ LANGUAGE plpgsql;
-- Trigger
DROP TRIGGER IF EXISTS example_trigger ON example;
CREATE TRIGGER example_trigger
AFTER
INSERT
OR
UPDATE
ON example FOR EACH ROW EXECUTE PROCEDURE update_records_inserted ();
-- A query to insert, then update when number of uses > 1
insert into example(uuid, another_field, and_another_field) values (1, 2, 3) ON CONFLICT(uuid) DO UPDATE SET another_field=excluded.another_field+1;
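One possible way to avoid listing all 40+ columns is to compare the whole rows at once and to use TG_OP to tell inserts from updates. This is a sketch only, assuming the example and tracker_table definitions above; it redefines the question's update_records_inserted function and has not been tuned for the real 40-column table.

```sql
CREATE OR REPLACE FUNCTION update_records_inserted() RETURNS trigger AS $body$
DECLARE
    update_count int := 0;
    insert_count int := 0;
BEGIN
    IF TG_OP = 'INSERT' THEN
        insert_count := 1;
    ELSIF TG_OP = 'UPDATE' THEN
        -- Whole-row comparison: true if any column differs, and NULL-safe.
        -- Nested so that OLD is only referenced when it is defined.
        IF OLD IS DISTINCT FROM NEW THEN
            update_count := 1;
        END IF;
    END IF;

    INSERT INTO tracker_table (created_at, records_added, records_updated)
    VALUES (CURRENT_DATE, insert_count, update_count)
    ON CONFLICT (created_at) DO UPDATE
    SET records_added   = tracker_table.records_added + insert_count,
        records_updated = tracker_table.records_updated + update_count;

    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$body$ LANGUAGE plpgsql;
```

With this version, an upsert that takes the DO UPDATE branch but leaves every column unchanged is counted as neither an insert nor an update.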
I am trying to do an update on a specific record every 1000 rows using Postgres. I am looking for a better way to do that. My function is described below:
CREATE OR REPLACE FUNCTION update_row()
RETURNS void AS
$BODY$
declare
myUID integer;
nRow integer;
maxUid integer;
BEGIN
nRow:=1000;
select max(uid_atm_inp) from tab into maxUid where field1 = '1240200';
loop
if (nRow > 1000 and nRow < maxUid) then
select uid from tab into myUID where field1 = '1240200' and uid >= nRow limit 1;
update tab
set field = 'xxx'
where field1 = '1240200' and uid = myUID;
nRow:=nRow+1000;
end if;
end loop;
END; $BODY$
LANGUAGE plpgsql VOLATILE
How can I improve this procedure? I think there is something wrong. The loop does not end and takes too much time.
To perform this task in SQL, you could use the row_number window function and update only those rows where the number is divisible by 1000.
Your loop doesn't finish because there is no EXIT or RETURN in it.
I doubt you could ever rival the performance of a standard SQL update with a procedural loop. Instead of doing it a row at a time, just do it all as a single statement:
with t2 as (
select
uid, row_number() over (order by 1) as rn
from tab
where field1 = '1240200'
)
update tab t1
set field = 'xxx'
from t2
where
t1.uid = t2.uid and
mod (t2.rn, 1000) = 0
Per my comment, I am presupposing what you mean by "every 1000th row," since nothing in the question designates which tuple gets which row number. That is easily changed by editing the "order by" criteria.
Adding a second where clause on the update (t1.field1 = '1240200') can't hurt but might not be necessary if these are nested loop.
This might be notionally similar to what Laurenz has in mind.
I solved it this way:
declare
myUID integer;
nRow integer;
rowNum integer;
checkrow integer;
myString varchar(272);
cur_check_row cursor for select uid , row_number() over (order by 1) as rn, substr(fieldxx,1,244)
from tab where field1 = '1240200' and uid >= 1000 ORDER BY uid;
BEGIN
open cur_check_row;
loop
fetch cur_check_row into myUID, rowNum, myString;
EXIT WHEN NOT FOUND; -- stop once the cursor is exhausted
select mod(rowNum, 1000) into checkrow;
if checkrow = 0 then
update tab
set fieldxx= myString||'O'
where uid in (myUID);
end if;
end loop;
close cur_check_row;
END;
I'm trying to simulate the "nextval" function: I need the next value generated for a specific id. In some cases, the function returns the same value twice.
CREATE OR REPLACE FUNCTION public.nextvalue(character,integer)
RETURNS integer AS
$BODY$
DECLARE
p_id ALIAS FOR $1;
p_numero integer;
BEGIN
p_numero = (SELECT numero FROM "TnextValue" WHERE id = p_id FOR UPDATE);
IF p_numero is null THEN
INSERT INTO "TnextValue" (numero,id) VALUES (1,p_id);
p_numero = 1;
END IF;
UPDATE "TnextValue" SET numero = p_numero + 1 where id = p_id;
RETURN p_numero;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
I tried adding FOR UPDATE, but the problem persists. I am thinking of adding this line just above the SELECT statement:
LOCK TABLE "TnextValue" IN ROW EXCLUSIVE MODE;
But I think this line would lock the table and stop callers with other ids from obtaining their next value at the same time.
Thanks!
IF p_numero is null THEN
INSERT INTO "TnextValue" (numero,id) VALUES (1,p_id);
p_numero = 1;
END IF;
This is not going to work: there is no lock on that row because it does not exist yet, so if two callers want the same new sequence, one of them will get an error. Catch the unique violation and retry instead:
IF p_numero is null THEN
BEGIN
INSERT INTO "TnextValue" (numero,id) VALUES (1,p_id);
p_numero = 1;
EXCEPTION WHEN UNIQUE_VIOLATION THEN
RETURN public.nextvalue($1,$2);
END;
END IF;
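On PostgreSQL 9.5 or later, the read, the first-time insert and the increment can be folded into a single statement, which removes the window between them. This is a sketch under assumptions: "TnextValue" has a unique constraint on id (which the UNIQUE_VIOLATION handler above already implies), and the function name nextvalue_upsert is hypothetical.

```sql
-- Hypothetical name; assumes a unique constraint on "TnextValue"(id).
CREATE OR REPLACE FUNCTION public.nextvalue_upsert(p_id character)
RETURNS integer AS
$BODY$
    -- The table stores the NEXT value to hand out, as in the original:
    -- a first call stores 2 and returns 1; later calls bump the stored
    -- value by one and return what it was before the bump.
    INSERT INTO "TnextValue" AS t (id, numero)
    VALUES (p_id, 2)
    ON CONFLICT (id) DO UPDATE SET numero = t.numero + 1
    RETURNING t.numero - 1;
$BODY$
LANGUAGE sql VOLATILE;
```

Concurrent callers for the same id queue on that one row, while callers for other ids are not blocked, which was the concern with LOCK TABLE.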
My Check constraint is as follows:
ALTER TABLE tablename
ADD CONSTRAINT check_duplicate_rows
CHECK (reject_duplicate_rows(columnB, columnC, columnD) < 2);
I want the constraint to be evaluated only when you insert a record.
Currently it is evaluated for both INSERT and UPDATE statements. The problem is that my system needs to update the inserted rows, and the check constraint blocks the updates.
The reject_duplicate_rows function is as follows:
CREATE OR REPLACE FUNCTION reject_duplicate_rows(columnB integer, columnC integer, columnD integer)
RETURNS integer AS
$BODY$
DECLARE
results INTEGER := 1;
v_count INTEGER := 0;
BEGIN
IF columnC <> 23 THEN
RETURN results;
END IF;
SELECT total INTO v_count FROM
(SELECT columnB,
columnC,
columnD,
count(*) AS total
FROM table_name
WHERE B = columnB AND C = columnC AND D = columnD
GROUP BY 1, 2, 3)
as temp_table;
IF COALESCE(v_count, 0) = 0 THEN
RETURN results;
END IF;
IF v_count >= 1 THEN
results := 2;
END IF;
RETURN results;
EXCEPTION
WHEN OTHERS THEN
RETURN results;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 1000;
ALTER FUNCTION reject_duplicate_rows(integer, integer, integer)
OWNER TO postgres
Have you tried to create an UPDATE trigger? see Creating postgresql trigger
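A trigger can be limited to one kind of statement, which a CHECK constraint cannot. As a rough sketch of that idea, the trigger below runs the duplicate test on INSERT only, so UPDATEs are never blocked. It assumes the table is tablename with columns columnB, columnC and columnD as in the ALTER TABLE statement; the function and trigger names are hypothetical.

```sql
-- Hypothetical names; column names are taken from the ALTER TABLE above.
CREATE OR REPLACE FUNCTION reject_duplicate_on_insert() RETURNS trigger AS
$BODY$
BEGIN
    -- Same rule as reject_duplicate_rows: only rows with columnC = 23
    -- are checked for an existing row with the same three values.
    IF NEW.columnC = 23 AND EXISTS (
        SELECT 1
        FROM tablename t
        WHERE t.columnB = NEW.columnB
          AND t.columnC = NEW.columnC
          AND t.columnD = NEW.columnD
    ) THEN
        RAISE EXCEPTION 'duplicate row for (%, %, %)',
            NEW.columnB, NEW.columnC, NEW.columnD
            USING ERRCODE = 'unique_violation';
    END IF;
    RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql;

-- Fires for INSERT only, so UPDATE statements skip the check.
CREATE TRIGGER reject_duplicate_on_insert_trg
BEFORE INSERT ON tablename
FOR EACH ROW EXECUTE PROCEDURE reject_duplicate_on_insert();
```

Note that two concurrent transactions can both pass the EXISTS test before either commits, so this check alone does not guarantee uniqueness under concurrency.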
Everybody using MySQL knows:
SELECT SQL_CALC_FOUND_ROWS ..... FROM table WHERE ... LIMIT 5, 10;
and, right after, running this:
SELECT FOUND_ROWS();
How do I do this in PostgreSQL? So far, I have only found ways where I have to send the query twice...
No, there is not (at least not as of July 2007). I'm afraid you'll have to resort to:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT id, username, title, date FROM posts ORDER BY date DESC LIMIT 20;
SELECT count(*) AS total FROM posts;
END;
The isolation level needs to be SERIALIZABLE to ensure that the query does not see concurrent updates between the SELECT statements.
Another option you have, though, is to use a trigger to count rows as they're INSERTed or DELETEd. Suppose you have the following table:
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
poster TEXT,
title TEXT,
time TIMESTAMPTZ DEFAULT now()
);
INSERT INTO posts (poster, title) VALUES ('Alice', 'Post 1');
INSERT INTO posts (poster, title) VALUES ('Bob', 'Post 2');
INSERT INTO posts (poster, title) VALUES ('Charlie', 'Post 3');
Then, perform the following to create a table called post_count that contains a running count of the number of rows in posts:
-- Don't let any new posts be added while we're setting up the counter.
BEGIN;
LOCK TABLE posts;
-- Create and initialize our post_count table.
SELECT count(*) INTO TABLE post_count FROM posts;
-- Create the trigger function.
CREATE FUNCTION post_added_or_removed() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'DELETE' THEN
UPDATE post_count SET count = count - 1;
ELSIF TG_OP = 'INSERT' THEN
UPDATE post_count SET count = count + 1;
END IF;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
-- Call the trigger function any time a row is inserted or deleted.
CREATE TRIGGER post_added_or_removed_tgr
AFTER INSERT OR DELETE
ON posts
FOR EACH ROW
EXECUTE PROCEDURE post_added_or_removed();
COMMIT;
Note that this maintains a running count of all of the rows in posts. To keep a running count of certain rows, you'll have to tweak it:
SELECT count(*) INTO TABLE post_count FROM posts WHERE poster <> 'Bob';
CREATE FUNCTION post_added_or_removed() RETURNS TRIGGER AS $$
BEGIN
-- The IF statements are nested because OR does not short circuit.
IF TG_OP = 'DELETE' THEN
IF OLD.poster <> 'Bob' THEN
UPDATE post_count SET count = count - 1;
END IF;
ELSIF TG_OP = 'INSERT' THEN
IF NEW.poster <> 'Bob' THEN
UPDATE post_count SET count = count + 1;
END IF;
END IF;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
There is a simple way, but keep in mind that the following COUNT(*) window aggregate is applied to all rows that remain after WHERE and before LIMIT/OFFSET, so it may be costly:
SELECT
id,
"count" (*) OVER () AS cnt
FROM
objects
WHERE
id > 2
OFFSET 50
LIMIT 5
No, PostgreSQL doesn't try to count all relevant results when you only need 10 results. You need a separate COUNT to count all results.