I am writing a Sybase stored procedure with a cursor that fetches records from a table and inserts records back into the same table. Surprisingly, I found that the fetches see and return records inserted in the same loop. Under some conditions this results in an endless loop; under others the loop ends after inserting a number of extra records. Looking through the Sybase documentation I could not find a cure. Transaction isolation does not help, since we are acting inside a single transaction. Of course, I could solve the problem by inserting into a temporary table in the loop and then inserting back into the main table after the loop ends.
But the question remains: how can I isolate cursor fetches from inserts into the same table?
Pseudocode follows:
create table t (v int)
go
insert into t (v) values (1)
go
create procedure p as
begin
    declare @v int
    declare c cursor for select v from t
    begin transaction
    open c
    while 1 = 1 begin
        fetch c into @v
        if (@@sqlstatus != 0) break
        insert into t (v) values (@v + 1)
    end
    close c
    commit
end
go
exec p
go
I was surprised by this behavior, too. Cursors only iterate over rows; they are not immune to changes made during the loop. The manual states this:
A searched or positioned update on an allpages-locked table can change
the location of the row; for example, if it updates key columns of a
clustered index. The cursor does not track the row; it remains
positioned just before the next row at the original location.
Positioned updates are not allowed until a subsequent fetch returns
the next row. The updated row may be visible to the cursor a second
time, if the row moves to a later position in the search order.
My solution was to insert into a temporary table and copy the results back to the main table at the end; this also sped up the process by a factor of about 10.
Pseudo-code:
select * into #results from OriginalTable where 1 <> 1  -- create an empty temp table with the same columns
WHILE fetch...
BEGIN
    insert into #results
    select -your-results-here
END
insert into OriginalTable select * from #results
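Applied to the procedure from the question, a minimal sketch of that workaround could look like this (p2 and #queue are names I made up):
create procedure p2 as
begin
    declare @v int
    declare c cursor for select v from t

    select v into #queue from t where 1 <> 1  -- empty temp table with the same column

    begin transaction
    open c
    while 1 = 1 begin
        fetch c into @v
        if (@@sqlstatus != 0) break
        insert into #queue (v) values (@v + 1)  -- invisible to cursor c
    end
    close c

    insert into t (v) select v from #queue  -- copy back once the loop has ended
    commit

    drop table #queue
end
go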
Related
I have a million-row table in Postgres 13 that needs a one-time update of each row: the (golang) script will read the current column value for each row, transform it, then update the row with the new value, for example:
DECLARE c1 CURSOR FOR SELECT v FROM users;
FETCH c1;
-- read and transform v
UPDATE users SET v = ? WHERE CURRENT OF c1;
-- transaction committed
FETCH c1;
...
I'm familiar with cursors for reading, but have a few requirements for writing that I'm struggling to find the right settings for:
I don't want it all to run in a single huge transaction, which is the default with cursors, since the change set will be large and it will take a while. I'd rather each update be its own transaction, and I can re-run the idempotent script again if it fails for any reason. I'm aware of DECLARE WITH HOLD to have the cursor span transactions, but...
By default, the data read by the cursor is "insensitive" (a snapshot from when the cursor was first created), but I would like the latest data for each row at FETCH time, in case there has been a subsequent update. The solution to that is to use FOR UPDATE in the cursor query to make it "sensitive", but that is not allowed together with WITH HOLD. I would prefer the row lock you get with FOR UPDATE, to prevent a read-then-write race condition between FETCH and UPDATE, but it's not mandatory.
How can I iterate all rows and update them one at a time without having to read everything into memory first?
Make the cursor WITH HOLD, but select the primary key rather than v. Then, in the loop, select the now-current v from the table by primary key (rather than with WHERE CURRENT OF), and update the row by primary key as well.
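A minimal sketch in plain SQL, assuming a users(id, v) table with primary key id; $1 and $2 stand for values bound by the Go script:
BEGIN;
DECLARE c1 CURSOR WITH HOLD FOR SELECT id FROM users ORDER BY id;
COMMIT;  -- WITH HOLD: the cursor survives the commit

-- then, per row, each iteration its own short transaction:
FETCH c1;                                      -- yields the next id
BEGIN;
SELECT v FROM users WHERE id = $1 FOR UPDATE;  -- re-read the latest v, locking the row
-- transform v in the client
UPDATE users SET v = $2 WHERE id = $1;
COMMIT;

CLOSE c1;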
I have a job which runs every night to load changes into a temporary table and apply those changes to the main table.
CREATE TEMP TABLE IF NOT EXISTS tmp AS SELECT * FROM mytable LIMIT 0;
COPY tmp FROM PROGRAM '';
11 SQL queries to update 'mytable' based on data from 'tmp'
I have a large number of queries to delete duplicates from tmp, update values in tmp, update values in the main table and insert new rows into the main table. Is it possible to loop over both tables using plpgsql instead?
UPDATE mytable m
SET "Field" = t."Field" +1
FROM tmp t
WHERE (t."ID" = m."ID");
In this example, it is simple change of a column value. Instead, I want to do more complex operations on both the main table as well as the temp table.
EDIT: so here is some PSEUDO code of what I imagine.
LOOP tmp t, mytable m
BEGIN
-- operation in plpgsql including UPDATE, INSERT, DELETE
END
WHERE t.ID = m.ID;
You can use a plpgsql FOR loop to iterate over query results, for example in a DO block:
DO $$
DECLARE
    myrow RECORD;
BEGIN
    FOR myrow IN SELECT * FROM table1 JOIN table2 USING (id)
    LOOP
        -- ... do something with the row ...
    END LOOP;
END
$$;
If you want to update a table while looping over it, you can create a FOR UPDATE cursor, but that won't work if the query is a join, because then you're not opening an update cursor on a table.
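For the single-table case, a minimal sketch with an explicit cursor, reusing the tmp table and "Field" column from the question:
DO $$
DECLARE
    cur CURSOR FOR SELECT * FROM tmp FOR UPDATE;  -- one table, so FOR UPDATE is allowed
    r RECORD;
BEGIN
    OPEN cur;
    LOOP
        FETCH cur INTO r;
        EXIT WHEN NOT FOUND;
        UPDATE tmp
        SET "Field" = r."Field" + 1
        WHERE CURRENT OF cur;  -- positioned update through the open cursor
    END LOOP;
    CLOSE cur;
END
$$;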
Note that writing to or updating temp tables is much faster than writing to normal tables, because temp tables have no WAL or crash-recovery overhead, and they are owned by one single connection, so you don't have to worry about locks.
If you put a query inside the loop, it will be executed many times though, which could get pretty slow. It's usually faster to use bulk queries, even if they're complicated.
If you want to UPDATE many rows in the temp table with values that depend on other tables and joins, it could be faster to run several updates on the temp table with different join and WHERE conditions.
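For instance, the dedup and one enrichment step from the question could each be a single set-based pass over the temp table (keeping the lowest ctid is just one arbitrary way to pick a survivor):
-- pass 1: delete duplicates from tmp, keeping one row per "ID"
DELETE FROM tmp t
USING tmp d
WHERE d."ID" = t."ID"
  AND d.ctid < t.ctid;

-- pass 2: pull values in from the main table
UPDATE tmp t
SET "Field" = m."Field" + 1
FROM mytable m
WHERE m."ID" = t."ID";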
I accidentally deleted most of the rows in my Postgres table (the data is not important, it's my test environment, but I need dummy data to be inserted into these tables).
Let us take three tables:
MAIN_TABLE(main_table_id, main_fields)
ADDRESS_TABLE(address_table_id, main_table_id, address_type, other_fielsds)
CHAID_TABLE(chaid_table_id, main_table_id, shipping_address_id, chaild_fields)
I accidentally deleted most of the data from ADDRESS_TABLE.
ADDRESS_TABLE has a foreign key to MAIN_TABLE, i.e. main_table_id. For each row in MAIN_TABLE there are two entries in ADDRESS_TABLE: one whose address_type is "billing/default" and one whose address_type is "shipping".
CHAID_TABLE has two foreign keys: one to MAIN_TABLE, i.e. main_table_id, and one to ADDRESS_TABLE, i.e. shipping_address_id. This shipping_address_id is the id of the ADDRESS_TABLE row whose address_type is "shipping" and for which ADDRESS_TABLE.main_table_id = CHAID_TABLE.main_table_id.
These are the things that I need:
I need to create two dummy address entries for each row in MAIN_TABLE, one of address type "billing/default" and the other of type "shipping".
I need to insert address_table_id into CHAID_TABLE where ADDRESS_TABLE.main_table_id = CHAID_TABLE.main_table_id and address_type = 'shipping'.
If the first is done, I know how to do the second, because it is a simple update query, I guess.
It can be done like:
UPDATE CHAID_TABLE
SET shipping_address_id = ADDRESS_TABLE.address_table_id
FROM ADDRESS_TABLE
WHERE ADDRESS_TABLE.main_table_id = CHAID_TABLE.main_table_id
AND ADDRESS_TABLE.address_type = 'shipping';
For the first one I can use a loop in plpgsql, i.e. loop through all the entries in MAIN_TABLE and insert two dummy rows for each row. But I don't know how to do this; please help me solve it.
I hope this solution is what you want: create a function that loops through all the rows in MAIN_TABLE and, inside the loop, does the actions you want, here two insert statements. One issue with this solution is that you get the same data in every address.
CREATE OR REPLACE FUNCTION get_all_MAIN_TABLE() RETURNS SETOF MAIN_TABLE AS
$BODY$
DECLARE
    r MAIN_TABLE%ROWTYPE;
BEGIN
    FOR r IN
        SELECT * FROM MAIN_TABLE
    LOOP
        -- can do some processing here
        INSERT INTO ADDRESS_TABLE (main_table_id, address_type, other_fielsds)
        VALUES (r.main_table_id, 'shipping', 'dummy shipping address');
        INSERT INTO ADDRESS_TABLE (main_table_id, address_type, other_fielsds)
        VALUES (r.main_table_id, 'billing/default', 'dummy billing address');
        RETURN NEXT r;  -- stream the processed row back to the caller
    END LOOP;
    RETURN;
END
$BODY$
LANGUAGE plpgsql;
SELECT * FROM get_all_MAIN_TABLE();
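If the dummy values do not need per-row logic, the same thing can be done without a loop in a single set-based statement; a sketch using the column names from the question:
INSERT INTO ADDRESS_TABLE (main_table_id, address_type, other_fielsds)
SELECT m.main_table_id, a.address_type, 'dummy address'
FROM MAIN_TABLE m
CROSS JOIN (VALUES ('shipping'), ('billing/default')) AS a(address_type);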
I have an INSERT trigger in PostgreSQL that I'm trying to have join the inserted row on another table, and then insert the result in a third table. Let's call the original table, that the INSERT trigger fires on, "A"; the table the cursor joins A on "B"; and the table the trigger function then inserts to "C".
My thinking was that an AFTER INSERT function should allow me to pass a value from the "NEW" row as a parameter in order to reference its corresponding row in Table A, like this:
myCursor CURSOR (insertedKey A.key%TYPE) FOR
SELECT *
FROM A
INNER JOIN B
ON A.key=B.key
WHERE A.key=insertedKey;
...
OPEN myCursor (NEW.key);
FETCH NEXT FROM myCursor INTO row_C;
INSERT INTO C VALUES (row_C.*);
This gives me an empty cursor. If I trigger the trigger on AFTER UPDATE, it works, but with the old row from A. This leads me to think that PostgreSQL doesn't think AFTER INSERT/UPDATE means what I think it means.
Or maybe I'm just doing something wrong? Is there any way of doing what I'm trying to do?
Not sure why that happens, but you could do something along the lines of:
INSERT INTO C
SELECT NEW.*, B.*
FROM B
WHERE B.key = NEW.key
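Wrapped up as a complete trigger, a sketch might look like this (the function and trigger names are mine; EXECUTE FUNCTION needs PostgreSQL 11+, use EXECUTE PROCEDURE on older versions):
CREATE OR REPLACE FUNCTION copy_a_to_c() RETURNS trigger AS
$$
BEGIN
    INSERT INTO C
    SELECT NEW.*, B.*          -- NEW already carries the freshly inserted row of A
    FROM B
    WHERE B.key = NEW.key;
    RETURN NULL;               -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER a_after_insert
AFTER INSERT ON A
FOR EACH ROW
EXECUTE FUNCTION copy_a_to_c();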
What would the processing load concern be if I had an "After Insert" trigger created on a table and in that trigger I performed a While loop to iterate through "potentially" multiple rows?
The end result is that 99.999% of the time I will have only 1 row, but as the future is unpredictable, I also want to be able to handle multiple rows being inserted.
Trigger Model:
1) Insert information into the table
2) Create views specific to the client, via stored procedures (if possible)
What Say You? :)
I haven't fully developed it, but this is the design I am looking for; it may not be structurally sound, but it should get the point across.
CREATE TRIGGER dbo.New_Client_Setup
ON dbo.client
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @id int, @clnt nvarchar(10)

    -- Fill temp table
    SELECT * INTO #clients
    FROM inserted

    -- Iterate through temp table
    WHILE (SELECT COUNT(*) FROM #clients) <> 0
    BEGIN
        SELECT TOP (1)
            @id = id
            , @clnt = short
        FROM #clients
        ORDER BY id DESC

        EXECUTE dbo.sp_Create_View_Client @id, @clnt

        -- Drop used ID
        DELETE FROM #clients
        WHERE id = @id
    END

    DROP TABLE #clients
END
GO
Again, observe the design of the trigger, not necessarily the syntactic sugar.
Design-wise, reading the comments, I think you do not necessarily need to do this in a trigger. I would say you should do it as part of your insert statement, in a transaction - i.e. do the insert, and then do the loop that you want to do (whatever that does - execute dbo.sp_Create_View_Client)...
The second thing I would ask is what exactly dbo.sp_Create_View_Client does - is it strictly dependent on the insert? Meaning, what happens if the insert works fine and the trigger fails? I would do the whole insert and the execute of the SP in one transaction, so as to preserve data integrity.
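As a rough sketch of that idea, assuming the client(id, short) columns used in the trigger above (the sample values are made up):
DECLARE @id int = 42, @clnt nvarchar(10) = N'ACME'  -- hypothetical sample values

SET XACT_ABORT ON  -- a runtime error aborts and rolls back the open transaction

BEGIN TRANSACTION

INSERT INTO dbo.client (id, short)
VALUES (@id, @clnt)

EXECUTE dbo.sp_Create_View_Client @id, @clnt  -- if this fails, the insert rolls back too

COMMIT TRANSACTION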