What is the equivalent of a static cursor in postgres - postgresql

A STATIC CURSOR in MSSQL will create a temporary table with the selected data and use that.
Is there an equivalent way to do this in postgres? What happens when a row you've selected in the cursor gets deleted in the underlying table prior to the cursor reaching it?

Related

Ambiguous column reference "ctid" in SELECT with more than one table

I'm using CRecordset to query one table, but I use a second table to filter data. If in my GetDefaultSQL override method I return a table list with more than one table then I get this ERROR: column reference "ctid" is ambiguous. I know what a "ctid" column is, but I don't use it in my code. It's inserted into the original SQL statement by ODBC driver. How to fix this? How to tell the ODBC driver not to insert the "ctid" column?
I tried to call CRecordset::Open with readOnly parameter, as I assume that ODBC needs ctid to update the row, and I don't need to update them. But the error remains.
Also tried to add a primary key to the second table that was missing it, thinking if a table has a primary key then ODBC can use that instead of 'ctid', but again no luck. Makes sense though, because I don't fetch any column of that second table, and the second table is used just for filtering.
If I make a DB view to work around the issue, I get ERROR: column "ctid" does not exist.
You have to call CRecordset::Open with two parameters changed:
m_pSet->Open(CRecordset::snapshot, NULL, CRecordset::readOnly);
Then you can fetch both the joined tables and the view without errors. No "ctid" then.

Can "Insert Trigger For Each Row After Each Statement" use index with the newly added values?

I am using Postgres 9.3.
I just added a trigger to a table.
It is an after insert trigger which is executed for each row after each statement.
I coded the trigger function assuming the index of the same table contains the newly added rows.
If this is not true, mass inserts will slow down significantly.
I google it a bit but couldn't find an answer.
So, to sum up my questions is after a statement, is index updated before or after the "after insert trigger for each statement" in Postgres 9.3?
Here is the trigger definition I've used:
CREATE TRIGGER trigger_name
AFTER INSERT OR UPDATE
ON table_name
FOR EACH STATEMENT
EXECUTE PROCEDURE trigger_funtion();
An AFTER trigger FOR EACH ROW will see that row in the table. For that to happen reliably the row must have already been added to any indexes. So the index has been updated.
However, if you attempt to modify the table that caused the AFTER trigger to be fired within the AFTER trigger, this usually results in an infinite loop and an error. It is rarely the correct thing to do.
Usually when you're trying to do that, you actually want a BEFORE trigger that modifies the row before it is saved.
If you need to modify some other row in the same table, that often suggests a data model problem. You should very rarely, if ever, need to modify one row in a table using a trigger when a different row is modified.

Implications of using ADD COLUMN on large dataset

Docs for Redshift say:
ALTER TABLE locks the table for reads and writes until the operation completes.
My question is:
Say I have a table with 500 million rows and I want to add a column. This sounds like a heavy operation that could lock the table for a long time - yes? Or is it actually a quick operation since Redshift is a columnar db? Or it depends if column is nullable / has default value?
I find that adding (and dropping) columns is a very fast operation even on tables with many billions of rows, regardless of whether there is a default value or it's just NULL.
As you suggest, I believe this is a feature of the it being a columnar database so the rest of the table is undisturbed. It simply creates empty (or nearly empty) column blocks for the new column on each node.
I added an integer column with a default to a table of around 65M rows in Redshift recently and it took about a second to process. This was on a dw2.large (SSD type) single node cluster.
Just remember you can only add a column to the end (right) of the table, you have to use temporary tables etc if you want to insert a column somewhere in the middle.
Personally I have seen rebuilding the table works best.
I do it in following ways
Create a new table N_OLD_TABLE table
Define the datatype/compression encoding in the new table
Insert data into N_OLD(old_columns) select(old_columns) from old_table Rename OLD_Table to OLD_TABLE_BKP
Rename N_OLD_TABLE to OLD_TABLE
This is a much faster process. Doesn't block any table and you always have a backup of old table incase anything goes wrong

ADD COLUMN with DEFAULT value to a huge table

I have a postgresql DB and a table with almost billion of rows.
when I try to add a new column with default value:
ALTER TABLE big_table
ADD COLUMN some_flag integer NOT NULL DEFAULT 0;
The transaction goes on for 30+ min .. and the DB logs starts to shoots warnings.
Any way to optimize the query ?
Besides doing it in batches (which will still take a while):
You could dump the table as COPY statements and write a script to edit the contents of the COPY statements to insert another column (COPY can be CSV IIRC).
Then you just reload your altered COPY dump and it should in theory be faster than the ALTER because COPY will not log transactions.
The other option is to turn off fsync while you run the command... just remember to turn it back on.
You can also do both of the above in batches.
Starting from PostgreSQL 11 this behaviour will change.
Waiting for PostgreSQL 11 – Fast ALTER TABLE ADD COLUMN with a non-NULL default:
So, for the longest time, when you did:
alter table x add column z text;
it was virtually instantaneous. Get a lock on table, add information about new column to system catalogs, and it's done.
But when you tried:
alter table x add column z text default 'some value';
then it took long time. How long it did depend on size of table.
This was because postgresql was actually rewriting the whole table, adding the column to each row, and filling it with default value.
"What happens if you want to set the column to NOT NULL also? Are we back to the slow version in that case or does this handle that as well?"
not null doesn’t change anything. it is a constraint for new rows. so adding a column with “not null default ‘xxx'” will be fast.
I'd consider creating the column without the default and manually updating the rows in batches with intermittent commits to apply the default.

PostgreSQL v7.4 ALTER TABLE to change column

I have a need to change the length of CHAR columns in tables in a PostgreSQL v7.4 database. This version did not support the ability to directly change the column type or size using the ALTER TABLE statement. So, directly altering a column from a CHAR(10) to CHAR(20) for instance isn't possible (yeah, I know, "use varchars", but that's not an option in my current circumstance). Anyone have any advice/tricks on how to best accomplish this? My initial thoughts:
-- Save the table's data in a new "save" table.
CREATE TABLE save_data AS SELECT * FROM table_to_change;
-- Drop the columns from the first column to be changed on down.
ALTER TABLE table_to_change DROP column_name1; -- for each column starting with the first one that needs to be modified
ALTER TABLE table_to_change DROP column_name2;
...
-- Add the columns back, using the new size for the CHAR column
ALTER TABLE table_to_change ADD column_name1 CHAR(new_size); -- for each column dropped above
ALTER TABLE table_to_change ADD column_name2...
-- Copy the data bace from the "save" table
UPDATE table_to_change
SET column_name1=save_data.column_name1, -- for each column dropped/readded above
column_name2=save_date.column_name2,
...
FROM save_data
WHERE table_to_change.primary_key=save_data.primay_key;
Yuck! Hopefully there's a better way? Any suggestions appreciated. Thanks!
Not PostgreSQL, but in Oracle I have changed a column's type by:
Add a new column with a temporary name (ie: TMP_COL) and the new data type (ie: CHAR(20))
run an update query: UPDATE TBL SET TMP_COL = OLD_COL;
Drop OLD_COL
Rename TMP_COL to OLD_COL
I would dump the table contents to a flat file with COPY, drop the table, recreate it with the correct column setup, and then reload (with COPY again).
http://www.postgresql.org/docs/7.4/static/sql-copy.html
Is it acceptable to have downtime while performing this operation? Obviously what I've just described requires making the table unusable for a period of time, how long depends on the data size and hardware you're working with.
Edit: But COPY is quite a bit faster than INSERTs and UPDATEs. According to the docs you can make it even faster by using BINARY mode. BINARY makes it less compatible with other PGSQL installs but you won't care about that because you only want to load the data to the same instance that you dumped it from.
The best approach to your problem is to upgrade pg to something less archaic :)
Seriously. 7.4 is going to be removed from "supported versions" pretty soon, so I wouldn't wait for it to happen with 7.4 in production.