Can we setup a table in postgres to always view the latest inserts first? - postgresql

Right now when I create a table and do a
select * from table
I always see the first insert rows first. I'd like to have my latest inserts displayed first. Is it possible to achieve with minimal performance impact?

I believe that Postgres uses an internal field called OID that can be sorted by. Try the following.
select *,OID from table order by OID desc;
There are some limitations to this approach as described in SQL, Postgres OIDs, What are they and why are they useful?
Apparently the OID sequence "does" wrap if it exceeds 4B 6. So in essence it's a global counter that can wrap. If it does wrap, some slowdown may start occurring when it's used and "searched" for unique values, etc.
See also https://wiki.postgresql.org/wiki/FAQ#What_is_an_OID.3F
NB - in more recent version of Postgres this could be deprecated ( https://www.postgresql.org/docs/8.4/static/runtime-config-compatible.html#GUC-DEFAULT-WITH-OIDS )
Although you should be able to create tables with OID even in the most recent version if done explicitly on table create as per https://www.postgresql.org/docs/9.5/static/sql-createtable.html
Although the behaviour you are observing in the CLI appears consistent, it isn't a standard and cannot be depended on. If you are regularly needing to manually see the most recently added rows on a specific table you could add a timestamp field or some other sortable field and perhaps even wrap the query into a stored function .. I guess the approach depends on your particular use case.

Related

How to add a column to a table on production PostgreSQL with zero downtime?

Here
https://stackoverflow.com/a/53016193/10894456
is an answer provided for Oracle 11g,
My question is the same:
What is the best approach to add a not null column with default value
in production oracle database when that table contain one million
records and it is live. Does it create any locks if we do the column
creation , adding default value and making it as not null in a single
statement?
but for PostgreSQL ?
This prior answer essentially answers your query.
Cross referencing the relevant PostgreSQL doc with the PostgreSQL sourcecode for AlterTableGetLockLevel mentioned in the above answer shows that ALTER TABLE ... ADD COLUMN will always obtain an ACCESS EXCLUSIVE table lock, precluding any other transaction from accessing the table for the duration of the ADD COLUMN operation.
This same exclusive lock is obtained for any ADD COLUMN variation; ie. it doesn't matter whether you add a NULL column (with or without DEFAULT) or have a NOT NULL with a default.
However, as mentioned in the linked answer above, adding a NULL column with no DEFAULT should be very quick as this operation simply updates the catalog.
In contrast, adding a column with a DEFAULT specifier necessitates a rewrite the entire table in PostgreSQL 10 or less.
This operation is likely to take a considerable time on your 1M record table.
According to the linked answer, PostgreSQL >= 11 does not require such a rewrite for adding such a column, so should perform more similarly to the no-DEFAULT case.
I should add that for PostgreSQL 11 and above, the ALTER TABLE docs note that table rewrites are only avoided for non-volatile DEFAULT specifiers:
When a column is added with ADD COLUMN and a non-volatile DEFAULT is specified, the default is evaluated at the time of the statement and the result stored in the table's metadata. That value will be used for the column for all existing rows. If no DEFAULT is specified, NULL is used. In neither case is a rewrite of the table required.
Adding a column with a volatile DEFAULT [...] will require the entire table and its indexes to be rewritten. [...] Table and/or index rebuilds may take a significant amount of time for a large table; and will temporarily require as much as double the disk space.

Alter a SELECT Query to Limit

I'm working on a 1M+ row table. The software that inserts the data sometimes tries to select all rows. If it tries to do that; It crashes.
I'm not able to modify the software so I'm trying to implement a fix on the Postgresql side.
I want Postgresql to limit SELECT query results that are coming from a special user to 1.
I tried to implement a RULE but haven't been able to do it with success. Any suggestions are welcome.
Br,
You could rename the table and create a view with the name of the table (selecting from the renamed table).
Then you can include a LIMIT clause in the view definition.
There is a chance you need an index. Let me give you a few scenarios
There is a unique constraint on one of the fields but no corresponding index. This way when you insert a record PostgreSQL has to scan the table to see if there is an existing record with the same value in that field.
Your software mimics unique field constraint. Before inserting a new record it scans the table for a record with the same value in one of the fields to check if such a record already exists. Index on the right field would definitely help.
You software wants to compute the next "id" value. In this case it runs SELECT MAX(id) in order to find the next available value. "id" needs an index.
Try to find out if indexing one of the table fields helps. You can also try to trace and analyze queries submitted to the server and see if those queries can benefit from indexing the table. You can enable query logging this way How to log PostgreSQL queries?
Another guess is that your software buffers all records before processing them. Reading 1M records into memory may crash it. Limiting fetchSize (e.g. if your software uses JDBC you could add defaultRowFetchSize connection parameter to the connection string) may help though I realize you may not have means to change the way the existing software fetches data from DB.

Does adding a null column to a postgres table cause a lock?

I think I read somewhere that running an ALTER TABLE foo ADD COLUMN baz text on a postgres database will not cause a read or write lock. Setting a default value causes locking, but allowing a null default prevents a lock.
I can't find this in the documentation, though. Can anyone point to a place that says, definitively, if this is true or not?
The different sorts of locks and when they're used are mentioned in the doc in
Table-level Locks. For instance, Postgres 11's ALTER TABLE may acquire a SHARE UPDATE EXCLUSIVE, SHARE ROW EXCLUSIVE, or ACCESS EXCLUSIVE lock.
Postgres 9.1 through 9.3 claimed to support two of the above three but actually forced Access Exclusive for all variants of this command. This limitation was lifted in Postgres 9.4 but ADD COLUMN remains at ACCESS EXCLUSIVE by design.
It's easy to check in the source code because there's a function dedicated to establishing the lock level needed for this command in various cases: AlterTableGetLockLevel in src/backend/commands/tablecmds.c.
Concerning how much time the lock is held, once acquired:
When the column's default value is NULL, the column's addition should be very quick because it doesn't need a table rewrite: it's only an update in the catalog.
When the column has a non-NULL default value, it depends on PostgreSQL version: with version 11 or newer, there is no immediate rewriting of all the rows, so it should be as fast as the NULL case. But with version 10 or older, the table is entirely rewritten, so it may be quite expensive depending on the table's size.
Adding new null column will lock the table for very very short time since no need to rewrite all data on disk. While adding column with default value requires PostgreSQL to make new versions of all rows and store them on the disk. And during that time table will be locked.
So when you need to add column with default value to big table it's recommended to add null value first and then update all rows in small portions. This way you'll avoid high load on disk and allow autovacuum to do it's job so you'll not end up doubling table size.
http://www.postgresql.org/docs/current/static/sql-altertable.html#AEN57290
"Adding a column with a non-null default or changing the type of an existing column will require the entire table and indexes to be rewritten."
So the documentation only specifies when the table is not rewritten.
There will always be a lock, but it will be very short in case the table is not to be rewritten.

Query rows by time of creation?

I have a table that contains no date or time related fields. Still I want to query that table based on when records/rows were created. Is there a way to do this in PostgreSQL?
I prefer an answer about doing it in PostgreSQL directly. But if that's not possible, can hibernate do it for PostgreSQL?
Basically: no. There is no automatic timestamp for rows in PostgreSQL.
I usually add a column like this to my tables (ignoring time zones):
ALTER TABLE tbl ADD COLUMN log_in timestamp DEFAULT localtimestamp NOT NULL;
As long as you don't manipulate the values in that column, you got your creation timestamp. You can add a trigger and / or restrict write privileges to avoid tempering with the values.
Second class options
If you have a serial column, you could at least tell with some probability in what order rows were entered. That's not 100% reliable, because the values can be changed by hand, and applications can get values from the sequence and INSERT out of order.
If you created your table WITH (OIDS=TRUE), then the OID column could be some indication - unless your database is heavily written and / or very old, then you may have gone through OID wrap-around and later rows can have a smaller OID. That's one of the reasons, why this feature is hardly used any more.
The default depends on the setting of default_with_oids I quote the manual:
The parameter is off by default; in PostgreSQL 8.0 and earlier, it was
on by default.
If you have not updated your rows or went through a dump / restore cycle, or ran VACUUM FULL or CLUSTER or .. , a plain SELECT * FROM tbl returns all rows in the order they were entered. But this is very unreliable and implementation-dependent. PostgreSQL (like any RDBMS) does not guarantee any order without an ORDER BY clause.

PostgreSQL v7.4 ALTER TABLE to change column

I have a need to change the length of CHAR columns in tables in a PostgreSQL v7.4 database. This version did not support the ability to directly change the column type or size using the ALTER TABLE statement. So, directly altering a column from a CHAR(10) to CHAR(20) for instance isn't possible (yeah, I know, "use varchars", but that's not an option in my current circumstance). Anyone have any advice/tricks on how to best accomplish this? My initial thoughts:
-- Save the table's data in a new "save" table.
CREATE TABLE save_data AS SELECT * FROM table_to_change;
-- Drop the columns from the first column to be changed on down.
ALTER TABLE table_to_change DROP column_name1; -- for each column starting with the first one that needs to be modified
ALTER TABLE table_to_change DROP column_name2;
...
-- Add the columns back, using the new size for the CHAR column
ALTER TABLE table_to_change ADD column_name1 CHAR(new_size); -- for each column dropped above
ALTER TABLE table_to_change ADD column_name2...
-- Copy the data bace from the "save" table
UPDATE table_to_change
SET column_name1=save_data.column_name1, -- for each column dropped/readded above
column_name2=save_date.column_name2,
...
FROM save_data
WHERE table_to_change.primary_key=save_data.primay_key;
Yuck! Hopefully there's a better way? Any suggestions appreciated. Thanks!
Not PostgreSQL, but in Oracle I have changed a column's type by:
Add a new column with a temporary name (ie: TMP_COL) and the new data type (ie: CHAR(20))
run an update query: UPDATE TBL SET TMP_COL = OLD_COL;
Drop OLD_COL
Rename TMP_COL to OLD_COL
I would dump the table contents to a flat file with COPY, drop the table, recreate it with the correct column setup, and then reload (with COPY again).
http://www.postgresql.org/docs/7.4/static/sql-copy.html
Is it acceptable to have downtime while performing this operation? Obviously what I've just described requires making the table unusable for a period of time, how long depends on the data size and hardware you're working with.
Edit: But COPY is quite a bit faster than INSERTs and UPDATEs. According to the docs you can make it even faster by using BINARY mode. BINARY makes it less compatible with other PGSQL installs but you won't care about that because you only want to load the data to the same instance that you dumped it from.
The best approach to your problem is to upgrade pg to something less archaic :)
Seriously. 7.4 is going to be removed from "supported versions" pretty soon, so I wouldn't wait for it to happen with 7.4 in production.