tracing the cause of "could not open relation with OID" error - postgresql

Quite recently my PostgreSQL 8.2.4 logs such errors:
ERROR: could not open relation with OID nnnnnnnnn
CONTEXT: SELECT a,b,c FROM table_C
The error is always caused by the same scenario: an update to table A causes trigger to fire, which inserts data to table B, which fires another trigger, which (among many other things) does select on table C. That select on table C is then reported as CONTEXT of the problem above. The sequence of queries that cause the error message to appear is executed every day, and everyday it complains about the same OID missing.
Quite naturally the OID mentioned in error message doesn't exists when querying pg_class. Executing problematic SQL (that is, select on table C) doesn't cause any problems. I've tried to figure out the OIDs and connections between all the tables involved, to figure out where that reference to non-existent OID is, but i have failed. I started with table A and got its OID (pg_class.reltype) and verified, that it has trigger attached. The problems start when i query pg_trigger using pg_trigger.tgrelid = pg_class.reltype as a condition. The query yelds 0 rows, but when i query tables just by relname/tgname i get different OIDs, just like the trigger is on a different table. I did a quick test and it appears, that creating a simple table with a trigger on it produces the same result.
So my questions are:
How do i navigate pg_trigger (and other pg tables like pg_attribute, pg_shdepend) tables when i can locate table in pg_class?
If somehow I manage to find a reference to problematic OID, am I safe to simple remove the reference by doing direct updates/deletes on pg_class tables?

Note that 'reltype' is the OID of the table's rowtype- the OID of the table itself is pg_class.oid (which is a system column, so doesn't show up in \d or select * output, you need to select it explicitly).
Hopefully that will solve some mysteries of how the catalogue tables relate to each other! The same pattern is repeated with quite a few other tables using oid as their primary key.
It looks like quite a serious problem, possibly indicating some sort of catalogue corruption? You can modify pg_class et al directly, but obviously there is some risk involved in doing that. I can't think of much generic advice to give here- what to do will vary greatly depending on what you find.

If this appears while executing statements inside SQL function, then change the language from SQL to plpgsql. The cause can be a cached plan. plpgsql function invalidates plan between runs, while sql function seems to skip this step.

Verify your tables at backend, PostgreSQL assigns an unique OID for each table created.
Try to insert data at data base side then do it through application.
I was facing the same issue and struggled for long.

Related

Postgresql Corrupted pg_catalog table

I've been running a postgres database on an external hard drive and it appears it got corrupted after reconnecting it to a sleeping laptop that THOUGHT the server was still running. After running a bunch of reindex commands to fix some other errors I'm now getting the below error.
ERROR: missing chunk number 0 for toast value 12942 in pg_toast_2618
An example of a command that returns this error is:
select table_name, view_definition from INFORMATION_SCHEMA.views;
I've run the command "select 2618::regclass;" that gives you the problem table. However reindexing this doesn't seem to solve the problem. I see a lot of suggestions out there about finding the corrupted row and deleting it. However, the table that appears to have corruption in my instance is pg_rewrite and it appears to NOT be a corrupted row but a corrupted COLUMN.
I've run the following commands, but they aren't fixing the problem.
REINDEX table pg_toast.pg_toast_$$$$;
REINDEX table pg_catalog.pg_rewrite;
VACUUM ANALYZE pg_rewrite; -- just returns succeeded.
I can run the following SQL statement and it will return data.
SELECT oid, rulename, ev_class, ev_type, ev_enabled, is_instead, ev_qual FROM pg_rewrite;
However, if I add the ev_action column to the above query it throws a similar error of:
ERROR: missing chunk number 0 for toast value 11598 in pg_toast_2618
This error appears to affect all schema related queries to things like INFORMATION_SCHEMA tables. Luckily it seems as though all of my tables and data in my tables are fine but I cannot query the sql that generates those tables and any views I have created seem inaccessible (although I've noticed I can create new views).
I'm not familiar enough with Postgresql to know exactly what pg_rewrite is, but I'm guessing I can't just truncate the data in the table or set ev_action = null.
I'm not sure what to do next with the information I've gathered so far.
(At least) your pg_rewrite catalog has data corruption. This table contains the definition of all views, including system views that are necessary for the system to work.
The best thing to do is to restore a backup.
You won't be able to get the database back to work, the best you can do is to salvage as much of the data as you can.
Try a pg_dump. I don't know off-hand if that needs any views, but if it works, that is good. You will have to explicitly exclude all views from the dump, else it will most likely fail.
If that doesn't work, try to use COPY for each table to get at least the data out. The metadata will be more difficult.
If this is an important database, hire an expert.

PostgreSQL shows wrong sequence details in the table's information schema

Recently I saw a strange scenario with my PostgreSQL DB. The information schema of my database is showing a different sequence name than the one actually allocated for the column of my table.
The issue is:
I have a table tab_1
id name
1 emp1
2 emp2
3 emp3
Previously the id column (integer) of the table was an auto generated field where the sequence number was generated at run time via JPA. (Sequence name: tab_1_seq)
We made a change and updated the table's column id to bigserial and the sequence is maintained in the column level (allocated new sequence: tab_1_temp_seq) not handled by the JPA anymore.
After this change everything was working fine for few months and after that we faced an error - "the sequence "tab_1_temp_seq" is not yet defined in this session"
On analyzing the issue I found out that there is a mismatch between the sequences allocated for the table.
In the table structure, we where shown the sequence as tab_1_temp_seq and in the information_schema the table was allocated with the old sequence - tab_1_seq.
I am not sure what has really triggered this to happen, as we are not managing our database system. If you have faced any issues like this, kindly let me know its root cause.
Queries:
SELECT table_name, column_name, column_default from information_schema.columns where table_name = ‘tab_1’;
result :
table_name column_name column_default
tab_1 id nextval('tab_1_seq::regclass')
Below are the details found in the table structure/properties:
id nextval('tab_1_temp_seq::regclass')
name varChar
Perhaps you are suffering from data corruption, but it is most likely that you are suffering from bad tools to visualize your database objects. Whatever program shows you the “table structure/properties” might be confused.
To find out the truth (which DEFAULT value PostgreSQL uses), run:
SELECT pg_get_expr(adbin, adrelid)
FROM pg_attrdef
WHERE adrelid = 'tab1'::regclass;
This is also what information_schema.columns will show, but I added the naked query for clarity.
This DEFAULT value will be used whenever the INSERT statement either doesn't specify the id column or fills it with the special value DEFAULT.
Perhaps the confusion is also caused by different programs that may set default values in their way. The way I showed above is PostgreSQL's way, but nothing can keep a third-party tool from using its own sequence to filling the id.

PostgreSQL select query on table that is being updated

I assume this question has been asked before, but unfortunately I cannot find the answer to my question.
I have a table, and I am using an update statement to update a column. Simultaneously I am running a create table query with a select statement that is retrieving data from the table and column that is also being updated.
My questions are: can this lead to wrong results in the output of the create table statement? does the update query finish 1st then the create table with the select execute? I just know that the create table statement is taking way longer to execute.
In PostgreSQL readers never lock writers and vice versa. This is guaranteed by PostgreSQL's MVCC implementation that keeps old row versions around.
If the updating transaction isn't finished yet, the reading transaction will see the old value, and the result is consistent.
There is nothing inside PostgreSQL that should slow down the SELECT statement noticeably, but of course I/O contention is a possible explanation.

Select from table identified by OID

Is it possible to select from a table I know the OID of? Something like
select * from 123456::regclass
I understand I can do it in a function, by constructing a dynamic query, but it seems strange I cannot do it directly -- I would assume Postgresql translates all my table names to OIDs the moment it starts analyzing my query, so why cannot I save it the trouble?
No.
OID is a global counter which can wrap around. Meaning there can be multiple tables containing objects of same OID. So even if there was some hack around, it would error prone in the least

How to find the OID of the rows in a table in postgres?

I have a problem encountered lately in our Postgres database, when I query: select * from myTable,
it results to, 'could not open relation with OID 892600370'. And it's front end application can't run properly anymore. Base on my research, I determined the column that has an error but I want exactly to locate the rows OID of the column so that I can modify it. Please help.
Thank you in advance.
You've got a corrupted database. Might be a bug, but more likely bad hardware. If you have a recent backup, just use that. I'm guessing you don't though.
Make sure you locate any backups of either the database or its file tree and keep them safe.
Stop the PostgreSQL server and take a file backup of the entire database tree (base, global, pg_xlog - everything at that level). It is now safe to start fiddling...
Now, start the database server again and dump tables one at a time. If a table won't dump, try dropping any indexes and foreign-key constraints and give it another go.
For a table that won't dump, it might be just certain rows. Drop any indexes and dump a range of rows using COPY ... SELECT. That should let you narrow down any corrupted rows and get the rest.
Now you have a mostly-recovered database, restore it on another machine and take whatever steps are needed to establish what is damaged/lost and what needs to be done.
Run a full set of tests on the old machine and see if anything needs replacement. Consider whether your monitoring needs improvement.
Then - make sure you keep proper backups next time, that way you won't have to do all this, you'll just use them instead.
could not open relation with OID 892600370
A relation is a table or index. A relation's OID is the OID of the row in pg_class where this relation is defined.
Try select relname from pg_class where oid=892600370;
Often it's immediately obvious from relname what this relation is, otherwise you want to look at the other fields in pg_class: relnamespace, relkind,...