Where is oid in pg_tblspc error message - postgresql

Recently I got a "could not read block" error that displayed the following message:
pg_tblspc/16010/PG_9.3_201306121/16301/689225.365
After this error, I tried the query below, assuming a few of the numbers were OIDs, but it returns no rows.
select oid,relname from pg_class where oid=16010 or oid=16301;
Now my question is: what are the numbers in that pg_tblspc path? I have gone through the link and I believe I might have missed the main point from there too!

Update: much more detailed write-up at http://blog.2ndquadrant.com/postgresql-filename-to-table/
The following info doesn't consider relfilenode changes due to vacuum full etc.
In:
pg_tblspc/16010/PG_9.3_201306121/16301/689225.365
we have:
pg_tblspc: indicates that the relation lives in a tablespace other than the default or global tablespaces.
16010: the tablespace oid, from pg_tablespace.oid.
PG_9.3_201306121: a version-specific, catversion-specific string that allows different Pg versions to co-exist in a tablespace.
16301: the database oid, from pg_database.oid.
689225: the relation oid, from pg_class.oid.
365: the segment number. PostgreSQL splits big tables into extents (segments) of 1GB each.
There may also be a fork number, but there isn't one in this path.
It took a fair bit of source code digging for me to be sure about this. The macro you want is relpathbackend in src/include/common/relpath.h, for anyone else looking, and it calls GetRelationPath in src/common/relpath.c.
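To turn those numbers back into names, a sketch of the lookups (using the oids from the path above; run the pg_class query while connected to the database with oid 16301, and match on relfilenode too, since it can diverge from the oid after VACUUM FULL and similar operations):
select spcname from pg_tablespace where oid = 16010;
select datname from pg_database where oid = 16301;
-- while connected to the database with oid 16301:
select oid, relname from pg_class where relfilenode = 689225 or oid = 689225;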

Related

How can I reproduce a database context to debug a tricky PostgreSQL error: "variable not found in subplan target list"

I am facing a tricky error with a PostgreSQL Database that suddenly popped up and I cannot reproduce elsewhere.
The error happened suddenly without any known maintenance or upgrade and seems to be related to a specific database context.
Documentation
The bug seems to come and go across versions; here is a list of links found when searching the web for the error message:
February 2015: How to fix "InternalError: variable not found in subplan target list"
October 2017: query error: variable not found in subplan target list (when using PG 9.6.2)
February 2022: PGroonga index-only scan problem with yesterday’s PostgreSQL updates
June 2022 (my report): Sudden database error with COUNT(*) making Query Planner crashes: variable not found in subplan target list
The product version where I detected the error is:
SELECT version();
-- PostgreSQL 13.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
And the only extensions I have installed are:
SELECT extname, extversion FROM pg_extension;
-- "plpgsql" "1.0"
-- "postgis" "3.1.1"
Symptom
The error's main symptom is variable not found in subplan target list:
SELECT COUNT(*) FROM items;
-- ERROR: variable not found in subplan target list
-- SQL state: XX000
It does not affect all tables, just some specific ones.
What is interesting is that it is only partially broken:
SELECT COUNT(id) FROM items; -- 213
SELECT COUNT(*) FROM items WHERE id > 0; -- 213
And it only affects the COUNT(*) aggregate, most probably because of the * placeholder.
Furthermore, the error is related to the query plan, not to the query itself, as:
EXPLAIN SELECT COUNT(*) FROM items;
-- ERROR: variable not found in subplan target list
-- SQL state: XX000
Fails as well without actually executing the query.
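One hedged diagnostic, not part of the original report: standard planner GUCs can be toggled to see which plan shape triggers the failure.
-- Disable plan node types one at a time, re-running EXPLAIN each time
-- to see which plan shape raises the error (GUCs available in PG 13):
SET enable_indexonlyscan = off;
EXPLAIN SELECT COUNT(*) FROM items;
SET enable_indexscan = off;
EXPLAIN SELECT COUNT(*) FROM items;
RESET ALL;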
Digging into the PostgreSQL code on GitHub, the error message is raised when the function search_indexed_tlist_for_var returns nothing.
This clue should explain why it happens when using the * placeholder instead of an explicit column name.
Reproducibility
It is a tricky bug: simply showing that it exists is difficult, and it is somewhat vicious because so far I cannot understand which conditions make it happen.
It seems this bug arises in specific contexts (e.g. an equivalent message and symptom was reported with the PGroonga extension), but in my case I cannot draw a parallel yet.
It is likely I am facing an equivalent problem in a different context, but I have not succeeded in capturing a simple MCVE to spot it.
CREATE TABLE t AS SELECT CAST(c AS text) FROM generate_series(1, 10000) AS c;
-- SELECT 10000
CREATE INDEX t_c ON t(c);
-- CREATE INDEX
VACUUM t;
-- VACUUM
SELECT COUNT(*) FROM t;
-- 10000
This works as expected. The table having the issue relies on a PostGIS index, but again I cannot reproduce the error:
CREATE EXTENSION postgis;
-- CREATE EXTENSION
CREATE TABLE test(
id serial,
geom geometry(Point, 4326)
);
-- CREATE TABLE
INSERT INTO test
SELECT x, ST_MakePoint(x/10000., x/10000.) FROM generate_series(1, 10000) AS x;
-- INSERT 0 10000
CREATE INDEX test_index ON test USING GIST(geom);
-- CREATE INDEX
VACUUM test;
-- VACUUM
SELECT COUNT(*) FROM test;
-- 10000
Works as expected.
And when I dump and restore the faulty database the problem vanishes.
Looking for an MCVE
When trying to reproduce the bug in order to build an MCVE and unit tests to report it to the developers, I face a limitation: when dumping the database and restoring it to a new instance, the bug simply vanishes.
So the only way I can reproduce this bug is with the original database, and I have not succeeded in preparing a dump that reproduces the bug elsewhere.
This is what it is all about: I am looking for hints on how to capture and reproduce the faulty context.
At this point my analysis is:
The bug is related to the database state or to some metadata that is not identical once the dump is restored;
The bug is related to the COUNT function when using the * wildcard with no filtering clause;
The bug is not general, as it affects only specific tables with specific indexes;
The bug resides on the query planner side.
It seems some metadata or state corruption prevents the query planner from finding a column name to apply the COUNT method to.
Question
My question is: how can I investigate this bug more deeply to make it:
either reproducible (a dump technique preserving it);
or understandable to a developer (meta queries to identify where the problem resides in the database)?
Another way to phrase it would be:
How can I reproduce the context that makes the query planner crash?
Is there a way to make the planner more verbose in order to get more details on the error?
What queries can I run against the catalog to capture the faulty context?
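As a hedged starting point, these catalog queries could be diffed between the faulty instance and a freshly restored copy (a sketch, assuming the faulty table is items):
-- Column-level state, including invisible dropped columns:
SELECT attnum, attname, atttypid::regtype, attisdropped
FROM pg_attribute
WHERE attrelid = 'items'::regclass
ORDER BY attnum;
-- Index-level state, including validity and expression indexes:
SELECT indexrelid::regclass, indisvalid, indisready,
       indkey, indexprs IS NOT NULL AS has_expression
FROM pg_index
WHERE indrelid = 'items'::regclass;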

At what point are DB2 declared global temporary tables automatically deleted...?

When are DB2 declared global temporary tables 'cleaned up' and automatically deleted by the system...? This is for DB2 on AS400 v7r3m0, with DBeaver 5.2.5 as the dev client, and MS-Access 2007 for packaged apps for the end-users.
Today I started experimenting with a DGTT, thanks to this answer. So far I'm pleased with the functionality, although I did find that our more recent system version has the WITH DATA option, which is an obvious advantage.
Everything is working, but at times I receive this error:
SQL Error [42710]: [SQL0601] NEW_PKG_SHEETS_DATA in QTEMP type *FILE already exists.
The meaning of the error is obvious, but the timing is not. When I started today, I could run the query multiple times, and the error didn't occur. It seemed as if the system was cleaning up and deleting it, which is just what I was looking for. But then the error started and now it's happening with more frequency.
If I make strategic use of DROP TABLE, this resolves the error, unless the table doesn't exist, in which case I get another error. I can also disconnect/reconnect to the server from my SQL dev client, as I would expect, since that would definitely drop the session.
This IBM article about DGTTs speaks much of sessions, but not many specifics. And this article is possibly the longest command syntax I've yet encountered in the IBM documentation. I got through it, but it didn't answer the question of what decides when a DGTT is deleted.
So I would like to ask:
What are the boundaries of a session..?
I'm thinking this is probably defined by the environment in my SQL client..?
I guess the best/safest thing to do is use DROP TABLE as needed..?
Does any one have any tips, tricks, or pointers they could share..?
Below is the SQL that I'm developing. For brevity, I've excluded chunks of the WITH-AS and SELECT statements:
DROP TABLE SESSION.NEW_PKG_SHEETS ;
DECLARE GLOBAL TEMPORARY TABLE SESSION.NEW_PKG_SHEETS_DATA
AS ( WITH FIRSTDAY AS (SELECT (YEAR(CURDATE() - 4 MONTHS) * 10000) +
(MONTH(CURDATE() - 4 MONTHS) * 100) AS DATEISO
FROM SYSIBM.SYSDUMMY1
-- <VARIETY OF ADDITIONAL CTE CLAUSES>
-- <SELECT STATEMENT BELOW IS A BIT LONGER>
SELECT DAACCT AS DAACCT,
DAIDAT AS DAIDAT,
DAINV# AS DAINV,
CAST(DAITEM AS NUMERIC(6)) AS DAPACK,
CAST(0 AS NUMERIC(14)) AS UPCNUM,
DAQTY AS DAQTY
FROM DAILYTRANS
AND DAIDAT >= (SELECT DATEISO+000 FROM FIRSTDAY) -- 1ST DAY FOUR MONTHS AGO
AND DAIDAT <= (SELECT DATEISO+399 FROM FIRSTDAY) -- LAST DAY OF LAST MONTH
) WITH DATA ;
DROP TABLE SESSION.NEW_PKG_SHEETS ;
The DGTT will only get cleaned up automatically by Db2 when the connection ends successfully (connect reset or equivalent, according to whatever interface to Db2 is being used).
For both Db2 for i and Db2-LUW, consider using the WITH REPLACE clause for the DECLARE GLOBAL TEMPORARY TABLE statement. That will ensure you don't need to explicitly drop the DGTT if the session remains open but the code needs the table to be replaced at next execution whether or not the DGTT already exists.
Using that WITH REPLACE clause means you do not need to worry about issuing a DROP statement for the DGTT, unless you really want to issue a drop.
Sometimes sessions may get re-used, or a close/disconnect might not happen or might not complete, or more likely a workstation performs a retry, and in those cases the WITH REPLACE can be essential for easily avoiding runtime errors.
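For example, a minimal sketch of the declaration above with WITH REPLACE (fullselect abbreviated; substitute the full query from the question):
DECLARE GLOBAL TEMPORARY TABLE SESSION.NEW_PKG_SHEETS_DATA
AS ( SELECT DAACCT, DAIDAT, DAQTY
     FROM DAILYTRANS )  -- abbreviated fullselect
WITH DATA
WITH REPLACE ;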
Note that Db2 for z/OS (at v12) does not offer the WITH REPLACE clause for DGTTs, but instead has an optional ON COMMIT DROP TABLE syntax (which is not documented for Db2 for i and Db2-LUW).
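A sketch of that z/OS variant (the column definitions here are hypothetical):
-- Db2 for z/OS v12: the table is dropped automatically at COMMIT
DECLARE GLOBAL TEMPORARY TABLE SESSION.NEW_PKG_SHEETS_DATA
  ( DAACCT CHAR(8), DAQTY DECIMAL(9, 0) )
  ON COMMIT DROP TABLE ;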

Postgres table partition range by expression

I'm new to table partitioning in Postgres and I was trying something which didn't seem to work. Here's a simplified version of the problem:
create table if not exists Location1
PARTITION OF "Location"
FOR VALUES FROM
(extract(epoch from now()::timestamp))
TO
(extract(epoch from now()::timestamp)) ;
insert into "Location" ("BuyerId", "Location")
values (416177285, point(-100, 100));
It fails with:
ERROR: syntax error at or near "extract"
LINE 2: FOR VALUES FROM (extract(epoch from now()::timestamp))
If I try to create the partition using literal values, it works fine.
I'm using Postgres 10 on Linux.
After a lot of trial and error, I realized that Postgres 10 doesn't really have a full partitioning system out of the box. For a usable partitioned database, the following Postgres modules need to be installed:
pg_partman https://github.com/pgpartman/pg_partman (literally a gift from God)
pg_jobmon https://github.com/omniti-labs/pg_jobmon (to monitor what pg_partman is doing)
The above advice is for those new to the scene, to help you avoid a lot of headaches.
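As a hypothetical sketch of the pg_partman approach (check the pg_partman documentation for the exact create_parent signature in your version; the control column epoch_col and the interval are placeholders):
CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;
-- pg_partman then creates and maintains child partitions automatically:
SELECT partman.create_parent('public.location', 'epoch_col', 'native', 'monthly');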
I assumed that automatic creation of partitions done by postgres-core was a no-brainer, but evidently the postgres devs know something I don't?
If any relational database from the '80s is asked to insert a row, it succeeds after passing the constraint checks.
If the latest 2018 version of Postgres is asked to insert a row: "Sorry, I don't know where to put it."
It's a good first effort, but hopefully Postgres 11 will have a proper partitioning system OOTB...
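As for the original error: partition bounds in Postgres 10 must be literal constants, so the extract(...) expressions have to be evaluated before the DDL runs. A hedged sketch using dynamic SQL, assuming "Location" is range-partitioned on a bigint epoch column:
-- Evaluate the bounds first, then splice them into the DDL as literals
-- (the two bounds must also differ, or the range is empty and rejected):
DO $$
BEGIN
  EXECUTE format(
    'CREATE TABLE IF NOT EXISTS Location1 PARTITION OF "Location"
     FOR VALUES FROM (%s) TO (%s)',
    extract(epoch from now())::bigint,
    extract(epoch from now() + interval '1 month')::bigint);
END $$;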

Troubleshooting an insert statement, fails without error

I am trying to do what should be a pretty straightforward insert statement in a postgres database. It is not working, but it's also not erroring out, so I don't know how to troubleshoot.
This is the statement:
INSERT INTO my_table (col1, col2) select col1,col2 FROM my_table_temp;
There are around 200m entries in the temp table, and 50m entries in my_table. The temp table has no index or constraints, but both columns in my_table have btree indexes, and col1 has a foreign key constraint.
I ran the first query for about 20 days. Last time I tried a similar insert of around 50m, it took 3 days, so I expected it to take a while, but not a month. Moreover, my_table isn't getting longer. Queried 1 day apart, the following produces the same exact number.
select count(*) from my_table;
So it isn't inserting at all. But it also didn't error out. And looking at system resource usage, it doesn't seem to be doing much of anything at all; the process isn't drawing resources.
Looking at other running queries, nothing else that I have permissions to view is touching either table, and I'm the only one who uses them.
I'm not sure how to troubleshoot since there's no error. It's just not doing anything. Any thoughts about things that might be going wrong, or things to check, would be very helpful.
For the sake of anyone stumbling onto this question in the future:
After a lengthy discussion (see linked discussion from the comments above), the issue turned out to be related to psycopg2 buffering the query in memory.
Another useful note: inserting into a table with indices is slow, so it can help to remove them before bulk loads, and then add them again after.
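A sketch of that pattern for this case (the index, constraint, and parent-table names are hypothetical):
-- Drop the indexes and the FK before the bulk load...
DROP INDEX IF EXISTS my_table_col1_idx;
ALTER TABLE my_table DROP CONSTRAINT IF EXISTS my_table_col1_fkey;
INSERT INTO my_table (col1, col2)
SELECT col1, col2 FROM my_table_temp;
-- ...then recreate them afterwards
CREATE INDEX my_table_col1_idx ON my_table (col1);
ALTER TABLE my_table ADD CONSTRAINT my_table_col1_fkey
  FOREIGN KEY (col1) REFERENCES parent_table (id);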
In my case it was a date format issue. I commented out the date attribute before inserting into the DB and it worked.
In my case it was a TRIGGER on the same table I was updating, and it failed without errors.
Deactivating the trigger made the update work flawlessly.

SQL Server temp tables via MS Access

Well I've been using #temp tables in standard T-SQL coding for years and thought I understood them.
However, I've been dragged into a project based in MS Access, utilizing pass-through queries, and found something that has really got me puzzled.
Though maybe it's the inner workings of Access that have me fooled!?
Here we go: under normal usage, I understand that if I create a temp table in a sproc, its scope ends with the end of the sproc, and the table is dropped by default.
In the Access example, I found it was possible to do this in one Query:
select top(10) * into #myTemp from dbo.myTable
And then this in second separate query:
select * from #myTemp
How is this possible?
If a temp table dies with the current session, does this mean that Access keeps a single session open and uses that session for all queries executed?
Or has my fundamental understanding of scope been wrong all this time?
Hope someone out there can help clarify what is occurring under the hood!?
Many Thanks
I found this answer to a kind-of-similar question:
A temp table is stored in tempdb until the connection is dropped (or, in the case of a global temp table, when the last connection using it is dropped). You can also (and it is good practice to do so) manually drop the table when you are finished using it, with a DROP TABLE statement.
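For example, a minimal guarded drop in T-SQL:
-- Drop the temp table explicitly when finished; the OBJECT_ID check
-- avoids an error if it has already been cleaned up:
IF OBJECT_ID('tempdb..#myTemp') IS NOT NULL
    DROP TABLE #myTemp;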
I hope this helps out.