db2: correlate tablespace file to database object

Using DB2 v9.7 (Windows), with an SMS tablespace.
Inside the tablespace folder are files for the various db objects.
Ex) SQL00003.IN1, SQL00003.DAT, etc.
How do I determine which database object corresponds to which file?
(for both indexes and tables)

The digits in the file name (i.e. 00003 = 3) correspond to the TABLEID column from SYSCAT.TABLES. Please note that TABLEID is unique only within a single tablespace, so you need to know what tablespace's container path you are looking at to make this correlation.
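To map a file name back to a table, a lookup along these lines should work (a sketch only; the tablespace name MYSPACE and the file number 3 are placeholders for your own values):
SELECT t.TABSCHEMA, t.TABNAME, t.TABLEID, t.TBSPACEID
FROM SYSCAT.TABLES t
JOIN SYSCAT.TABLESPACES ts ON ts.TBSPACEID = t.TBSPACEID
WHERE ts.TBSPACE = 'MYSPACE'   -- placeholder: name of the SMS tablespace you are looking at
AND t.TABLEID = 3;             -- 3 = digits from SQL00003.DAT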
All table data is stored in the .DAT file.
All index data (for all indexes) is stored in the .INX file, regardless of how many indexes there are. (Note that it appears you have a typo in the filename SQL00003.IN1 above, this should be SQL00003.INX)
If your table has LOBs, then there will be 2 additional files with the same SQLxxxxx name: a .LBA and a .LB file.

Related

Selecting data from a BYTEA data type in Postgres that contains CSV data and storing it in a table

I have a table ("file_upload") in a postgreSQL (11,8) database, which we use for storing the original CSV file that was used for loading some data to our system (I guess the question of best practices is up for debate here, but for now lets just assume it is).
The files are stored in a column ("file") which is of the data type "bytea"
So one row of this table contains
id - file_name - upload_date - uploaded_by - file <-- this being the column in question.
This column then stores the data of a csv file:
item_id;item_type_id;item_date;item_value
11;1;2022-09-22;123.45
12;4;2022-09-20;235.62
13;1;2022-09-21;99.99
14;2;2022-09-19;654.32
What I need to be able to do is query this column, extract the data and store it in a temporary table (note: the structure of these csv files is always the same, so the table structure can be pre-defined and does not have to be dynamic or anything).
Any help would be greatly appreciated
Use
COPY (SELECT file FROM file_upload WHERE id = 1)
TO '/tmp/blob' (FORMAT 'binary');
to re-export the data to a file. Then create the temporary table and use COPY to read them in again. Make sure to use the proper ENCODING.
You can wrap that in a loop that performs this operation for all rows in your table.
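For the second half of that recipe, a sketch along these lines could work, assuming the exported file ends up containing the plain ';'-delimited CSV text and using the pre-defined structure shown in the question (the temporary table name, column names and types are assumptions):
CREATE TEMPORARY TABLE file_rows (
  item_id integer,
  item_type_id integer,
  item_date date,
  item_value numeric
);
COPY file_rows FROM '/tmp/blob' WITH (FORMAT csv, DELIMITER ';', HEADER true, ENCODING 'UTF8');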

Can we change data directory for single table or database in postgresql

Can we change the data directory for a single table or database in PostgreSQL?
Actually, my requirement is that I want to keep all tables' data on the C drive, but the customers table's data on the D drive. How can I achieve this?
You should create a tablespace for the tables outside the data directory.
For example:
CREATE TABLESPACE tbsp LOCATION 'D:\customer_tables';
Then add TABLESPACE tbsp to all CREATE TABLE statements that should be on D.
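For example, using a customers table with illustrative columns (the column list is an assumption; an existing table can also be moved with ALTER TABLE):
CREATE TABLE customers (
  id integer PRIMARY KEY,
  name text
) TABLESPACE tbsp;
-- or move a table that already exists:
ALTER TABLE customers SET TABLESPACE tbsp;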

What is stored in pg_default tablespace?

I have a ~2.5 TB database, which is divided into tablespaces. The problem is that ~250 GB are stored in the pg_default tablespace.
I have 3 tables and 6 tablespaces: one for each table and one for its index. Each tablespace directory is not empty, so there are no missing tablespaces for any of the tables/indexes. But the size of the data/main/base/OID_of_database directory is about 250 GB.
Can anyone tell me what is stored there, is it OK, and if not, how can I move it to tablespace?
I am using PostgreSQL 10.
Inspect the base subdirectory of the data directory. It will contain numbered directories that correspond to your databases and perhaps a pgsql_tmp directory.
Find out which directory contains the 250 GB. Map directory names to databases using
SELECT oid, datname
FROM pg_database;
Once you have identified the directory, change into it and see what it contains.
Map the numbers to database objects using
SELECT relname, relkind, relfilenode
FROM pg_class;
(Make sure you are connected to the correct database.)
Now you know which objects take up the space.
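If it helps, the relations can also be ordered by on-disk size directly (this is not from the original answer, just a query using the standard pg_relation_size and pg_size_pretty functions):
SELECT relname, relkind, relfilenode, pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relkind IN ('r', 'i', 't', 'm')   -- tables, indexes, TOAST tables, materialized views
ORDER BY pg_relation_size(oid) DESC
LIMIT 20;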
If you had frequent crashes during operations like ALTER TABLE or VACUUM (FULL), the files may be leftovers from that. They can theoretically be deleted, but I wouldn't do that without consulting with a PostgreSQL expert.

Why does querying from a binary file use more main memory than a text file through Postgres file_fdw?

I am using the Postgres file foreign data wrapper (file_fdw) to perform queries against a file. Interestingly, I saw that if Postgres performs a query on a binary file, it uses more main memory than on a text file such as a .csv. Can anyone explain why this happens? Is there any way to optimize this?
My file contains two columns (id | geometry), where geometry represents a polygon, so my foreign table consists of two columns. I have tested with join queries like ST_Overlaps() and ST_Contains() on a CSV file and a Postgres-compatible binary file.
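A foreign table of that shape could be declared roughly as follows (a sketch only; the server name, file paths and format options are assumptions, and PostGIS is assumed to provide the geometry type):
CREATE EXTENSION IF NOT EXISTS file_fdw;
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE ftable1 (
  id integer,
  geom geometry
) SERVER file_server OPTIONS (filename '/data/polygons1.csv', format 'csv', header 'true');
CREATE FOREIGN TABLE ftable2 (
  id integer,
  geom geometry
) SERVER file_server OPTIONS (filename '/data/polygons2.bin', format 'binary');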
SELECT COUNT(*) FROM ftable1 a, ftable2 b WHERE ST_Overlaps(a.geom, b.geom);
I checked the memory usage of PostgreSQL using htop, and I saw that when Postgres performs a query on a binary file, it uses more main memory than with a CSV file. For example, if it is 500 MB for CSV, it is almost 1 GB for a binary file. Why?

copy csv postgres ignore rows that violate constraints

I have a .csv file with ~300,000 rows, some of which violate certain constraints I set in my postgres database. Is there a way to copy my .csv file into the database and have postgres filter out the rows that violate the constraints? I do not want these rows to show up in the database.
If this is not possible, is there any other way to solve this problem?
What I'm doing right now is
COPY blocksequences FROM '/tmp/blocksequences.csv' CSV HEADER;
And I get
ERROR: new row for relation "blocksequences" violates check constraint "blocksequences_partid3_check"
DETAIL: Failing row contains (M001-M049-S186, M001, null, M049, S186).
CONTEXT: COPY blocksequences, line 680: "M001-M049-S186,M001,,M049,S186"
The reason for the error: the column that contains M049 is not allowed to have that string in it. Many other rows have violations like this.
I read a little about exception when check_violation ... do nothing; am I on the right track here? It seems like that might only be a MySQL thing.
Usually this is done this way (a minimal sketch follows the steps):
create a temporary table with the same structure as the destination one, but without the constraints,
copy the data into the temporary table with the COPY command,
copy the rows that do fulfill the constraints from the temp table to the destination one, using an INSERT command with conditions in the WHERE clause based on the table constraints,
drop the temporary table.
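A minimal sketch of those steps, using the table hinted at by the error message (the staging table name and the WHERE predicate are assumptions; substitute the real condition from your check constraint):
CREATE TEMPORARY TABLE blocksequences_stage (LIKE blocksequences);   -- LIKE does not copy check constraints
COPY blocksequences_stage FROM '/tmp/blocksequences.csv' CSV HEADER;
INSERT INTO blocksequences
SELECT * FROM blocksequences_stage
WHERE partid3 IS DISTINCT FROM 'M049';   -- hypothetical predicate mirroring blocksequences_partid3_check
DROP TABLE blocksequences_stage;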
When dealing with really large CSV files or very limited server resources, use the file_fdw extension instead of a temporary table. It's a much more efficient way, but it requires that the server has access to the CSV file (while copying to a temporary table can be done over the network).
In Postgres 12 you can use the WHERE clause in COPY FROM.
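For example (the predicate is again a placeholder for your actual constraint condition):
COPY blocksequences FROM '/tmp/blocksequences.csv'
WITH (FORMAT csv, HEADER true)
WHERE partid3 IS DISTINCT FROM 'M049';   -- hypothetical predicate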