I am trying to find out what exactly avg_item_size in bt_page_stats is.
I am using PostgreSQL 13.4 with the pageinspect extension.
In my case, if the index is created on a text column it is about 20-200, and if it is on an integer column it is about 700. I am curious why avg_item_size is bigger for integer than for text.
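For reference, this is roughly the query I am running (index name and block number are just examples):

CREATE EXTENSION IF NOT EXISTS pageinspect;
SELECT * FROM bt_page_stats('my_index', 1);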
Update:
Here is bt_page_stats of my index created on a single integer column: [screenshot: bt_page_stats]
Here is the same page in bt_page_items.
In picture 2 I can see that one item has itemlen 24 and then I have 9 items with itemlen 808, so the avg_item_size of 729 comes from there; it is the average of all items.
Now I see that if the index is created on a string column there is nothing in tids, but with an integer column there is a lot of data in tids.
After further exploration I found the PostgreSQL source code (https://docs.huihoo.com/doxygen/postgresql/dir_57dbf4d3eda9e499038b5c7aaccc39c5.html), pointing directly to the pageinspect functions.
I was not sure about tids, but from https://www.postgresql.org/docs/8.3/datatype-oid.html I found the answer:
tid, or tuple identifier (row identifier). This is the data type of the system column ctid. A tuple ID is a pair (block number, tuple index within block) that identifies the physical location of the row within its table.
But I still do not understand why tids are populated when I create the index on an integer column and not on a text column.
avg_item_size is the average size of an index entry.
For an index on a single integer column, that should be 16 (if you have duplicates, and you are using v13 or above, it can be less because of index de-duplication).
An index entry in a leaf page will consist of the t_tid (address of the row), which is 6 bytes, a 2-byte t_info and the integer (4 bytes, but really 8 bytes because of alignment). You can use bt_page_items to verify that.
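For example, something like this (index name and block number are placeholders):

SELECT itemoffset, ctid, itemlen, data, htid, tids
FROM bt_page_items('my_index', 1);
-- For a plain leaf entry on a single integer column, itemlen should be 16:
-- 8 bytes (6-byte t_tid + 2-byte t_info) plus the integer padded to 8 bytes.
-- Entries merged by de-duplication carry a posting list in "tids" and are
-- correspondingly larger.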
We often have columns that can contain values of varying sizes. For these, I like to set the data type to VARCHAR with a size well beyond the current maximum length. For example, if I have a column where the current minimum length of a value is 10 and the maximum length is 35, I might set the data type to VARCHAR(64). My rationale is that Db2 stores a 2-byte length followed by the actual value, so from a storage perspective there is no difference between defining the data type as VARCHAR(64) and VARCHAR(35). And I don't get an error if a value with a length of 36 comes along.
Is there a nuance that I'm missing and should I not be so glib about my VARCHAR assignments?
The exact formula to calculate row length is described in the docs for CREATE TABLE. VARCHAR(64) or VARCHAR(35) should not make a difference.
Be aware that rows are stored in data pages in tablespaces. Database systems usually pre-allocate pages for performance reasons. Moreover, pages might not be completely filled, or there may be compression. And you might have defined indexes, which require their own pages and structures. Plus there is metadata in the system catalog.
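As a sketch (table and column names are made up), these two definitions should take the same space for the same data, since a VARCHAR column stores a 2-byte length followed by the actual bytes:

CREATE TABLE comments_narrow (comment_text VARCHAR(35));
CREATE TABLE comments_wide   (comment_text VARCHAR(64));
-- Inserting the same 35-character string into either table consumes
-- 2 bytes of length information plus 35 bytes of data; the declared
-- maximum only limits what can be inserted.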
I got a large (>100M rows) Postgres table with structure {integer, integer, integer, timestamp without time zone}. I expected the size of a row to be 3*integer + 1*timestamp = 3*4 + 1*8 = 20 bytes.
In reality the row size is pg_relation_size(tbl) / count(*) = 52 bytes. Why?
(No deletes are done against the table: pg_relation_size(tbl, 'fsm') ~= 0)
Calculation of row size is much more complex than that.
Storage is typically partitioned in 8 kB data pages. There is a small fixed overhead per page, possible remainders not big enough to fit another tuple, and more importantly dead rows or a percentage initially reserved with the FILLFACTOR setting.
And there is even more overhead per row (tuple): an item identifier of 4 bytes at the start of the page, the HeapTupleHeader of 23 bytes and alignment padding. The start of the tuple header as well as the start of tuple data are aligned at a multiple of MAXALIGN, which is 8 bytes on a typical 64-bit machine. Some data types require alignment to the next multiple of 2, 4 or 8 bytes.
Quoting the manual on the system table pg_type:
typalign is the alignment required when storing a value of this type. It applies to storage on disk as well as most representations of the value inside PostgreSQL. When multiple values are stored consecutively, such as in the representation of a complete row on disk, padding is inserted before a datum of this type so that it begins on the specified boundary. The alignment reference is the beginning of the first datum in the sequence.
Possible values are:
c = char alignment, i.e., no alignment needed.
s = short alignment (2 bytes on most machines).
i = int alignment (4 bytes on most machines).
d = double alignment (8 bytes on many machines, but by no means all).
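You can look these values up for the types in question with a quick query against the system catalog:

SELECT typname, typlen, typalign
FROM pg_type
WHERE typname IN ('int4', 'timestamp');
-- typname   | typlen | typalign
-- int4      |      4 | i
-- timestamp |      8 | d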
Read about the basics in the manual here.
Your example
This results in 4 bytes of padding after your 3 integer columns, because the timestamp column requires double alignment and needs to start at the next multiple of 8 bytes.
So, one row occupies:
23 -- heaptupleheader
+ 1 -- padding or NULL bitmap
+ 12 -- 3 * integer (no alignment padding here)
+ 4 -- padding after 3rd integer
+ 8 -- timestamp
+ 0 -- no padding since tuple ends at multiple of MAXALIGN
Plus the item identifier per tuple in the page header (as pointed out by @A.H. in the comments):
+ 4 -- item identifier in page header
------
= 52 bytes
So we arrive at the observed 52 bytes.
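If you want to cross-check the per-tuple arithmetic without inspecting pages, pg_column_size() on a row value gives the tuple size without the 4-byte item identifier. A sketch; the exact number assumes a typical 64-bit build:

-- Should come out to 48: 24 (row header) + 12 (3 integers) + 4 (padding) + 8 (timestamp).
-- Add the 4-byte item identifier in the page header to arrive at 52.
SELECT pg_column_size(ROW(1::int, 2::int, 3::int, now()::timestamp));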
The calculation pg_relation_size(tbl) / count(*) is a pessimistic estimation. pg_relation_size(tbl) includes bloat (dead rows) and space reserved by fillfactor, as well as overhead per data page and per table. (And we didn't even mention compression for long varlena data in TOAST tables, since it doesn't apply here.)
You can install the additional module pgstattuple and call SELECT * FROM pgstattuple('tbl_name'); for more information on table and tuple size.
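A sketch of what that could look like (table name assumed):

CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- tuple_len / tuple_count is the average live tuple size, i.e. without the
-- 4-byte item identifiers, page headers, and free space.
SELECT tuple_count,
       tuple_len,
       round(tuple_len::numeric / tuple_count, 1) AS avg_tuple_bytes
FROM pgstattuple('tbl_name');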
Related:
Table size with page layout
Calculating and saving space in PostgreSQL
Each row has metadata associated with it. The rough formula is (assuming naïve alignment):
3 * 4 + 1 * 8 = 20 bytes  -- your data
+ 24 bytes                -- row overhead (23-byte tuple header, padded)
plus the 4-byte item identifier in the page header and 4 bytes of alignment padding before the timestamp, or roughly 52 bytes in total. I actually wrote postgresql-varint specifically to help with this problem and this exact use case. You may want to look at a similar post for additional details re: tuple overhead.
Apparently PostgreSQL stores a couple of values in the header of each database row.
If I don't use NULL values in that table - is the null bitmap still there?
Does defining the columns with NOT NULL make any difference?
It's actually more complex than that.
The null bitmap needs one bit per column in the row, rounded up to full bytes. It is only there if the actual row includes at least one NULL value and is fully allocated in that case. NOT NULL constraints do not directly affect that. (Of course, if all fields of your table are NOT NULL, there can never be a null bitmap.)
The "heap tuple header" (per row) is 23 bytes long. Actual data starts at a multiple of MAXALIGN (Maximum data alignment) after that, which is typically 8 bytes on 64-bit OS (4 bytes on 32-bit OS). Run the following command from your PostgreSQL binary dir as root to get a definitive answer:
./pg_controldata /path/to/my/dbcluster
On a typical Debian-based installation of Postgres 12 that would be:
sudo /usr/lib/postgresql/12/bin/pg_controldata /var/lib/postgresql/12/main
Either way, there is one free byte between the header and the aligned start of the data, which the null bitmap can utilize. As long as your table has 8 columns or less, NULL storage is effectively absolutely free (as far as disk space is concerned).
After that, another MAXALIGN (typically 8 bytes) is allocated for the null bitmap to cover another (typically) 64 fields. Etc.
This is valid for at least versions 8.4 - 12 and most likely won't change.
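A small demonstration of the "free" NULL storage (hypothetical 8-column table; sizes assume a 64-bit build):

CREATE TABLE t8 (c1 int, c2 int, c3 int, c4 int,
                 c5 int, c6 int, c7 int, c8 int);
INSERT INTO t8 VALUES (1, 2, 3, 4, 5, 6, 7, 8),
                      (1, 2, 3, 4, 5, 6, 7, NULL);

-- The row without NULLs should show 56 bytes (24-byte header + 8 * 4 bytes),
-- the row with one NULL 52 bytes: the null bitmap fits into the free byte
-- after the 23-byte header, so the header does not grow.
SELECT ctid, pg_column_size(t8.*) FROM t8;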
The null bitmap is only present if the HEAP_HASNULL bit is set in t_infomask. If it is present it begins just after the fixed header and occupies enough bytes to have one bit per data column (that is, t_natts bits altogether). In this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When the bitmap is not present, all columns are assumed not-null.
http://www.postgresql.org/docs/9.0/static/storage-page-layout.html#HEAPTUPLEHEADERDATA-TABLE
So for every 8 columns you use one byte of extra storage, and every million rows or so that adds up to about one megabyte. That does not really seem important. I would define the tables the way they need to be defined and not worry about the null bitmap.
Using Oracle SQL*Loader, I am trying to load a column that was a variable length string (lob) in another database into a varchar2(4000) column in Oracle. We have strings much longer than 4000 characters, but everyone has agreed that these strings can and should be truncated in the migration (we've looked at the data that goes beyond 4000 characters, it's not meaningful). To do so, I specified the column this way in the control file:
COMMENTS CHAR(65535) "SUBSTR(:COMMENTS, 1, 4000)",
However, SQL*Loader still rejects any row where this record is longer than 4000 characters in the data file:
Record 6484: Rejected - Error on table LOG_COMMENT, column COMMENTS.
ORA-12899: value too large for column COMMENTS (actual: 11477, maximum: 4000)
Record 31994: Rejected - Error on table LOG_COMMENT, column COMMENTS.
ORA-12899: value too large for column COMMENTS (actual: 16212, maximum: 4000)
Record 44063: Rejected - Error on table LOG_COMMENT, column COMMENTS.
ORA-12899: value too large for column COMMENTS (actual: 62433, maximum: 4000)
I tried taking a much smaller substring and still got the same error. How can I change my control file to truncate string data longer than 4000 characters into a varchar2(4000) column?
Check to make sure your data encoding and the Oracle encoding do not conflict.
In that case, use the CHARACTERSET option when loading.
By all accounts,
COMMENTS CHAR(65535) "SUBSTR(:COMMENTS, 1, 4000)",
is the correct syntax.
Using sqlldr 11.2.0.1, it works successfully for me up until the point where the input record column is > 4000 characters, at which point I get:
ORA-01461: can bind a LONG value only for insert into a LONG column
If I switch to a direct path load, I get the same error as you:
ORA-12899: value too large for column COMMENTS (actual: 4005, maximum: 4000)
In the end I split it into a two-stage load: I now have a staging table with a column of type CLOB, which I load with
COMMENTS CHAR(2000000000)
which then gets inserted into the main table with:
insert into propertable
select dbms_lob.substr(comments, 4000, 1)
from staging_table;
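For completeness, a sketch of the whole two-stage approach (staging table name made up, target table taken from the question):

-- Stage 1: staging table that can hold the full, untruncated values,
-- loaded by SQL*Loader with COMMENTS CHAR(2000000000) in the control file.
CREATE TABLE staging_table (comments CLOB);

-- Stage 2: copy a truncated version into the real VARCHAR2(4000) column.
INSERT INTO log_comment (comments)
SELECT dbms_lob.substr(comments, 4000, 1)
FROM staging_table;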
Hope that's helpful.
In my tables, I have chosen the id column to be of type int (4 bytes). What I want to know is: how will any database handle it once its limit is reached? Will the database refuse to insert any more records into the table, or what exactly will happen? Also, how should I tackle this kind of problem (if the database doesn't handle it by itself)?
I would love to know what you refer to as "the" database, but usually it is an error: the database is full then. You should provide some means of "compacting" the primary key, or, more simply:
Use long integers as keys (8 bytes). Even if you insert 1000 items per second from now on, this will last for nearly 300 million years. The 4 byte integer (signed) will only last for 24 days in this scenario.
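A minimal sketch of how you would do that in PostgreSQL, for example (table name made up):

-- Exhausting the int4 range makes inserts fail (in PostgreSQL typically with
-- "integer out of range" or "reached maximum value of sequence").
-- Widening the key column avoids that:
ALTER TABLE my_table ALTER COLUMN id TYPE bigint;
-- Note: this rewrites the table and its indexes, so expect it to take a while
-- on large tables.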