postgresql - integer out of range

I've set up a table as follows:
CREATE TABLE raw (
id SERIAL,
regtime float NOT NULL,
time float NOT NULL,
source varchar(15),
sourceport INTEGER,
destination varchar(15),
destport INTEGER,
blocked boolean
); ... + index and grants
I've been using this table successfully for a while now, and all of a sudden the following insert no longer works:
INSERT INTO raw(
time, regtime, blocked, destport, sourceport, source, destination
) VALUES (
1403184512.2283964, 1403184662.118, False, 2, 3, '192.168.0.1', '192.168.0.2'
);
The error is: ERROR: integer out of range
Not even sure where to begin debugging this. I'm not out of disk space, and the error message itself is pretty terse.

SERIAL columns are stored as INTEGERs, giving them a maximum value of 2^31 - 1. So after ~2 billion inserts, your new id values will no longer fit.
If you expect this many inserts over the life of your table, create it with a BIGSERIAL (internally a BIGINT, with a maximum of 2^63 - 1).
If you discover later on that a SERIAL isn't big enough, you can increase the size of an existing field with:
ALTER TABLE raw ALTER COLUMN id TYPE BIGINT;
Note that it's BIGINT here, rather than BIGSERIAL (as serials aren't real types). And keep in mind that, if you actually have 2 billion records in your table, this might take a little while...
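If you want to confirm the diagnosis before changing anything, you can inspect the sequence behind the SERIAL directly. A minimal sketch, assuming the column still uses the default sequence name that SERIAL generates for raw.id:
-- A value at or near 2147483647 confirms the id range is exhausted:
SELECT last_value FROM raw_id_seq;
If that is the case, the ALTER above is the whole fix; the sequence itself does not need to change, since sequences are based on bigint anyway.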

Related

Redshift table size

This is a puzzling one for me, and I'd like to understand why.
I have two tables that are almost identical; the only differences are one column's data type and the sort key.
table                              mbytes   rows
stg_user_event_properties_hist     460948   2378751028
stg_user_event_properties_hist_1   246442   2513860837
Even though they have almost the same number of rows, one is close to double the size of the other.
Here are the table structures:
CREATE TABLE stg.stg_user_event_properties_hist
(
id bigint,
source varchar(20),
time_of_txn timestamp,
product varchar(50),
region varchar(50),
city varchar(100),
state varchar(100),
zip varchar(10),
price integer,
category varchar(50),
model varchar(50),
origin varchar(50),
l_code varchar(10),
d_name varchar(100),
d_id varchar(10),
medium varchar(255),
network varchar(255),
campaign varchar(255),
creative varchar(255),
event varchar(255),
property_name varchar(100),
property_value varchar(4000),
source_file_name varchar(255),
etl_batch_id integer,
etl_row_id integer,
load_date timestamp
);
CREATE TABLE stg.stg_user_event_properties_hist_1
(
id bigint,
source varchar(20),
time_of_txn timestamp,
product varchar(50),
region varchar(50),
city varchar(100),
state varchar(100),
zip varchar(10),
price integer,
category varchar(50),
model varchar(50),
origin varchar(50),
l_code varchar(10),
d_name varchar(100),
d_id varchar(10),
medium varchar(255),
network varchar(255),
campaign varchar(255),
creative varchar(255),
event varchar(255),
property_name varchar(100),
property_value varchar(4000),
source_file_name varchar(255),
etl_batch_id integer,
etl_row_id varchar(20),
load_date timestamp
);
To repeat the differences: etl_row_id is varchar(20) in _1 but integer in the other table, and the first table has a sort key on the source column.
What would be the explanation for the size difference?
UPDATE:
The problem was both compression and sort keys. The _1 table, created with CTAS, had different compression settings on 11 of its 26 columns, and the first table had been created with a compound sort key of 14 columns. After recreating the first table with no sort keys (it's a history table, after all), its size went down to 231 GB.
I suspect that the larger table has different compression settings or no compression at all. You can use our view v_generate_tbl_ddl to generate table DDL that includes the compression settings.
Even with the same compression settings, table size can vary with different sort keys. The sort key is used to place the data into blocks on disk. If one sort key places lots of similar column values together, it will compress better and require less space.
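As a concrete starting point, a minimal sketch of pulling the generated DDL for comparison, assuming the v_generate_tbl_ddl view mentioned above has been installed from the amazon-redshift-utils repository into an admin schema:
-- Generated DDL includes the ENCODE setting for each column:
SELECT ddl
FROM admin.v_generate_tbl_ddl
WHERE schemaname = 'stg'
  AND tablename = 'stg_user_event_properties_hist'
ORDER BY seq;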
The sizes are different for these two tables because one table is being allocated more blocks than the other based on its sort keys. For your bigger table, the distribution is happening in such a way that the disk blocks are not fully occupied, so it needs more blocks to store the same amount of data.
This happens because of the 1 MB block size of Redshift and the way it stores data across slices and nodes. In general, data gets distributed across different nodes and slices based on the diststyle. For your case I am assuming this distribution is happening in a round-robin way: slice1 gets the first record, slice2 gets the second record, and so on. As the minimum block size is 1 MB for Redshift, every time a new record goes to a new slice, 1 MB gets allocated (even if the record only takes a few KBs). For subsequent records to the same slice, data goes into the same 1 MB block while it fits, after which a new 1 MB block gets allocated on the slice. But if there are no more records after the first record for a slice, it still occupies a full 1 MB block. The total size of the table is the sum of all blocks being occupied, irrespective of how much data is present in the blocks.
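To see this block accounting directly, here is a small sketch; SVV_TABLE_INFO reports size in 1 MB blocks, so the size column is effectively the block count:
-- Compare allocated blocks, row counts, and unsorted fraction:
SELECT "table", size AS size_mb, tbl_rows, unsorted
FROM svv_table_info
WHERE "table" LIKE 'stg_user_event_properties_hist%';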
The difference in table size could be due to the following reasons (a sketch of the first two checks follows below):
The encoding used for each column (query PG_TABLE_DEF).
The distribution key used for the table (query PG_TABLE_DEF).
Whether a vacuum has been performed on the table (query SVV_VACUUM_SUMMARY).
If I’ve made a bad assumption please comment and I’ll refocus my answer.
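A minimal sketch of the first two checks; note that PG_TABLE_DEF only lists tables in your search_path, so it may need to be set first:
SET search_path TO 'stg';
-- Encoding, distkey, and sortkey position for every column of both tables:
SELECT "column", type, encoding, distkey, sortkey
FROM pg_table_def
WHERE tablename IN ('stg_user_event_properties_hist',
                    'stg_user_event_properties_hist_1');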

How can I create a sequence in PostgreSQL to add nextval to an id column where the table already exists? [duplicate]

In a Postgres 9.3 table I have an integer primary key with an automatic sequence to increment it, but I have reached the maximum for integer. How do I convert it from integer to serial?
I tried:
ALTER TABLE my_table ALTER COLUMN id SET DATA TYPE bigint;
But the same does not work with the data type serial instead of bigint. Seems like I cannot convert to serial?
serial is a pseudo data type, not an actual data type. It's an integer underneath with some additional DDL commands executed automatically:
Create a SEQUENCE (with matching name by default).
Set the column NOT NULL and the default to draw from that sequence.
Make the column "own" the sequence.
Details:
Safely rename tables using serial primary key columns
A bigserial is the same, built around a bigint column. You want bigint, but you already achieved that. To transform an existing serial column into a bigserial (or smallserial), all you need to do is ALTER the data type of the column. Sequences are generally based on bigint, so the same sequence can be used for any integer type.
To "change" a bigint into a bigserial or an integer into a serial, you just have to do the rest by hand:
Creating a PostgreSQL sequence to a field (which is not the ID of the record)
The actual data type is still integer / bigint. Some clients like pgAdmin will display the data type serial in the reverse engineered CREATE TABLE script, if all criteria for a serial are met.
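For completeness, a minimal sketch of those manual steps, using the my_table / id names from the question; the sequence name is the one a serial would have generated:
-- 1. Create the sequence and tie its lifetime to the column:
CREATE SEQUENCE my_table_id_seq OWNED BY my_table.id;
-- 2. Start it after the current maximum id (the final false means the
--    next nextval() returns exactly this value):
SELECT setval('my_table_id_seq', COALESCE(max(id), 0) + 1, false) FROM my_table;
-- 3. Set NOT NULL and make the column draw its default from the sequence:
ALTER TABLE my_table
  ALTER COLUMN id SET NOT NULL,
  ALTER COLUMN id SET DEFAULT nextval('my_table_id_seq');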

Which data type to use to reference SERIAL data type in PostgreSQL?

I have two tables in PostgreSQL. The first one should have an auto-incrementing ID field that the second one references:
CREATE TABLE tableA (id SERIAL NOT NULL PRIMARY KEY, ...)
CREATE TABLE tableB (parent INTEGER NOT NULL REFERENCES tableA(id), ...)
According to the documentation, SERIAL acts as an unsigned 4-byte integer while INTEGER is signed:
serial    4 bytes   autoincrementing integer     1 to 2147483647
integer   4 bytes   typical choice for integer   -2147483648 to +2147483647
If I understand correctly, the data types that I have used are not compatible, but PostgreSQL apparently lacks unsigned integers. I know I probably won't use more than 2*10^9 IDs (and if I did, I could always use BIGSERIAL), and it's not all that important, but it seems a bit unclean to me to have a signed integer reference an unsigned one. I am sure there must be a better way - am I missing something?
A serial is an integer and it's not "unsigned". The sequence that is created automatically just happens to start at 1 - that's all. The column's data type is still an integer (you could make the sequence start at -2147483648 if you wanted to).
Quote from the manual:
CREATE TABLE tablename (
colname SERIAL
);
is equivalent to
CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;
(emphasis mine)
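To illustrate that last point, a small sketch (the sequence name is made up): the sequence behind a serial is an ordinary sequence and can start anywhere in the integer range, including the negative half:
CREATE SEQUENCE negative_start_seq MINVALUE -2147483648 START -2147483648;
SELECT nextval('negative_start_seq');  -- returns -2147483648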
