SERIAL pseudo-type in PostgreSQL?

Currently I'm using the serial data type for my project. I want to know the following details before changing previously created SEQUENCE columns to the SERIAL data type.
1. From which version onwards does PostgreSQL support the SERIAL data type? Also, if I change my id to SERIAL, will it affect my future code?
2. What is the max size of serial, and what is its impact?
3. Can ALTER SEQUENCE impact the sequence numbers?
4. Are there any drawbacks to the serial data type in the future?
5. How can I create a gapless sequence?

All answers can be found in the manual.
1. serial goes back to at least Postgres 7.2.
2. A serial column is an integer (bigserial is a bigint); the max size is documented in the manual. Also see this note in the documentation of CREATE SEQUENCE:
   Sequences are based on bigint arithmetic, so the range cannot exceed the range of an eight-byte integer (-9223372036854775808 to 9223372036854775807).
3. Obviously. As documented in the manual, that command can set minvalue or restart with a new value and manipulate many other properties that affect the number generation.
4. You should use identity columns instead.
5. Not possible - that's not what sequences are intended for. See e.g. here for a possible implementation of a gapless number generator.
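One common way to get gapless numbers, sketched here under assumptions and not necessarily the implementation linked above, is a counter table that is updated in the same transaction as the row that consumes the number. The names demo_counter and demo_invoice are hypothetical. The row lock taken by the UPDATE serializes concurrent writers, and a rollback undoes the increment, so no number is skipped. The price is that inserts queue up behind each other.

```sql
-- Hypothetical counter table: one row per gapless series.
CREATE TABLE demo_counter (
    name       text   PRIMARY KEY,
    last_value bigint NOT NULL
);
INSERT INTO demo_counter (name, last_value) VALUES ('invoice', 0);

-- Hypothetical table that consumes the gapless numbers.
CREATE TABLE demo_invoice (
    invoice_no bigint PRIMARY KEY,
    customer   text   NOT NULL
);

-- Take the next number and use it in one statement (and thus one transaction).
WITH next_no AS (
    UPDATE demo_counter
    SET    last_value = last_value + 1
    WHERE  name = 'invoice'
    RETURNING last_value
)
INSERT INTO demo_invoice (invoice_no, customer)
SELECT last_value, 'ACME' FROM next_no;
```

Gaps can still appear later if rows are deleted from demo_invoice, so this only guarantees gapless assignment, not gapless contents.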

Related

What type of code is used to create a sequence in PostgreSQL 13?

Which data type is currently used for generating sequences in Postgres 13?
Should I create a sequence separately, or use serial or an identity column?
Which is best, and why?
Also, is mapping between the data types easy?
As documented in the manual, the default data type is bigint if you don't specify a type.
The optional clause AS data_type specifies the data type of the sequence. [...] bigint is the default. The data type determines the default minimum and maximum values of the sequence.
(emphasis mine)
However, with a current Postgres version it is recommended to use identity columns instead of serial.
As documented in the manual, you can control the details (data type most importantly) of the underlying sequence when you declare an identity column.
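A minimal sketch of such a declaration, assuming Postgres 10 or later. The table name demo_item and the start value are made up for illustration. The column is declared as bigint, and the options in parentheses are passed on to the underlying sequence.

```sql
-- Hypothetical table with an identity column and explicit sequence options.
CREATE TABLE demo_item (
    id   bigint GENERATED ALWAYS AS IDENTITY (START WITH 1000 INCREMENT BY 1) PRIMARY KEY,
    name text NOT NULL
);

-- The id is filled in from the underlying sequence.
INSERT INTO demo_item (name) VALUES ('widget');
SELECT id, name FROM demo_item;
```

With GENERATED ALWAYS, an INSERT that supplies its own id is rejected unless it uses OVERRIDING SYSTEM VALUE; GENERATED BY DEFAULT allows explicit values, much like serial does.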

Is anybody using the SERIAL type instead of SEQUENCE in PostgreSQL 13?

Could you please help me choose between SERIAL and a sequence for my project?
Which is best for future use?
I'm using PostgreSQL 13 for the project. Can I use the SERIAL type?
There is no serial type. Using serial is more like a macro, which creates a column of type integer and a sequence to fill it. If you describe the table with \d, or dump the CREATE command of a table created with serial, you will see that it is not described or dumped as serial.
One thing that using serial does is mark the sequence as being owned by the column, so that it will get dropped if the column gets dropped. You can accomplish this after the fact by running something like this:
ALTER SEQUENCE public.x_x_seq OWNED BY public.x.x;
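To make the "macro" concrete, here is a sketch of what a serial column stands for, following the pattern given in the manual. The table name demo_order is hypothetical, and the AS integer clause assumes Postgres 10 or later.

```sql
-- Shorthand form (hypothetical table name):
--   CREATE TABLE demo_order (id serial);
-- Roughly equivalent long form:
CREATE SEQUENCE demo_order_id_seq AS integer;
CREATE TABLE demo_order (
    id integer NOT NULL DEFAULT nextval('demo_order_id_seq')
);
ALTER SEQUENCE demo_order_id_seq OWNED BY demo_order.id;

-- The ownership link can be checked with pg_get_serial_sequence():
SELECT pg_get_serial_sequence('demo_order', 'id');
```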
For new work happening in v13 (or back to v10) you might consider using generated always as identity instead.

PostgreSQL sequence connects to columns

I'm working on a database at the moment, and I can see there are loads of sequences. I was wondering how sequences link up to their corresponding columns in order to increment the value.
For example, if I create a new table with a column named ID, how would I apply a sequence to that column?
Typically, sequences are created implicitly, with a serial column or (alternatively) with an IDENTITY column in Postgres 10 or later. Details:
Auto increment table column
Sequences are separate objects internally and can be "owned" by a column, which happens automatically for the above examples. (But you can also have free-standing sequences.) They are incremented with the dedicated function nextval() that is used for the column default of above columns automatically. More sequence manipulation functions in the manual.
Details:
Safely and cleanly rename tables that use serial primary key columns in Postgres?
Or you can use ALTER SEQUENCE to manipulate various properties.
Privileges on sequences have to be changed explicitly for serial columns, while that happens implicitly for the newer IDENTITY columns.
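For the case in the question, applying a sequence to a column by hand, a minimal sketch could look like this. The names demo_tbl and demo_tbl_id_seq are hypothetical; the column default calls nextval(), and OWNED BY ties the sequence's lifetime to the column.

```sql
-- Hypothetical table whose id column starts out without any default.
CREATE TABLE demo_tbl (
    id      bigint NOT NULL,
    payload text
);

-- Free-standing sequence, then wired to the column as its default.
CREATE SEQUENCE demo_tbl_id_seq;
ALTER TABLE demo_tbl ALTER COLUMN id SET DEFAULT nextval('demo_tbl_id_seq');

-- Optional: let the column own the sequence so both are dropped together.
ALTER SEQUENCE demo_tbl_id_seq OWNED BY demo_tbl.id;

-- Rows inserted without an id now draw their value from the sequence.
INSERT INTO demo_tbl (payload) VALUES ('a'), ('b');
SELECT id, payload FROM demo_tbl ORDER BY id;
```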

Can a primary key in postgres have zero value?

There is one table in my database that has a row with an ID equal to 0 (zero).
The primary key is a serial column.
I'm used to seeing sequences start at 1. So, is there a problem if I keep this ID as zero?
The Serial data type creates integer columns which happen to auto-increment. Hence you should be able to add any integer value to the column (including 0).
From the docs
The type names serial and serial4 are equivalent: both create integer columns.
....(more about Serial) we have created an integer column and arranged for its default values to be assigned from a sequence generator
http://www.postgresql.org/docs/current/static/datatype-numeric.html#DATATYPE-SERIAL
This is presented as an answer because it’s too long for a comment.
You’re actually talking about two things here.
A primary key is a column designated to be the unique identifier for the table. There may be other unique columns, but the primary key is the one you have settled on, possibly because it’s the most stable value. (For example a customer’s email address is unique, but it’s subject to change, and it’s harder to manage).
The primary key can be any common data type, as long as it is guaranteed to be unique. In some cases, the primary key is a natural property of the row data, in which case it is a natural primary key.
In (most?) other cases, the primary key is an arbitrary value with no inherent meaning. In that case it is called a surrogate key.
The simplest surrogate key, the one which I like to call the lazy surrogate key, is a serial number. Technically, it’s not truly surrogate in that there is an inherent meaning in the sequence, but it is otherwise arbitrary.
For PostgreSQL, the data type typically associated with a serial number is integer, and this is implied in the SERIAL type. If you were doing this in MySQL/MariaDB, you might use unsigned integer, which doesn’t have negative values. PostgreSQL doesn’t have unsigned, so the data can indeed be negative.
The point about serial numbers is that they normally start at 1 and increment by 1. In PostgreSQL, you could have set up your own sequence manually (SERIAL is just a shortcut for that), in which case you can start with any value you like, such as 100, 0 or even -100 etc.
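As a sketch of the manual setup described above, the following starts a sequence at 0 and also stores an explicit negative id. The names demo_zero_seq and demo_zero are hypothetical.

```sql
-- An ascending sequence has a default MINVALUE of 1, so it must be lowered to start at 0.
CREATE SEQUENCE demo_zero_seq MINVALUE 0 START WITH 0;

CREATE TABLE demo_zero (
    id    integer PRIMARY KEY DEFAULT nextval('demo_zero_seq'),
    label text
);

-- The first generated id is the START value, 0.
INSERT INTO demo_zero (label) VALUES ('first generated row');

-- Explicit values bypass the sequence; negative integers are accepted.
INSERT INTO demo_zero (id, label) VALUES (-1, 'explicit negative id');

SELECT id, label FROM demo_zero ORDER BY id;
```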
To actually give an answer:
A primary key can have any compatible value you like, as long as it’s unique.
A serial number can also have any compatible value, but it is standard practice to start at 1, because that’s how we humans count.
Reasons to override the start-at-one principle include:
I sometimes use 0 as a sort of default if a valid row hasn’t been selected.
You might use negative ids to indicate non-standard data, such as for testing or for virtual values; for example a customer with a negative id might indicate an internal allocation.
You might start your real sequence from a higher number and use lower ids for something similar to the point above.
Note that modern versions of PostgreSQL have a preferred standard alternative in the form of GENERATED BY DEFAULT AS IDENTITY. In line with modern SQL trends, it is much more verbose, but it is much more manageable than the old SERIAL.

Create big integer from the big end of a uuid in PostgreSQL

I have a third-party application connecting to a view in my PostgreSQL database. It requires the view to have a primary key but can't handle the UUID type (which is the primary key for the view). It also can't handle the UUID as the primary key if it is served as text from the view.
What I'd like to do is convert the UUID to a number and use that as the primary key instead. However,
SELECT x'14607158d3b14ac0b0d82a9a5a9e8f6e'::bigint
Fails because the number is out of range.
So instead, I want to use SQL to take the big end of the UUID and create an int8 / bigint. I should clarify that maintaining order is 'desirable' but I understand that some of the order will change by doing this.
I tried:
SELECT x(substring(UUID::text from 1 for 16))::bigint
but the x operator for converting hex doesn't seem to like brackets. I abstracted it into a function but
SELECT hex_to_int(substring(UUID::text from 1 for 16))::bigint
still fails.
How can I get a bigint from the 'big end' half of a UUID?
Fast and without dynamic SQL
Cast the leading 16 hex digits of a UUID in text representation as bitstring bit(64) and cast that to bigint. See:
Convert hex in text representation to decimal number
Conveniently, excess hex digits to the right are truncated in the cast to bit(64) automatically - exactly what we need.
Postgres accepts various formats for input. Your given string literal is one of them:
14607158d3b14ac0b0d82a9a5a9e8f6e
The default text representation of a UUID (and the text output in Postgres for data type uuid) adds hyphens at predefined places:
14607158-d3b1-4ac0-b0d8-2a9a5a9e8f6e
The manual:
A UUID is written as a sequence of lower-case hexadecimal digits, in
several groups separated by hyphens, specifically a group of 8 digits
followed by three groups of 4 digits followed by a group of 12 digits,
for a total of 32 digits representing the 128 bits.
If input format can vary, strip hyphens first to be sure:
SELECT ('x' || translate(uuid_as_string, '-', ''))::bit(64)::bigint;
Cast actual uuid input with uuid::text.
db<>fiddle here
Note that Postgres uses signed integer, so the bigint overflows to negative numbers in the upper half - which should be irrelevant for this purpose.
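The cast can be wrapped in a small helper for use in the view. This is only a sketch: the function name uuid_high_bigint is hypothetical, and it assumes the input is of type uuid.

```sql
-- Hypothetical helper: high 64 bits of a uuid as a signed bigint.
CREATE FUNCTION uuid_high_bigint(u uuid) RETURNS bigint
LANGUAGE sql IMMUTABLE STRICT AS
$$
    -- Strip the hyphens, prefix 'x', keep the leading 64 bits, reinterpret as bigint.
    SELECT ('x' || translate(u::text, '-', ''))::bit(64)::bigint
$$;

-- A leading hex digit of 8 or above sets the sign bit, so the second value is negative.
SELECT uuid_high_bigint('14607158-d3b1-4ac0-b0d8-2a9a5a9e8f6e') AS leading_bit_clear,
       uuid_high_bigint('f4607158-d3b1-4ac0-b0d8-2a9a5a9e8f6e') AS leading_bit_set;
```

Two UUIDs that share their first 16 hex digits map to the same bigint, so the result is not guaranteed to be unique.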
DB design
If at all possible add a bigserial column to the underlying table and use that instead.
This is all very shaky, both the problem and the solution you describe in your self-answer.
First, a mismatch between a database design and a third-party application is always possible, but usually indicative of a deeper problem. Why does your database use the uuid data type as a PK in the first place? They are not very efficient compared to a serial or a bigserial. Typically you would use a UUID if you are working in a distributed environment where you need to "guarantee" uniqueness over multiple installations.
Secondly, why does the application require the PK to begin with (incidentally: views do not have a PK, the underlying tables do)? If it is only to view the data then a PK is rather useless, particularly if it is based on a UUID (and there is thus no conceivable relationship between the PK and the rest of the tuple). If it is used to refer to other data in the same database or do updates or deletes of existing data, then you need the exact UUID and not some extract of it because the underlying table or other relations in your database would have the exact UUID. Of course you can convert all UUIDs with the same hex_to_int() function, but that leads straight back to my point above: why use UUIDs in the first place?
Thirdly, do not mess around with things you have little or no knowledge of. This is not intended to be offensive, take it as well-meant advice (look around on the internet for programmers who tried to improve on cryptographic algorithms or random number generation by adding their own twists of obfuscation; quite entertaining reads). There are 5 algorithms for generating UUIDs in the uuid-ossp module and while you know or can easily find out which algorithm is used in your database (the uuid_generate_vX() functions in your table definitions, most likely), do you know how the algorithm works? The claim of practical uniqueness of a UUID is based on its 128 bits, not a 64-bit extract of it. Are you certain that the high 64 bits are random? My guess is that 64 consecutive bits are less random than the "square root of the randomness" (for lack of a better way to phrase the theoretical drop in periodicity of a 64-bit number compared to a 128-bit number) of the full UUID. Why? Because all but one of the algorithms are made up of randomized blocks of otherwise non-random input (such as the MAC address of a network interface, which is always the same on a machine generating millions of UUIDs). Had 64 bits been enough for randomized value uniqueness, then a UUID would have been that long.
What a better solution would be in your case is hard to say, because it is unclear what the third-party application does with the data from your database and how dependent it is on the uniqueness of the "PK" column in the view. An approach that is likely to work if the application does more than trivially display the data without any further use of the "PK" would be to associate a bigint with every retrieved uuid in your database in a (temporary) table and include that bigint in your view by linking on the uuids in your (temporary) tables. Since you cannot trigger on SELECT statements, you would need a function to generate the bigint for every uuid the application retrieves. On updates or deletes on the underlying tables of the view or upon selecting data from related tables, you look up the uuid corresponding to the bigint passed in from the application. The lookup table and function would look somewhat like this:
CREATE TEMPORARY TABLE temp_table(
    tempint bigserial PRIMARY KEY,
    internal_uuid uuid);
CREATE INDEX ON temp_table(internal_uuid);
CREATE FUNCTION temp_int_for_uuid(pk uuid) RETURNS bigint AS $$
DECLARE
    id bigint;
BEGIN
    SELECT tempint INTO id FROM temp_table WHERE internal_uuid = pk;
    IF NOT FOUND THEN
        INSERT INTO temp_table(internal_uuid) VALUES (pk)
        RETURNING tempint INTO id;
    END IF;
    RETURN id;
END; $$ LANGUAGE plpgsql STRICT;
Not pretty, not efficient, but fool-proof.
Cast to bit(64) to parse a decimal number from a hex literal built from a substring of the UUID (with the hyphens of the text form stripped first):
select ('x' || substr(replace(UUID::text, '-', ''), 1, 16))::bit(64)::bigint
See SQLFiddle
Solution found.
UUID::text will return a string with hyphens. In order for substring(UUID::text from 1 for 16) to create a string that x can parse as hex the hyphens need to be stripped first.
The final query looks like this (my_table stands in for the actual table name):
SELECT hex_to_int(substring(replace(id::text, '-', '') from 1 for 16))::bigint FROM my_table;
The hex_to_int function needs to be able to handle a bigint, not just int. It looks like:
CREATE OR REPLACE FUNCTION hex_to_int(hexval character varying)
RETURNS bigint AS
$BODY$
DECLARE
    result bigint;
BEGIN
    -- Dynamic SQL: builds and runs SELECT x'<hexval>'::bigint
    EXECUTE 'SELECT x''' || hexval || '''::bigint' INTO result;
    RETURN result;
END;
$BODY$
LANGUAGE plpgsql;