Is it possible to declare a serial field in Postgres (9.0) which will increment based on a pattern?
For example:
Pattern: YYYY-XXXXX
where YYYY is a year, and XXXXX increments from 00000 - 99999.
Or should I just use a trigger?
EDIT: I prefer the year to be auto-determined based, maybe, on server date. The XXXXX part does start with 00000 for each year and "resets" to 00000 then increments again to 99999 when the year part is modified.
I would create a separate SEQUENCE for each year, so that each sequence keeps track of one year - even after that year is over, should you need more unique IDs for that year later.
This function does it all:
Improved with input from @Igor and @Clodoaldo in the comments.
CREATE OR REPLACE FUNCTION f_year_id(y text = to_char(now(), 'YYYY'))
  RETURNS text AS
$func$
BEGIN
   LOOP
      BEGIN
         RETURN y || '-' || to_char(nextval('year_' || y || '_seq'), 'FM00000');
      EXCEPTION WHEN undefined_table THEN   -- error code 42P01
         EXECUTE 'CREATE SEQUENCE year_' || y || '_seq MINVALUE 0 START 0';
      END;
   END LOOP;
END
$func$ LANGUAGE plpgsql VOLATILE;
Call:
SELECT f_year_id();
Returns:
2013-00000
Basically this returns a text of your requested pattern. Automatically tailored for the current year. If a sequence of the name year_<year>_seq does not exist yet, it is created automatically and nextval() is retried.
Note that you cannot have an overloaded function without parameter at the same time (like my previous example), or Postgres will not know which to pick and throw an exception in despair.
Use this function as DEFAULT value in your table definition:
CREATE TABLE tbl (id text DEFAULT f_year_id(), ...)
Or you can get the next value for a year of your choice:
SELECT f_year_id('2012');
Tested in Postgres 9.1. Should work in v9.0 or v9.2 just as well.
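As a quick sanity check, the following sketch wires the function into a table and inserts a few rows. The table name invoice and its note column are made up for illustration; the ids you actually get depend on the current year and on how far each sequence has already advanced.

```sql
-- Hypothetical table; only f_year_id() comes from the answer above.
CREATE TABLE invoice (
    id   text PRIMARY KEY DEFAULT f_year_id(),
    note text
);

-- Two rows for the current year: the counter part increases by one each time.
INSERT INTO invoice (note) VALUES ('first'), ('second');

-- A row for an explicit year draws from that year's own sequence.
INSERT INTO invoice (id, note) VALUES (f_year_id('2012'), 'backdated');

SELECT id, note FROM invoice ORDER BY id;
```

Keep in mind that sequence values are not rolled back, so a failed or aborted insert leaves a gap in the numbering.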
To understand what's going on here, read these chapters in the manual:
CREATE FUNCTION
CREATE SEQUENCE
39.6.3. Simple Loops
39.5.4. Executing Dynamic Commands
39.6.6. Trapping Errors
Appendix A. PostgreSQL Error Codes
Table 9-22. Template Pattern Modifiers for Date/Time Formatting
You can create a function that will form this value (YYYY-XXXXX) and set this function as a default for a column.
Details here.
CREATE TABLE Person (
id serial primary key,
accNum text UNIQUE GENERATED ALWAYS AS (
concat(right(cast(extract(year from current_date) as text), 2), cast(id as text))) STORED
);
Error: generation expression is not immutable
The goal is to populate the accNum field with YYid, where YY is the last two digits of the year when the person was added.
I also tried the '||' operator but it was unsuccessful.
As you don't expect the column to be updated, when the row is changed, you can define your own function that generates the number:
create function generate_acc_num(id int)
returns text
as
$$
select to_char(current_date, 'YY')||id::text;
$$
language sql
immutable; --<< this is lying to Postgres!
Note that you should never use this function for any other purpose. Especially not as an index expression.
Then you can use that in a generated column:
CREATE TABLE Person
(
id integer generated always as identity primary key,
acc_num text UNIQUE GENERATED ALWAYS AS (generate_acc_num(id)) STORED
);
As @ScottNeville correctly mentioned:
CURRENT_DATE is not immutable, so it cannot be used in a GENERATED ALWAYS AS expression.
However, you can achieve this using a trigger nevertheless:
CREATE FUNCTION accnum_trigger_function()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS $$
BEGIN
NEW.accNum := right(extract(year from current_date)::text, 2) || NEW.id::text;
RETURN NEW;
END
$$;
CREATE TRIGGER tr_accnum
BEFORE INSERT
ON person
FOR EACH ROW
EXECUTE PROCEDURE accnum_trigger_function();
As @a_horse_with_no_name mentioned correctly in the comments, you can simplify the expression to:
NEW.accNum := to_char(current_date, 'YY') || NEW.id;
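For the trigger approach, accNum has to be an ordinary column rather than a generated one, because values assigned to a generated column in a BEFORE trigger are discarded. A minimal end-to-end sketch, assuming a fresh database and reusing the names from this answer (with the simplified expression):

```sql
-- accNum is declared as a plain text column; the trigger fills it in.
CREATE TABLE person (
    id     serial PRIMARY KEY,
    accNum text UNIQUE
);

CREATE FUNCTION accnum_trigger_function()
  RETURNS trigger
  LANGUAGE plpgsql
AS $$
BEGIN
    -- Column defaults (including the serial id) are already applied here.
    NEW.accNum := to_char(current_date, 'YY') || NEW.id;
    RETURN NEW;
END
$$;

CREATE TRIGGER tr_accnum
  BEFORE INSERT ON person
  FOR EACH ROW
  EXECUTE PROCEDURE accnum_trigger_function();

-- RETURNING shows the row as modified by the trigger:
-- accNum is the two-digit current year followed by the id.
INSERT INTO person DEFAULT VALUES
RETURNING id, accNum;
```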
I am not exactly sure how to solve this problem (maybe a trigger), but current_date is a stable function, not an immutable one. For generated columns, I believe all function calls must be immutable. You can read more here: https://www.postgresql.org/docs/current/xfunc-volatility.html
I don't think any function that gets the date can be immutable, as Postgres defines this as "An IMMUTABLE function cannot modify the database and is guaranteed to return the same results given the same arguments forever." This will not be true for anything that returns the current date.
I think your best bet would be to do this with a trigger, so that it sets the value on insert.
I am trying to convert an oracle stored procedure to Postgres function/procedure.
I did some research and read many forums to prepare the syntax for the stored procedure in postgres.
But getting an error for declaring an integer variable.
My code is like below:
The purpose of my procedure is to load one-month records into another month (EX: load Jan 2020 data into March 2020)
Postgres Procedure:
CREATE OR REPLACE FUNCTION Corporate.copy_forecast(code OUT integer, message OUT VARCHAR)
LANGUAGE plpgsql
AS $$
DECLARE
v_current_month integer;
v_previous_month integer;
begin
select max(cycleid) into v_previous_month, max(cycleid)+1 into v_current_month from Corporate.forecast;
INSERT INTO Corporate.forecast
(SELECT v_current_month,lob,delivery,forecast_val
FROM Corporate.forecast
WHERE month= v_previous_month);
code:=1;
message:='Sucussfully loaded previous month forecast to current month';
exception
when others then
code:=0;
message:='Failed';
END;
$$
Please help me to fix the above procedure.
There are several problems with your code:
The first one is simple: you say LANGUAGE sql, but you write PL/pgSQL code.
That explains the error message you get. Use LANGUAGE plpgsql if you want to write PL/pgSQL.
You are using variables that are the same as column names, which leads to ambiguity. For example, you declare
current_month integer;
previous_month integer;
but you have a WHERE clause
WHERE current_month = previous_month
where obviously one of them should refer to a variable and the other to a table column. That will not work and cause errors.
The best and simplest solution is to always use variable names that are different from column names. A simple method is to start all variable names with v_.
A second option is to always qualify columns with the table name and variables with the function name.
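A minimal sketch of the second option, under assumptions: the forecast table layout and the function name forecast_rows are invented for illustration. Note that the function name qualifies parameters; a variable declared in a DECLARE section would need a block label instead.

```sql
-- Assumed layout, for illustration only.
CREATE TABLE IF NOT EXISTS forecast (
    cycleid      integer,
    lob          text,
    delivery     text,
    forecast_val numeric
);

-- The parameter deliberately has the same name as the column.
CREATE OR REPLACE FUNCTION forecast_rows(cycleid integer)
  RETURNS bigint
  LANGUAGE plpgsql STABLE
AS $$
BEGIN
    RETURN (
        SELECT count(*)
        FROM   forecast AS f
        -- f.cycleid is the column, forecast_rows.cycleid is the parameter
        WHERE  f.cycleid = forecast_rows.cycleid
    );
END
$$;

SELECT forecast_rows(1);
```

Written as WHERE cycleid = cycleid, the same function would fail at run time with an ambiguous column reference error, which is the problem described above.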
Your problem is not declaring an integer variable; it is much larger: you have structural problems. Below I indicate those issues and then show corrections. The indicator "--<< ..." discusses the line(s) just above it.
CREATE OR REPLACE FUNCTION copy_forecast(code OUT integer, message OUT VARCHAR)
--<< Unusual order. The documented form is (OUT code integer, OUT message varchar)
--<< and while not invalid, IMHO bad design: a function/procedure should just do its job correctly or raise an exception
returns ???
--<< No RETURNS clause. That is allowed only because of the OUT parameters; otherwise a Postgres function MUST declare what it returns (RETURNS VOID if nothing). But I guess this was an Oracle procedure.
LANGUAGE plpgsql
AS $$
DECLARE
v_current_month integer;
v_previous_month integer;
begin
select max(cycleid) into v_previous_month, max(cycleid)+1 into v_current_month from forecast;
--<< Invalid format Should Be select var1,var2 into local1, local2 ...
INSERT INTO forecast
--<< Very dangerous. If table ever changes this will fail.
(SELECT v_current_month,lob,delivery,forecast_val
FROM forecast
WHERE month= v_previous_month);
code:=1;
message:='Successfully loaded previous month forecast to current month';
exception
when others then
code:=0;
message:='Failed'
--<< Very dangerous (all 3 lines). When an error occurs you will never know what it is. See insert above
;
END;
$$;
The following corrects the errors indicated above. Again, the indicator (--*) discusses the line(s) above it.
create or replace function copy_forecast()
  returns boolean
  language plpgsql
as $$
declare
    v_current_month  integer;
    v_previous_month integer;
begin
    select max(cycleid)
         , max(cycleid) + 1
      into v_previous_month
         , v_current_month
      from forecast;

    insert into forecast (cycleid, lob, delivery, forecast_val)
    select v_current_month, lob, delivery, forecast_val
      from forecast
     where month = v_previous_month;
    --* insert will fail; I cannot resolve it. MONTH is not inserted, but must exist on the table.
    --* It seems you are using cycleid and month as synonyms

    return true;
    --* you can keep the out parameters if desired. Set them before the return
exception
    when others then
        -- Log the error and debug information here
        return false;
        --* you can keep the out parameters if desired. Set them before the return.
end;
$$;
This should correct the structural issues with your function. However, the logical issue with month/cycleid remains.
It also does not actually get you to your goal. As stated, "the purpose of my procedure is to load one-month records into another month (EX: load Jan 2020 data into March 2020)". This function CANNOT do that. It can only copy the latest month/cycleid to the next month/cycleid; so Jan->Feb, Feb->Mar, ... Nov->Dec. But Dec->Jan would fail unless month/cycleid can be 13, and subsequent months would continue increasing month/cycleid. Accomplishing your goal requires INPUT parameters for the source (from) and target (to). And as @LaurenzAlbe said, "no moving targets". That would be another question.
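For orientation only, here is a rough sketch of what a version with input parameters could look like. It rests on guesses: that cycleid alone identifies the month, that the table has exactly the columns listed, and that the name copy_forecast_between and the cycle numbers in the call are acceptable placeholders. It deliberately has no WHEN OTHERS handler, so any error surfaces to the caller.

```sql
-- Assumed layout, for illustration only.
CREATE TABLE IF NOT EXISTS forecast (
    cycleid      integer,
    lob          text,
    delivery     text,
    forecast_val numeric
);

-- Hypothetical function: copy all rows of one cycle into another cycle.
CREATE OR REPLACE FUNCTION copy_forecast_between(p_source integer, p_target integer)
  RETURNS integer
  LANGUAGE plpgsql
AS $$
DECLARE
    v_rows integer;
BEGIN
    INSERT INTO forecast (cycleid, lob, delivery, forecast_val)
    SELECT p_target, f.lob, f.delivery, f.forecast_val
    FROM   forecast AS f
    WHERE  f.cycleid = p_source;

    -- Report how many rows were copied instead of a success flag.
    GET DIAGNOSTICS v_rows = ROW_COUNT;
    RETURN v_rows;
END
$$;

-- Example call with made-up cycle ids.
SELECT copy_forecast_between(1, 3);
```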
My idea is to implement a basic «vector clock», where timestamps are clock-based, always go forward and are guaranteed to be unique.
For example, in a simple table:
CREATE TABLE IF NOT EXISTS timestamps (
last_modified TIMESTAMP UNIQUE
);
I use a trigger to set the timestamp value before insertion. It basically just goes into the future when two inserts arrive at the same time:
CREATE OR REPLACE FUNCTION bump_timestamp()
RETURNS trigger AS $$
DECLARE
previous TIMESTAMP;
current TIMESTAMP;
BEGIN
previous := NULL;
SELECT last_modified INTO previous
FROM timestamps
ORDER BY last_modified DESC LIMIT 1;
current := clock_timestamp();
IF previous IS NOT NULL AND previous >= current THEN
current := previous + INTERVAL '1 milliseconds';
END IF;
NEW.last_modified := current;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS tgr_timestamps_last_modified ON timestamps;
CREATE TRIGGER tgr_timestamps_last_modified
BEFORE INSERT OR UPDATE ON timestamps
FOR EACH ROW EXECUTE PROCEDURE bump_timestamp();
I then run a massive amount of insertions in two separate clients:
DO
$$
BEGIN
FOR i IN 1..100000 LOOP
INSERT INTO timestamps DEFAULT VALUES;
END LOOP;
END;
$$;
As expected, I get collisions:
ERROR: duplicate key value violates unique constraint "timestamps_last_modified_key"
SQL state: 23505
Detail: Key (last_modified)=(2016-01-15 18:35:22.550367) already exists.
Context: SQL statement "INSERT INTO timestamps DEFAULT VALUES"
PL/pgSQL function inline_code_block line 4 at SQL statement
@rach suggested mixing clock_timestamp() with a SEQUENCE object, but that would probably imply getting rid of the TIMESTAMP type. Even so, I can't really figure out how it would solve the isolation problem...
Is there a common pattern to avoid this?
Thank you for your insights :)
If you have only one Postgres server, as you said, I think that using timestamp + sequence can solve the problem, because sequences are non-transactional and respect the insert order.
If you have DB shards, it will be much more complex. The distributed sequences of 2ndQuadrant's BDR might help, but I don't think that ordinality will be respected. I added some code below in case you have a setup to test it.
CREATE SEQUENCE "timestamps_seq";
-- Let's test first, how to generate id.
SELECT extract(epoch from now())::bigint::text || LPAD(nextval('timestamps_seq')::text, 20, '0') as unique_id ;
unique_id
--------------------------------
145288519200000000000000000010
(1 row)
CREATE TABLE IF NOT EXISTS timestamps (
unique_id TEXT UNIQUE NOT NULL DEFAULT extract(epoch from now())::bigint::text || LPAD(nextval('timestamps_seq')::text, 20, '0')
);
INSERT INTO timestamps DEFAULT VALUES;
INSERT INTO timestamps DEFAULT VALUES;
INSERT INTO timestamps DEFAULT VALUES;
select * from timestamps;
unique_id
--------------------------------
145288556900000000000000000001
145288557000000000000000000002
145288557100000000000000000003
(3 rows)
Let me know if that works. I'm not a DBA so maybe it will be good to ask on dba.stackexchange.com too about the potential side effect.
My two cents (Inspired from http://tapoueh.org/blog/2013/03/15-batch-update).
Try adding the following, in the same transaction, before the massive amount of insertions:
LOCK TABLE timestamps IN SHARE MODE;
Official documentation is here: http://www.postgresql.org/docs/current/static/sql-lock.html
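A sketch of how the lock could wrap the bulk insert, reusing the timestamps table and trigger from the question. Two points are my own additions and should be read as assumptions about what was intended: LOCK TABLE has to run inside a transaction block, and the sketch uses SHARE ROW EXCLUSIVE instead of SHARE. SHARE does not conflict with itself, so two sessions could both acquire it and then block each other's inserts (INSERT takes ROW EXCLUSIVE, which conflicts with SHARE), ending in a deadlock.

```sql
BEGIN;

-- SHARE ROW EXCLUSIVE conflicts with itself and with the ROW EXCLUSIVE lock
-- taken by INSERT, so only one session at a time gets past this line.
LOCK TABLE timestamps IN SHARE ROW EXCLUSIVE MODE;

DO
$$
BEGIN
    FOR i IN 1..100000 LOOP
        INSERT INTO timestamps DEFAULT VALUES;
    END LOOP;
END
$$;

COMMIT;   -- releases the lock; the next waiting session proceeds
```

The price is that writers are fully serialized for the duration of each transaction, so this only suits workloads where that is acceptable.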
I am inserting a lot of measurement data from different sources in a postgres database. The data are a measured value and an uncertainty (and a lot of auxiliary data). The problem is that in some cases I get an absolute error value, e.g. 123 +/- 33; in other cases, I get a relative error as a percentage of the measured value, e.g. 123 +/- 10%. I would like to store all the measurements with absolute error, i.e. the latter should be stored as 123 +/- 12.3 (at this point, I don't care too much about the valid number of digits).
My idea is to use a trigger to do this. Basically, if the error is numeric, store it as is; if it is non-numeric, check if the last character is '%' and in that case multiply it with the measured value, divide by 100 and store the resulting value. I got an isnumeric function from here: isnumeric() with PostgreSQL, which works fine. But when I try to make this into a trigger, it seems as if the input is checked for validity even before the trigger fires, so that the insert is aborted before I get any possibility to do anything with the values.
My trigger function (it still needs to do the calculation; it just sets the error to 0 here):
create function my_trigger_function()
returns trigger as'
begin
if not isnumeric(new.err) THEN
new.err=0;
end if;
return new;
end' language 'plpgsql';
then I connect it to the table:
create trigger test_trigger
before insert on test
for each row
execute procedure my_trigger_function();
Doing this, I would expect to get val=123 and err=0 for the following insert
insert into test(val,err) values(123,'10%');
but the insert fails with "invalid input syntax for type numeric" which then must be triggered before my trigger gets any possibility to see the data (or I have misunderstood something basic). Is it possible to make the new.err data-type agnostic or can I run the trigger even earlier or is what I want to do just plain impossible?
It's not possible with a trigger because the SQL parser fails before.
When the trigger is launched, the NEW.* columns already have their definitive types matching the destination columns.
The closest alternative is to provide a function converting from text to numeric implementing your custom syntax rules and apply it in the VALUES clause:
insert into test(val,err) values(123, custom_convert('10%'));
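What such a conversion function could look like is sketched below. The name abs_error and its two-argument shape are hypothetical (the measured value is passed in as well, since a percentage cannot be turned into an absolute error without it); it is not the custom_convert from the line above.

```sql
-- Hypothetical helper: turn '33' or '10%' into an absolute error.
CREATE OR REPLACE FUNCTION abs_error(p_val numeric, p_err text)
  RETURNS numeric
  LANGUAGE plpgsql IMMUTABLE
AS $$
BEGIN
    IF right(btrim(p_err), 1) = '%' THEN
        -- relative error: strip the percent sign and scale by the value
        RETURN p_val * rtrim(btrim(p_err), '%')::numeric / 100;
    END IF;
    -- absolute error: plain cast, which raises an error for malformed input
    RETURN btrim(p_err)::numeric;
END
$$;

-- 12.3 (shown with trailing zeros) and 33
SELECT abs_error(123, '10%') AS relative_case,
       abs_error(123, '33')  AS absolute_case;
```

In the insert it would be used as values(123, abs_error(123, '10%')).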
Daniel answered on my original question - and I found out I had to think otherwise. His proposal for how to do it may work for others, but the way my system interfaces to the database by fetching table and column names directly from the database, it would not work well.
Instead I added a boolean field relerr to the measurement table
alter table measure add relerr boolean default false;
Then I made a trigger that checks if relerr is true - indicating that I am trying to store a relative error, if so, it recalculates the error column (called prec for precision)
CREATE FUNCTION calc_fromrel_error()
RETURNS trigger as'
BEGIN
    IF NEW.relerr THEN
        NEW.prec=NEW.prec*NEW.value/100;
        NEW.relerr=FALSE;
    END IF;
    return NEW;
END' language 'plpgsql';
and then
create trigger meas_calc_relerr_trigger
before insert on measure
for each row
execute procedure calc_fromrel_error();
voila, by doing a
INSERT into measure (value,prec,relerr) values(220,10,true);
I get the table populated with 220,22,false. Inserted values should normally never be updated, if that for some strange reason should happen, I will be able to calculate the prec column manually.
Say I have a table like posts, which has typical columns like id, body, created_at. I'd like to generate a unique string with the creation of each post, for use in something like a url shortener. So maybe a 10-character alphanumeric string. It needs to be unique within the table, just like a primary key.
Ideally there would be a way for Postgres to handle both of these concerns:
generate the string
ensure its uniqueness
And they must go hand-in-hand, because my goal is to not have to worry about any uniqueness-enforcing code in my application.
I don't claim the following is efficient, but it is how we have done this sort of thing in the past.
CREATE FUNCTION make_uid() RETURNS text AS $$
DECLARE
new_uid text;
done bool;
BEGIN
done := false;
WHILE NOT done LOOP
new_uid := md5(''||now()::text||random()::text);
done := NOT exists(SELECT 1 FROM my_table WHERE uid=new_uid);
END LOOP;
RETURN new_uid;
END;
$$ LANGUAGE PLPGSQL VOLATILE;
make_uid() can be used as the default for a column in my_table. Something like:
ALTER TABLE my_table ADD COLUMN uid text NOT NULL DEFAULT make_uid();
md5(''||now()::text||random()::text) can be adjusted to taste. You could consider encode(...,'base64') except some of the characters used in base-64 are not URL friendly.
All existing answers are WRONG because they rely on a SELECT to check uniqueness while generating the code for each record. Assume we need a unique code per record at insert time: imagine two concurrent INSERTs happening at the same moment (which happens more often than you think). Both can generate the same code, because at the moment of the SELECT that code did not yet exist in the table. One INSERT will succeed and the other will fail.
First let us create table with code field and add unique index
CREATE TABLE my_table
(
code TEXT NOT NULL
);
CREATE UNIQUE INDEX ON my_table (lower(code));
Then we should have function or procedure (you can use code inside for trigger also) where we 1. generate new code, 2. try to insert new record with new code and 3. if insert fails try again from step 1
CREATE OR REPLACE PROCEDURE my_table_insert()
AS $$
DECLARE
    new_code TEXT;
BEGIN
    LOOP
        new_code := LOWER(SUBSTRING(MD5('' || NOW()::TEXT || RANDOM()::TEXT) FOR 8));
        BEGIN
            INSERT INTO my_table (code) VALUES (new_code);
            EXIT;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- code already taken: loop again and try a new one
        END;
    END LOOP;
END;
$$ LANGUAGE PLPGSQL;
This solution is guaranteed to be free of that error, unlike the other solutions in this thread.
Use a Feistel network. This technique works efficiently to generate unique random-looking strings in constant time without any collision.
For a version with about 2 billion possible strings (2^31) of 6 letters, see this answer.
For a 63 bits version based on bigint (9223372036854775808 distinct possible values), see this other answer.
You may change the round function as explained in the first answer to introduce a secret element to have your own series of strings (not guessable).
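Since the linked answers are not reproduced here, the following is only a sketch of the idea: a three-round Feistel permutation over 32-bit integers. The function name feistel_int is made up and the round-function constants are illustrative choices, not necessarily those of the linked answers. Any deterministic round function that maps into 0..32767 keeps the construction collision-free, because each round can be undone.

```sql
-- Hypothetical name; three Feistel rounds over the two 16-bit halves of an int.
CREATE OR REPLACE FUNCTION feistel_int(p_value int)
  RETURNS int
  LANGUAGE plpgsql STRICT IMMUTABLE
AS $$
DECLARE
    l1 int := (p_value >> 16) & 65535;   -- upper 16 bits
    r1 int := p_value & 65535;           -- lower 16 bits
    l2 int;
    r2 int;
BEGIN
    FOR i IN 1..3 LOOP
        l2 := r1;
        -- Round function: change these constants to get your own series.
        r2 := l1 # ((((1366 * r1 + 150889) % 714025) / 714025.0) * 32767)::int;
        l1 := l2;
        r1 := r2;
    END LOOP;
    RETURN (r1 << 16) + l1;   -- may be negative when the top bit is set
END
$$;

-- Distinct inputs always give distinct outputs.
SELECT i, feistel_int(i) FROM generate_series(1, 5) AS i;
```

Fed with values from a sequence, the output is unique without any lookup in the table; turning the integer into a short string is then a separate encoding step.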
The easiest way is probably to use a sequence to guarantee uniqueness
(so append a fixed x-digit random number after the sequence value):
CREATE SEQUENCE test_seq;
CREATE TABLE test_table (
id bigint NOT NULL DEFAULT (nextval('test_seq')::text || (LPAD(floor(random()*100000000)::text, 8, '0')))::bigint,
txt TEXT
);
insert into test_table (txt) values ('1');
insert into test_table (txt) values ('2');
select id, txt from test_table;
However, this wastes a huge part of the number range. (Note: the maximum bigint is 9223372036854775807, so with an 8-digit random number at the end you can only have about 92233720368 records. Though 8 digits are probably not necessary. Also check the maximum number for your programming environment!)
Alternatively you can use varchar for the id and even convert the above number with to_hex() or change to base36 like below (but for base36, try to not expose it to customer, in order to avoid some funny string showing up!):
PostgreSQL: Is there a function that will convert a base-10 int into a base-36 string?
Check out a blog by Bruce. This gets you part way there. You will have to make sure it doesn't already exist. Maybe concat the primary key to it?
Generating Random Data Via Sql
"Ever need to generate random data? You can easily do it in client applications and server-side functions, but it is possible to generate random data in sql. The following query generates five lines of 40-character-length lowercase alphabetic strings:"
SELECT
(
SELECT string_agg(x, '')
FROM (
SELECT chr(ascii('a') + floor(random() * 26)::integer)
FROM generate_series(1, 40 + b * 0)
) AS y(x)
)
FROM generate_series(1,5) as a(b);
Use primary key in your data. If you really need alphanumeric unique string, you can use base-36 encoding. In PostgreSQL you can use this function.
Example:
select base36_encode(generate_series(1000000000,1000000010));
GJDGXS
GJDGXT
GJDGXU
GJDGXV
GJDGXW
GJDGXX
GJDGXY
GJDGXZ
GJDGY0
GJDGY1
GJDGY2
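The function referred to above is not included in this text. A minimal stand-in with the same name, written under the assumption that it takes a non-negative integer and returns upper-case base-36 digits, could look like this:

```sql
-- Hypothetical stand-in for the linked base36_encode function.
CREATE OR REPLACE FUNCTION base36_encode(p_num bigint)
  RETURNS text
  LANGUAGE plpgsql IMMUTABLE STRICT
AS $$
DECLARE
    digits constant text := '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    n      bigint := p_num;
    result text   := '';
BEGIN
    IF n < 0 THEN
        RAISE EXCEPTION 'base36_encode: negative input %', n;
    END IF;
    LOOP
        -- prepend the digit for the lowest base-36 position
        result := substr(digits, (n % 36)::int + 1, 1) || result;
        n := n / 36;              -- integer division
        EXIT WHEN n = 0;
    END LOOP;
    RETURN result;
END
$$;

SELECT base36_encode(1000000000);   -- GJDGXS
```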