Creating a non-numeric Sequence in Postgres - postgresql

I need to create a sequence in Postgres that generates a unique string code for each new row. The code should increment by one per row and follow a six-character pattern.
For instance,
AC0001
AC0040
AC0201
AC3421
where the first two characters are letters and the remaining four are digits.
I have created a sequence first,
CREATE SEQUENCE code_sequence START WITH 1
INCREMENT BY 1
CACHE 1;
Then, created a table,
CREATE TABLE account
(
code VARCHAR NOT NULL DEFAULT 'AC'||nextval('code_sequence'::regclass)::VARCHAR,
desc VARCHAR
);
This generates codes like AC1, AC2, etc. But I want codes like AC0001, AC0002, i.e. the number should be padded with zeros right after the 'AC'.
I would appreciate it if anyone could suggest a solution or an idea for this problem.

Use to_char() to format the number:
CREATE TABLE account
(
code VARCHAR NOT NULL DEFAULT 'AC'||to_char(nextval('code_sequence'), 'FM0000'),
"desc" VARCHAR
);

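Try the LPAD function. It works on text, so cast the sequence value to text first (and quote desc, since it is a reserved word):
CREATE TABLE account
(
code VARCHAR NOT NULL DEFAULT 'AC' || LPAD(nextval('code_sequence'::regclass)::TEXT, 4, '0'),
"desc" VARCHAR
);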

Related

Postgresql drops trailing zeroes when loading time with milliseconds from csv

I am importing a csv file into a Postgres Table. The file has the following format:
2019/12/13, 14:56:02, 3172.50, 3174.25, 3172.50, 3172.50, 1, 1, 1, 0
The table is defined as:
CREATE TABLE tablename (
date date,
time time,
v1 numeric,
v2 numeric,
v3 numeric,
v4 numeric,
v5 integer,
v6 integer,
v7 integer,
v8 integer,
PRIMARY KEY(date, time)
);
There is an issue with the time field. In some cases, milliseconds are added for precision:
14:56:02.1
14:56:02.9
14:56:02.10
Unfortunately, Postgres seems to drop the trailing zero, which causes it to treat the two values below as duplicates:
14:56:02.1
14:56:02.10
ERROR: duplicate key value violates unique constraint "tablename_pkey"
DETAIL: Key (date, "time")=(2019-12-13, 14:56:02.1) already exists.
CONTEXT: COPY input_file, line 1584
Is there a way to instruct psql not to drop trailing zeroes? I tried time(4) to enforce 4 digit precision, with no difference.
Thanks!
Postgres is not doing anything wrong here. It took me a moment to realize that the issue is with the data.
.1 and .10 are equal as fractional seconds (both mean 100 milliseconds). In the data, the fraction was used creatively: .1 means "1st record within this second" and .10 means "10th record within this second", so it does not make sense as the millisecond component of a timestamp.
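The equality is easy to check outside the database as well. A minimal Python sketch, assuming the values are parsed as ordinary times with a fractional-seconds part (the helper name parse_time is made up for illustration):

```python
from datetime import datetime


def parse_time(text):
    # %f accepts 1 to 6 fractional digits and zero-pads on the right,
    # so ".1" and ".10" both become 100000 microseconds.
    return datetime.strptime(text, "%H:%M:%S.%f").time()


a = parse_time("14:56:02.1")
b = parse_time("14:56:02.10")
print(a == b)         # True
print(a.microsecond)  # 100000
```

If the fraction is really a per-second record counter, it probably belongs in its own integer column rather than in the time value.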

How to store string spaces as null in numeric column

I want to get records from my local txt file to postgresql table.
I have created following table.
create table player_info
(
Name varchar(20),
City varchar(30),
State varchar(30),
DateOfTour date,
pay numeric(5),
flag char
)
And, my local txt file contains following data.
John|Mumbai| |20170203|55555|Y
David|Mumbai| |20170305| |N
Romcy|Mumbai| |20170405|55555|N
Gotry|Mumbai| |20170708| |Y
I am just executing this,
copy player_info (Name,
City,
State,
DateOfTour,
pay,
flag)
from local 'D:\sample_player_info.txt'
delimiter '|' null as ''
exceptions 'D:\Logs\player_info'
What I want is this:
for my numeric column, if the field contains 3 spaces,
then NULL should be inserted as pay; otherwise the 5-digit number should be inserted.
pay is a column in my table whose datatype is numeric.
Is this correct, or is it possible to do this?
You cannot store strings in a numeric column at all. Three spaces are a string, so they cannot be stored in the column pay, which is defined as numeric.
A common approach to this conundrum is to create a staging table that uses less strict data types in its column definitions (text for everything that might be dirty). Import the source data into the staging table, then process that data so that it can be reliably added to the final table. For example, in the staging table, set a column called pay_str to NULL where it contains only spaces (for instance where trim(pay_str) = ''), and then cast it to numeric when copying into the final table.
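The same clean-up can also be done on the client before loading, which is a different route from the staging table but follows the same rule (blank means NULL). A rough Python sketch, assuming the pipe-delimited layout from the question; the function names and dictionary keys are invented for illustration:

```python
def parse_pay(field):
    """Return None for a blank field, otherwise the integer value."""
    text = field.strip()
    return int(text) if text else None


def parse_line(line):
    # Split one pipe-delimited record into its six fields.
    name, city, state, tour_date, pay, flag = line.rstrip("\n").split("|")
    return {
        "name": name,
        "city": city,
        "state": state.strip() or None,  # blank state also becomes None
        "date_of_tour": tour_date,
        "pay": parse_pay(pay),
        "flag": flag,
    }


sample = [
    "John|Mumbai|   |20170203|55555|Y",
    "David|Mumbai|   |20170305|   |N",
]
rows = [parse_line(line) for line in sample]
for row in rows:
    print(row["name"], row["pay"])
```

The cleaned rows could then be written back out with an empty field for None, so that a load with null as '' picks them up as NULL.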

How does Redshift treat guillemets?

I am trying to run a CSV import using the COPY command for some data that includes a guillemet (»). Redshift complains that the column value is too long for the varchar column I have defined. The error in the "Loads" tab in the Redshift GUI displays this character as two dots: .. - had it been treated as one, it would have fit in the varchar column. It's not clear whether there is some sort of conversion error occurring or if there is a display issue.
When trying to do plain INSERTs I run into strange behavior as well:
dev=# create table test (name varchar(3));
CREATE TABLE
dev=# insert into test values ('bla');
INSERT 0 1
3 characters treated as 4?
dev=# insert into test values ('bl»');
ERROR: value too long for type character varying(3)
dev=# insert into test values ('b»');
INSERT 0 1
Why does char_length return 2?
dev=# select char_length(name), name from test;
char_length | name
-------------+------
2 | b»
I've checked the client encoding and database encodings and those all seem to be UTF8/UNICODE.
You need to increase the length of your varchar column. In Redshift the length in a varchar definition is counted in bytes, not characters, and a multibyte character takes more than one byte. The guillemet » takes two bytes in UTF-8, so 'bl»' needs 4 bytes and does not fit in varchar(3), while char_length still reports 'b»' as 2 characters. If it still doesn't work, refer to the Redshift documentation page below:
http://docs.aws.amazon.com/redshift/latest/dg/multi-byte-character-load-errors.html
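The character count versus byte count difference can be reproduced outside Redshift. A small Python sketch, assuming the data is UTF-8 encoded as stated in the question (the helper name utf8_len is made up):

```python
def utf8_len(text):
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


for value in ["bla", "bl»", "b»"]:
    # len() counts characters, utf8_len() counts bytes.
    print(value, len(value), utf8_len(value))
```

Here 'bl»' has 3 characters but 4 bytes, which matches the "value too long" error for varchar(3), and 'b»' has 2 characters in 3 bytes, which matches the successful insert.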

How can I generate a unique string per record in a table in Postgres?

Say I have a table like posts, which has typical columns like id, body, created_at. I'd like to generate a unique string with the creation of each post, for use in something like a url shortener. So maybe a 10-character alphanumeric string. It needs to be unique within the table, just like a primary key.
Ideally there would be a way for Postgres to handle both of these concerns:
generate the string
ensure its uniqueness
And they must go hand-in-hand, because my goal is to not have to worry about any uniqueness-enforcing code in my application.
I don't claim the following is efficient, but it is how we have done this sort of thing in the past.
CREATE FUNCTION make_uid() RETURNS text AS $$
DECLARE
new_uid text;
done bool;
BEGIN
done := false;
WHILE NOT done LOOP
new_uid := md5(''||now()::text||random()::text);
done := NOT exists(SELECT 1 FROM my_table WHERE uid=new_uid);
END LOOP;
RETURN new_uid;
END;
$$ LANGUAGE PLPGSQL VOLATILE;
make_uid() can be used as the default for a column in my_table. Something like:
ALTER TABLE my_table ADD COLUMN uid text NOT NULL DEFAULT make_uid();
md5(''||now()::text||random()::text) can be adjusted to taste. You could consider encode(...,'base64') except some of the characters used in base-64 are not URL friendly.
All existing answers are WRONG because they rely on a SELECT to check uniqueness while generating the code. Assume we need a unique code per record at insert time. Imagine two concurrent INSERTs happening at the same moment (which happens more often than you think): both can generate the same code, because at the time of each SELECT that code did not yet exist in the table. One INSERT will succeed and the other will fail (or, without a unique constraint, a duplicate gets stored).
First, let us create a table with a code field and add a unique index:
CREATE TABLE my_table
(
code TEXT NOT NULL
);
CREATE UNIQUE INDEX ON my_table (lower(code));
Then we need a function or procedure (the same code can also go inside a trigger) that 1. generates a new code, 2. tries to insert a new record with that code, and 3. if the insert fails, tries again from step 1:
CREATE OR REPLACE PROCEDURE my_table_insert()
AS $$
DECLARE
new_code TEXT;
BEGIN
LOOP
new_code := LOWER(SUBSTRING(MD5(''||NOW()::TEXT||RANDOM()::TEXT) FOR 8));
BEGIN
INSERT INTO my_table (code) VALUES (new_code);
EXIT;
EXCEPTION WHEN unique_violation THEN
-- Collision: do nothing here, the loop generates a new code and retries.
END;
END LOOP;
END;
$$ LANGUAGE PLPGSQL;
Unlike the other solutions in this thread, this one cannot fail on a collision, because the unique index detects the duplicate and the loop simply retries.
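The same insert-then-retry pattern can be sketched from application code. The example below uses Python with an in-memory SQLite database purely as a stand-in for Postgres, so that it runs on its own; the function name, the make_code parameter and the retry limit are assumptions made for illustration:

```python
import secrets
import sqlite3


def insert_with_unique_code(conn, make_code=lambda: secrets.token_hex(4), max_tries=100):
    """Insert a row with a fresh code, retrying when the code is already taken."""
    for _ in range(max_tries):
        code = make_code()
        try:
            # "with conn" commits on success and rolls back on error.
            with conn:
                conn.execute("INSERT INTO my_table (code) VALUES (?)", (code,))
            return code
        except sqlite3.IntegrityError:
            # Unique constraint hit: generate another code and try again.
            continue
    raise RuntimeError("no free code found")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (code TEXT NOT NULL UNIQUE)")
print(insert_with_unique_code(conn))
```

The important part is that uniqueness is decided by the constraint at insert time, not by a prior SELECT, so two concurrent writers cannot both succeed with the same code.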
Use a Feistel network. This technique works efficiently to generate unique random-looking strings in constant time without any collision.
For a version with about 2 billion possible strings (2^31) of 6 letters, see this answer.
For a 63 bits version based on bigint (9223372036854775808 distinct possible values), see this other answer.
You may change the round function as explained in the first answer to introduce a secret element to have your own series of strings (not guessable).
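To show why a Feistel network never collides, here is a small Python sketch on 16-bit values. It is not the function from the linked answers: the width, the number of rounds and the round function are all arbitrary choices made for illustration.

```python
def feistel16(value, rounds=3):
    """Permute a 16-bit integer with a balanced Feistel network."""
    left = (value >> 8) & 0xFF
    right = value & 0xFF
    for _ in range(rounds):
        # Hypothetical round function; any deterministic function of
        # `right` keeps the whole mapping reversible.
        f = ((right * 1366 + 150889) % 714025) & 0xFF
        left, right = right, left ^ f
    return (left << 8) | right


# Every 16-bit input maps to a distinct 16-bit output.
outputs = {feistel16(i) for i in range(65536)}
print(len(outputs))  # 65536
```

Each round only swaps the halves and XORs one of them with a value derived from the other, which can always be undone, so feeding it a sequence value yields a scrambled but unique number that can then be encoded as a string.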
Probably the easiest way is to use a sequence to guarantee uniqueness
(so after the sequence value, append a fixed-length random number of x digits):
CREATE SEQUENCE test_seq;
CREATE TABLE test_table (
id bigint NOT NULL DEFAULT (nextval('test_seq')::text || (LPAD(floor(random()*100000000)::text, 8, '0')))::bigint,
txt TEXT
);
insert into test_table (txt) values ('1');
insert into test_table (txt) values ('2');
select id, txt from test_table;
However, this wastes a large part of the bigint range. (Note: the max bigint is 9223372036854775807, so with an 8-digit random number at the end the sequence part can only go up to about 92233720368. Though 8 digits are probably not necessary. Also check the max number for your programming environment!)
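The headroom left for the sequence part can be checked with a quick calculation. A short Python sketch, assuming a signed 64-bit bigint and an 8-digit random suffix as in the example above (the variable names are made up):

```python
BIGINT_MAX = 2**63 - 1   # 9223372036854775807
SUFFIX_DIGITS = 8        # length of the random part

# Largest prefix whose concatenation with some suffix still fits.
max_prefix = BIGINT_MAX // 10**SUFFIX_DIGITS
print(max_prefix)  # 92233720368
```

To be safe for every possible suffix, the sequence should stay below that value, since the largest prefix only fits together with small suffixes.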
Alternatively, you can use varchar for the id and convert the above number with to_hex(), or change it to base 36 as described in the question below (but with base 36, try not to expose the value to customers, because it can accidentally spell out real words!):
PostgreSQL: Is there a function that will convert a base-10 int into a base-36 string?
Check out a blog by Bruce. This gets you part way there. You will have to make sure it doesn't already exist. Maybe concat the primary key to it?
Generating Random Data Via Sql
"Ever need to generate random data? You can easily do it in client applications and server-side functions, but it is possible to generate random data in sql. The following query generates five lines of 40-character-length lowercase alphabetic strings:"
SELECT
(
SELECT string_agg(x, '')
FROM (
SELECT chr(ascii('a') + floor(random() * 26)::integer)
FROM generate_series(1, 40 + b * 0)
) AS y(x)
)
FROM generate_series(1,5) as a(b);
Use primary key in your data. If you really need alphanumeric unique string, you can use base-36 encoding. In PostgreSQL you can use this function.
Example:
select base36_encode(generate_series(1000000000,1000000010));
GJDGXS
GJDGXT
GJDGXU
GJDGXV
GJDGXW
GJDGXX
GJDGXY
GJDGXZ
GJDGY0
GJDGY1
GJDGY2
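For reference, base-36 encoding itself is only a few lines. The Python sketch below is an illustration of the idea, not the Postgres function referred to above; it reuses the name base36_encode only for readability:

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base36_encode(n):
    """Encode a non-negative integer as an uppercase base-36 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, 36)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


print(base36_encode(1000000000))  # GJDGXS
print(base36_encode(1000000010))  # GJDGY2
```

Because the mapping from integer to string is one-to-one, the string inherits the uniqueness of the primary key it was derived from.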

postgres : create a sequence like 0000001 to 00000n

Dear All,
I want to create a sequence in Postgres that runs from 0000000001 to 00nnnnnnnnn.
Normally we can create one from 1 to n, but I want the values to be preceded by 0's.
Is there any easy way to do this?
A sequence is a number generator, and a number doesn't have left padding with '0'...
If you want to add padding, you can use the lpad function:
CREATE SEQUENCE my_sequence_seq;
SELECT lpad(nextval('my_sequence_seq')::text,10,'0');
you can use it also in the table declaration:
CREATE TABLE sequence_test(
id varchar(20) NOT NULL DEFAULT lpad(nextval('my_sequence_seq')::text,10,'0'),
name text
);
PostgreSQL sequences only produce integer values, so the numbers 1 and 0000001 are the same value, with 1 being the canonical representation.
I am not sure why you would want to do this, but you can convert the sequence number to a string and prepend the appropriate number of 0 characters, something like this:
SELECT repeat('0', 7 - length(nextval('myseq')::text)) || currval('myseq')::text
Where 7 is the total number of digits you need (the padding stops working once the number has more digits than that).
Note that you will need to create sequence myseq as source for your numbers:
CREATE SEQUENCE myseq;