PostgreSQL: getting the new id during insert

I need to create a customer number on record insert; the format is 'A' + 4 digits, based on the ID, so record ID 23 -> A0023 and so on. My current solution is this:
-- Table
create table t (
id bigserial unique primary key,
x text,
y text
);
-- Insert
insert into t (x, y) select concat('A',lpad((currval(pg_get_serial_sequence('t','id')) + 1)::text, 4, '0')), 'test';
This works perfectly. Now my question is: is that 'safe', in the sense that currval(seq) + 1 is guaranteed to be the same value the id column will receive? I think the sequence should be locked during statement execution. Is this the correct way to do it, or is there any shortcut to access the to-be-created ID directly?

Instead of storing this data, you could just query it each time you needed it, making the whole thing a lot less error-prone:
SELECT id, 'A' || LPAD(id::varchar, 4, '0')
FROM t
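If the number does need to be stored, a safer variant of the original insert (a sketch, not the only option) is to call nextval() once and reuse the value for both the id and the customer number, which avoids relying on currval() + 1:
with new_id as (
-- pull the next value from t's id sequence exactly once
select nextval(pg_get_serial_sequence('t', 'id')) as id
)
insert into t (id, x, y)
select id, concat('A', lpad(id::text, 4, '0')), 'test'
from new_id;
Because the default value of id is never consulted (the id is supplied explicitly from the same sequence), the stored customer number cannot drift from the actual id.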

Related

PostgreSQL array of data composite update element using where condition

I have a composite type:
CREATE TYPE mydata_t AS
(
user_id integer,
value character(4)
);
Also, I have a table that uses this composite type as an array of mydata_t.
CREATE TABLE tbl
(
id serial NOT NULL,
data_list mydata_t[],
PRIMARY KEY (id)
);
Here I want to update the mydata_t element in data_list where mydata_t.user_id is 100000.
But I don't know which array element's user_id is equal to 100000, so I have to search first to find that element. That's my problem: I don't know how to write the query. In fact, I want to update the value of the array element whose user_id is equal to 100000 (and where the id of tbl is, for example, 1). What would my query be?
Something like this (I know it's wrong !!!)
UPDATE "tbl" SET "data_list"[i]."value"='YYYY'
WHERE "id"=1 AND EXISTS (SELECT ROW_NUMBER() OVER() AS i
FROM unnest("data_list") "d" WHERE "d"."user_id"=10000 LIMIT 1)
For example, this is my tbl data:
Row1 => id = 1, data = ARRAY[ROW(5,'YYYY'),ROW(6,'YYYY')]
Row2 => id = 2, data = ARRAY[ROW(10,'YYYY'),ROW(11,'YYYY')]
Now I want to update tbl where id is 2 and set the value of one of the tbl.data elements to 'XXXX' where the element's user_id is equal to 11.
In fact, the final result of Row2 will be this:
Row2 => id = 2, data = ARRAY[ROW(10,'YYYY'),ROW(11,'XXXX')]
If you know the current contents of the value field, you can use the array_replace() function to make the change:
UPDATE tbl
SET data_list = array_replace(data_list, (11, 'YYYY')::mydata_t, (11, 'XXXX')::mydata_t)
WHERE id = 2
If you do not know the current contents of the value field, then the situation becomes more complex:
UPDATE tbl SET data_list = data_arr
FROM (
-- UPDATE doesn't allow aggregate functions so aggregate here
SELECT array_agg(new_data) AS data_arr
FROM (
-- For the id value, get the data_list values that are NOT modified
SELECT (user_id, value)::mydata_t AS new_data
FROM tbl, unnest(data_list)
WHERE id = 2 AND user_id != 11
UNION
-- Add the values to update
VALUES ((11, 'XXXX')::mydata_t)
) x
) y
WHERE id = 2
You should keep in mind, though, that there is an awful lot of work going on in the background that cannot be optimised. The array of mydata_t values has to be examined from start to finish and you cannot use an index for this. Furthermore, an update actually inserts a new row in the underlying file on disk, and if your array has more than a few entries this involves substantial work. This gets even more problematic when your arrays are larger than the page size of your PostgreSQL server, typically 8kB. It all happens behind the scenes, so it will work, but at a performance penalty. Even though array_replace sounds like changes are made in place (and in memory they indeed are), the UPDATE command will write a completely new tuple to disk. So if you have 4,000 array elements, at least 40kB of data will have to be read (8 bytes for the mydata_t type on a typical system x 4,000 = 32kB in a TOAST file, plus the main page of the table, 8kB) and then written to disk after the update. A real performance killer.
As #klin pointed out, this design may be more trouble than it is worth. Should you make data_list a table of its own (as I would do), the update query becomes:
UPDATE data_list SET value = 'XXXX'
WHERE id = 2 AND user_id = 11
This will have MUCH better performance, especially if you add the appropriate indexes. You could then still create a view to publish the data in an aggregated form with a custom type if your business logic so requires.
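For reference, a sketch of what that separate data_list table could look like, with names and types assumed from the example (the view name is hypothetical):
CREATE TABLE data_list (
id integer NOT NULL REFERENCES tbl (id),
user_id integer NOT NULL,
value character(4),
PRIMARY KEY (id, user_id)
);
-- Optionally publish the old aggregated shape through a view:
CREATE VIEW tbl_data_view AS
SELECT id, array_agg((user_id, value)::mydata_t) AS data_list
FROM data_list
GROUP BY id;
The composite primary key on (id, user_id) is exactly the index that makes the simple UPDATE above fast.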

Fast new row insertion if a value of a column depends on previous value in existing row

I have a table cusers with a primary key:
primary key(uid, lid, cnt)
And I try to insert some values into the table:
insert into cusers (uid, lid, cnt, dyn, ts)
values
(A, B, C, (
select C - cnt
from cusers
where uid = A and lid = B
order by ts desc
limit 1
), now())
on conflict do nothing
Quite often (with a probability of about 98%) a row cannot be inserted into cusers because it violates the primary key constraint, so the expensive SELECT subquery does not need to be executed at all. But as far as I can see, PostgreSQL first evaluates the subquery for the dyn column and only then rejects the row because of the (uid, lid, cnt) violation.
What is the best way to insert rows quickly in such a situation?
Another explanation
I have a system where one row depends on another. Here is an example:
(x, x, 2, 2, <timestamp>)
(x, x, 5, 3, <timestamp>)
Two columns contain an absolute value (2 and 5) and a relative value (2, and 5 - 2). Each time I insert a new row it should:
avoid duplicate rows (see the primary key constraint)
if the new row differs, compute the difference and put it into the dyn column (so I take the last inserted row for the user according to the timestamp and subtract the values).
Another solution I've found is to use returning uid, lid, ts for the inserts to get the user ids that were really inserted - this is how I know they differ from existing rows. Then I update the inserted values:
update cusers
set dyn = (
select max(cnt) - min(cnt)
from (
select cnt
from cusers
where uid = A and lid = B
order by ts desc
limit 2) last_two -- "Table" is a reserved word in PostgreSQL, so use a plain alias
)
where uid = A and lid = B and ts = TS
But it is not a fast approach either, as it has to scan the ts column to find the two last inserted rows for each user. I need a fast insert query, as I insert millions of rows at a time (but I do not write duplicates).
What can the solution be? Maybe I need a new index for this? Thanks in advance.
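For what it's worth, the lookup described above (the latest rows per (uid, lid) ordered by ts) is the kind of query a composite index can serve; a minimal sketch, assuming the column names from the question:
-- lets the "order by ts desc limit 1/2" subqueries read only a few index entries
create index cusers_uid_lid_ts_idx on cusers (uid, lid, ts desc);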

How to increase sequence only if column's value is different from the previous? PostgreSQL

Exactly as in the topic. For a really easy example, I have this table:
create table cars (
make text,
value integer,
owner text
);
With records like this:
('BMW', 100000, 'AcmeCompany')
('BMW', 350000, 'XCompany')
('Lamborgini', 500000, 'YCompany')
('Fiat', 30000, 'ZCompany')
('BMW', 200000, 'XCompany')
and I want to create a table with a column makeID (don't ask why, it's just an example :P)
create table Informations (
makeID integer
-- other_stuff
);
I have a sequence on makeID and I have to increment the sequence ONLY when there is a new value of make in my SELECT (it's like a translation from string into integer; if possible I don't want sha1 or other cryptographic things, just a simple sequence!)
INSERT INTO Informations(makeID) (
-- IF I have new value: SELECT nextVal('mysequence')
-- ELSE : SELECT currval('mysequence')
)
I know it should be possible with a pl/pgsql function (sort the table by the 'make' column, save the last value of 'make' into a temp variable, and during the INSERT INTO compare the current 'make' value with the value in the temp variable to decide whether to increment the sequence or not). But is there an easier way, just with a query? A combination of join, sort and distinct?
Not sure I understand. It would simply be a serial with a select distinct:
create table make (
make_id serial,
make_brand text,
make_owner text
);
insert into make (make_brand, make_owner)
select distinct make, owner
from cars;
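If the goal is then to fill Informations with the generated integer per make, one possible follow-up (hypothetical, assuming the Informations table from the question and the make table above) is to join back on the distinct columns:
insert into Informations (makeID)
select m.make_id
from cars c
join make m on m.make_brand = c.make and m.make_owner = c.owner;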

use two .nextval in an insert statement

I'm using an Oracle database and facing a problem where using two id_product.nextval calls raises an error: ORA-00001: unique constraint (SYSTEM.SYS_C004166) violated
It is a primary key. Using insert all is a requirement. Can I use two .nextval calls in one statement?
insert all
into sale_product values (id_product.nextval, id.currval, 'hello', 123, 1)
into sale_product values (id_product.nextval, id.currval, 'hi', 123, 1)
select * from dual;
insert into sale_product
select id_product.nextval, id.currval, a, b, c
from
(
select 'hello' a, 123 b, 1 c from dual union all
select 'hi' a, 123 b, 1 c from dual
);
This doesn't use the insert all syntax, but it works the same way if you are only inserting into the same table.
The value of id_product.NEXTVAL in the first INSERT is the same as in the second INSERT, hence you'll get the unique constraint violation. If you remove the constraint and perform the insert, you'll notice the duplicate values!
The only way is to perform two bulk INSERTs in sequence, or to have two separate sequences with different ranges; the latter would require an awful lot of coding and checking.
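A minimal sketch of the "two INSERTs in sequence" approach, assuming the sale_product columns from the question:
INSERT INTO sale_product VALUES (id_product.NEXTVAL, id.CURRVAL, 'hello', 123, 1);
INSERT INTO sale_product VALUES (id_product.NEXTVAL, id.CURRVAL, 'hi', 123, 1);
-- each statement draws its own NEXTVAL, so the primary key values differ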
create table temp(id number ,id2 number);
insert all
into temp values (supplier_seq.nextval, supplier_seq.currval)
into temp values (supplier_seq.nextval, supplier_seq.currval)
select * from dual;
ID ID2
---------- ----------
2 2
2 2
Reference:
The subquery of the multitable insert statement cannot use a sequence
http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9014.htm#i2080134

an empty row with null-like values in not-null field

I'm using postgresql 9.0 beta 4.
After inserting a lot of data into a partitioned table, I found a weird thing. When I query the table, I can see an empty row with null-like values in 'not-null' fields.
In the weird query result, the 689th row is empty. The first three fields, (stid, d, ticker), compose the primary key, so they should not be null. The query I used is this:
select * from st_daily2 where stid=267408 order by d
I can even do the group by on this data.
select stid, date_trunc('month', d) ym, count(*) from st_daily2
where stid=267408 group by stid, date_trunc('month', d)
The 'group by' result still has the empty row; the first row is empty.
But if I query where 'stid' or 'd' is null, it returns nothing.
Is this a bug in PostgreSQL 9.0 beta 4, or some data corruption?
EDIT :
I added my table definition.
CREATE TABLE st_daily
(
stid integer NOT NULL,
d date NOT NULL,
ticker character varying(15) NOT NULL,
mp integer NOT NULL,
settlep double precision NOT NULL,
prft integer NOT NULL,
atr20 double precision NOT NULL,
upd timestamp with time zone,
ntrds double precision
)
WITH (
OIDS=FALSE
);
CREATE TABLE st_daily2
(
CONSTRAINT st_daily2_pk PRIMARY KEY (stid, d, ticker),
CONSTRAINT st_daily2_strgs_fk FOREIGN KEY (stid)
REFERENCES strgs (stid) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT st_daily2_ck CHECK (stid >= 200000 AND stid < 300000)
)
INHERITS (st_daily)
WITH (
OIDS=FALSE
);
The data in this table is simulation results. Multiple multithreaded simulation engines written in C# insert data into the database using Npgsql.
psql also shows the empty row.
You'd better leave a posting at http://www.postgresql.org/support/submitbug
Some questions:
Could you show us the table definitions and constraints for the partitions?
How did you load your data?
Do you get the same result when using another tool, like psql?
The answer to your problem may very well lie in your first sentence:
I'm using postgresql 9.0 beta 4.
Why would you do that? Upgrade to a stable release. Preferably the latest point-release of the current version.
This is 9.1.4 as of today.
I got to the same point: "what in the heck is that blank value?"
No, it's not a NULL, it's a -infinity.
To filter for such a row use:
WHERE
case when mytestcolumn = '-infinity'::timestamp or
mytestcolumn = 'infinity'::timestamp
then NULL else mytestcolumn end IS NULL
instead of:
WHERE mytestcolumn IS NULL
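If the column is a date or timestamp, another way to find such rows (a sketch, using the d column from the table above) is PostgreSQL's isfinite() function, which returns false for infinity and -infinity:
SELECT * FROM st_daily2 WHERE NOT isfinite(d);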