How to configure the Hibernate isolation level for PostgreSQL

I have a table ErrorCase in a PostgreSQL database. This table has one field case_id with datatype text. Its value is generated in the format yymmdd_xxxx, where yymmdd is the date the record is inserted into the DB and xxxx is the sequence number of the record on that date.
For example, the 3rd error case on 2019/08/01 will have case_id = 190801_0003. On 08/04, if there is one more case, its case_id will be 190804_0001, and so on.
I am already using a trigger in the database to generate the value for this field:
DECLARE
  total integer;
BEGIN
  SELECT (COUNT(*) + 1) INTO total FROM public.ErrorCase WHERE create_at = current_date;
  IF (NEW.case_id is null) THEN
    NEW.case_id = to_char(current_timestamp, 'YYMMDD_') || trim(to_char(total, '0000'));
  END IF;
  RETURN NEW;
END
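For reference, that body lives in a trigger function attached with a BEFORE INSERT trigger, roughly like the sketch below (the function and trigger names here are made-up placeholders):
-- sketch only: generate_case_id and set_case_id are illustrative names
CREATE OR REPLACE FUNCTION public.generate_case_id() RETURNS trigger AS $$
DECLARE
  total integer;
BEGIN
  SELECT (COUNT(*) + 1) INTO total FROM public.ErrorCase WHERE create_at = current_date;
  IF (NEW.case_id is null) THEN
    NEW.case_id = to_char(current_timestamp, 'YYMMDD_') || trim(to_char(total, '0000'));
  END IF;
  RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER set_case_id
  BEFORE INSERT ON public.ErrorCase
  FOR EACH ROW
  EXECUTE PROCEDURE public.generate_case_id();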
And in the Spring project, I configured the application properties for JPA/Hibernate:
datasource:
  type: com.zaxxer.hikari.HikariDataSource
  url: jdbc:postgresql://localhost:5432/table_name
  username: postgres
  password: postgres
  hikari:
    poolName: Hikari
    auto-commit: false
jpa:
  database-platform: io.github.jhipster.domain.util.FixedPostgreSQL82Dialect
  database: POSTGRESQL
  show-sql: true
  properties:
    hibernate.id.new_generator_mappings: true
    hibernate.connection.provider_disables_autocommit: true
    hibernate.cache.use_second_level_cache: true
    hibernate.cache.use_query_cache: false
    hibernate.generate_statistics: true
Currently, it generates the case_id correctly.
However, when many records are inserted at nearly the same time, it generates the same case_id for two records. I guess the reason is the isolation level: while the first transaction is not yet committed, the second transaction runs the SELECT query to build its case_id, so the result of the SELECT does not include the record from the first transaction (because it has not been committed yet). Therefore, the second case_id ends up the same as the first one.
Please suggest a solution for this problem. Which isolation level is suitable for this case?

"yymmdd is the date when the record insert to DB, xxxx is the number of record in that date" - no offense but that is a horrible design.
You should have two separate columns, one date column and one integer column. If you want to increment the counter during an insert, make that date column the primary key and use insert on conflict. You can get rid that horribly inefficient trigger and more importantly that will be safe for concurrent modifications even with read committed.
Something like:
create table error_case
(
  error_date date not null primary key,
  counter integer not null default 1
);
Then use the following to insert rows:
insert into error_case (error_date)
values (date '2019-08-01')
on conflict (error_date) do update
set counter = counter + 1;
No trigger needed and safe for concurrent inserts.
If you really need a text column as a "case ID", create a view that returns that format:
create view v_error_case
as
select concat(to_char(error_date, 'yymmdd'), '_', to_char(counter, '0000')) as case_id,
... other columns
from error_case;
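If the application needs the generated value back at insert time, a RETURNING clause can expose it directly. A minimal sketch building on the tables above (format the counter however you prefer):
insert into error_case (error_date)
values (current_date)
on conflict (error_date) do update
set counter = error_case.counter + 1
returning concat(to_char(error_date, 'yymmdd'), '_', to_char(counter, '0000')) as case_id;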

Related

Inserting records into table1 depending on row value in table2

For each row in table exam where exam.examRegulation is null, I want to insert one corresponding row into table examRegulation and copy the column values from exam to examRegulation. Apparently the following query is too naive and must be improved:
insert into examRegulation (graduation, course, examnumber, examversion)
values (exam.graduation, exam.course, exam.examnumber, exam.examversion)
where ?? (select graduation, course, examnumber, examversion
from exam
where exam.examRegulation isnull)
Is there a way to do this in postgresql?
You may rephrase this as an INSERT INTO ... SELECT statement:
INSERT INTO examRegulation (graduation, course, examnumber, examversion)
SELECT graduation, course, examnumber, examversion
FROM exam
WHERE examRegulation IS NULL;
The VALUES clause, as the name implies, can only be used with literal values. If you need to populate an insert using query logic, then you need to use a SELECT clause.

Is PostgreSQL SERIAL guaranteed to have no gaps within a single insert statement?

Let's start with:
CREATE TABLE "houses" (
"id" serial NOT NULL PRIMARY KEY,
"name" character varying NOT NULL)
Imagine I try to concurrently (!) insert multiple records into the table in a single statement (maybe 10, maybe 1000):
INSERT INTO houses (name) VALUES
('B6717'),
('HG120');
Is it guaranteed that when a single thread inserts X records in a single statement (while other threads simultaneously insert other records into the same table), those records will have IDs numbered from A to A+X-1? Or is it possible that A+100 will be taken by thread 1 and A+99 by thread 2?
Inserting 10,000 records at a time from two pgAdmin connections is enough to show that the serial type does not guarantee continuity within a batch on my PostgreSQL 9.5:
DO
$do$
BEGIN
  FOR i IN 1..200 LOOP
    EXECUTE format('INSERT INTO houses (name) VALUES %s%s;', repeat('(''a' || i || '''),', 9999), '(''a' || i || ''')');
  END LOOP;
END
$do$;
The above results in quite frequent overlap between ids belonging to two different batches:
SELECT * FROM houses WHERE id BETWEEN 34370435 AND 34370535 ORDER BY id;
34370435;"b29"
34370436;"b29"
34370437;"b29"
34370438;"a100"
34370439;"b29"
34370440;"b29"
34370441;"a100"
...
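As a side note on why the batches interleave: a serial column is just a sequence-backed default, and every session draws its ids from the same shared sequence, so concurrent inserts interleave their nextval() calls. A minimal sketch, assuming the houses table above:
SELECT pg_get_serial_sequence('houses', 'id');  -- public.houses_id_seq, the sequence behind the serial column
SELECT nextval('houses_id_seq');                -- effectively what each inserted row does to obtain its id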
I thought this was going to be harder to prove but it turns out it is not guaranteed.
I used a Ruby script to have 4 threads insert thousands of records simultaneously and checked whether the records created by a single statement had gaps in them - and they did.
# spawn 4 threads, each doing 100 batch inserts of 1000 rows via activerecord-import
4.times.map do |t|
  Thread.new do
    100.times do |u|
      House.import(1000.times.map do |i|
        {
          tenant: "#{t}-#{u}",
          name: i,
        }
      end)
    end
  end
end.each(&:join)
House.distinct.pluck(:tenant).all? do |t|
  recs = House.where(
    tenant: t,
  ).order('id').to_a
  recs.first.id - recs.first.name.to_i == recs.last.id - recs.last.name.to_i
end
Example of the gaps:
[#<House:0x00007fd2341b5e00
  id: 177002,
  tenant: "0-43",
  name: "0">,
 #<House:0x00007fd2341b5c48
  id: 177007,
  tenant: "0-43",
  name: "1">,
 ...
As you can see, the gap was 5 between the first and second rows inserted within the same single INSERT statement.

unique date field postgresql default value

I have a date column which I want to be unique once populated, but want the date field to be ignored if it is not populated.
In MySQL the way this is accomplished is to set the date column to "not null" and give it a default value of '0000-00-00' - this allows all other fields in the unique index to be "checked" even if the date column is not populated yet.
This does not work in PostgreSQL because '0000-00-00' is not a valid date, so you cannot store it in a date field (this makes sense to me).
At first glance, leaving the field nullable seemed like an option, but this creates a problem:
=> create table uniq_test(NUMBER bigint not null, date DATE, UNIQUE(number, date));
CREATE TABLE
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> select * from uniq_test;
number | date
--------+------
1 |
1 |
1 |
1 |
(4 rows)
NULL apparently "isn't equal to itself" and so it does not count towards constraints.
If I add an additional unique constraint on just the number field, it checks only the number and not the date, so I cannot have two rows with the same number but different dates.
I could pick a default date that is a 'valid date' (but outside the working scope) to get around this, and I could in fact get away with that for the current project, but there are cases I may encounter in the next few years where it will not be evident that the date is a placeholder just because it is "a long time ago" or "in the future."
The advantage the '0000-00-00' mechanic had for me was precisely that this date isn't real and therefore indicated a non-populated entry (where 'non-populated' was a valid uniqueness attribute). When I look around for solutions to this on the internet, most of what I find is "just use NULL" and "storing zeros is stupid."
TL;DR
Is there a PostgreSQL best practice for needing to include "not populated" as a possible value in a unique constraint including a date field?
Not clear what you want. This is my guess:
create table uniq_test (number bigint not null, date date);
create unique index i1 on uniq_test (number, date)
where date is not null;
create unique index i2 on uniq_test (number)
where date is null;
There will be a unique constraint for non-null dates and another one for null dates, effectively making the (number, date) tuples distinct.
Check partial index
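A quick sketch of how those two partial indexes behave together, assuming the uniq_test table above:
insert into uniq_test (number) values (1);                            -- ok: first null-date row for number 1
insert into uniq_test (number) values (1);                            -- fails: i2 rejects a second null-date row for number 1
insert into uniq_test (number, date) values (1, date '2019-01-01');   -- ok: i1 allows distinct dates per number
insert into uniq_test (number, date) values (1, date '2019-01-01');   -- fails: i1 rejects the duplicate (number, date)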
It's not a best practice, but you can do it this way:
t=# create table so35(i int, d date);
CREATE TABLE
t=# create unique index i35 on so35(i, coalesce(d,'-infinity'));
CREATE INDEX
t=# insert into so35 (i) select 1;
INSERT 0 1
t=# insert into so35 (i) select 2;
INSERT 0 1
t=# insert into so35 (i) select 2;
ERROR: duplicate key value violates unique constraint "i35"
DETAIL: Key (i, (COALESCE(d, '-infinity'::date)))=(2, -infinity) already exists.
STATEMENT: insert into so35 (i) select 2;

Move column values to hstore in Postgres cannot finish on 15 million rows

I'm trying this query to move some metadata into an hstore attribute:
UPDATE media_files
SET metadata = hstore (bb)
FROM
(
SELECT
video_bitrate,
video_codec,
video_resolution,
video_fps,
video_aspect,
video_container,
audio_codec,
audio_bitrate,
audio_sample_rate
FROM
media_files
) AS bb
The table has 15 million records. I left the job running for 15 hours and it didn't finish, and I'm not able to track its progress since the table seems to be locked for the duration of the operation.
Is there something I can do to optimize this?
Note: this assumes you have the space on your server for two copies of the table + indexes. Should work on PostgreSQL 9.2+
CREATE TABLE media_files_temp AS
WITH cte AS
(SELECT video_bitrate,
video_codec,
video_resolution,
video_fps,
video_aspect,
video_container,
audio_codec,
audio_bitrate,
audio_sample_rate,
<your default value>::INTEGER as play_count
FROM media_files mf)
SELECT cte.*,
HSTORE('video_bitrate',video_bitrate) ||
HSTORE('video_codec',video_codec) ||
HSTORE('video_resolution',video_resolution) ||
HSTORE('video_fps',video_fps) ||
HSTORE('video_aspect',video_aspect) ||
HSTORE('video_container',video_container) ||
HSTORE('audio_codec',audio_codec) ||
HSTORE('audio_bitrate',audio_bitrate) ||
HSTORE('audio_sample_rate',audio_sample_rate)
as metadata
FROM cte;
[create your indexes]
BEGIN;
ALTER TABLE media_files RENAME TO media_files_orig;
ALTER TABLE media_files_temp RENAME TO media_files;
COMMIT;
-- a new column was requested by the OP
ALTER TABLE media_files ALTER your_new_col SET DEFAULT <something>;
ANALYZE media_files;
A useful answer has already been accepted, but there's something important to point out to future readers.
Your original query had a self-join with no constraining criteria, so your 15 million row table was receiving 225 trillion updates.
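For future readers, a minimal sketch of what the missing constraining criteria could look like, assuming media_files has an id primary key column (that column name is an assumption): selecting the key in the derived table and joining on it makes the update row-for-row instead of 15 million x 15 million.
UPDATE media_files AS m
SET metadata = hstore(bb) - 'id'::text  -- drop the join key from the resulting hstore
FROM (
    SELECT id,
           video_bitrate,
           video_codec,
           video_resolution,
           video_fps,
           video_aspect,
           video_container,
           audio_codec,
           audio_bitrate,
           audio_sample_rate
    FROM media_files
) AS bb
WHERE bb.id = m.id;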

IBM DB2 recreate index on truncated table

After truncating the table and inserting new values, the auto-increment value is not reset to the initial value 1. When inserting new values, it remembers the last auto-increment value.
Column in table named: ID
Index: PRIMARY,
Initial Value: 1
Cache size: 1
Increment: 1
[checked on IBM DB2 Control Center]
This query:
TRUNCATE TABLE ".$this->_schema.$table." DROP STORAGE IGNORE DELETE TRIGGERS IMMEDIATE
leaves the table empty.
But after inserting new values, for example INSERT INTO DB2INST1.db (val) VALUES ('abc'), the row is inserted with the last identity value:
ID | val
55 | abc
But it SHOULD BE:
ID | val
1 | abc
I'm guessing here that your question is "how do you restart the IDENTITY sequence?" If that is the case, then you can reset it with the following SQL:
ALTER TABLE <table name> ALTER COLUMN <IDENTITY column> RESTART WITH 1
However, as @Ian said, what you are seeing is the expected behavior of a TRUNCATE.
First, find the name of the IDENTITY column in the table schema:
Query 1:
SELECT COLNAME FROM SYSCAT.COLUMNS WHERE TABSCHEMA = 'DB2INST1' AND
TABNAME = 'DB' AND IDENTITY = 'Y'
Then, after truncating the table, use that column name to reset the identity:
Query 2:
ALTER TABLE DB2INST1.DB ALTER COLUMN ID RESTART WITH 1
Substitute the column name returned by Query 1 (here ID) into Query 2.
SOLVED!