Does an Apache Phoenix local index take effect immediately after an upsert? - apache-phoenix

I created a table with a local index and then inserted some records using a JDBC batch update, following the Writing section of this document. Finally, I tried to retrieve the records I had created through sqlline.py, but I get no results.
Create the table with a local index:
CREATE TABLE test_table (
id varchar primary key,
fileName varchar,
md5 varchar,
hash varchar
) SALT_BUCKETS = 5;
CREATE LOCAL INDEX test_table_index_md5 ON kodo_operation_log (md5);
Upsert records through the JDBC API:
try (Connection conn = DriverManager.getConnection(url)) {
    conn.setAutoCommit(false);
    int batchSize = 0;
    int commitSize = 1000; // number of rows you want to commit per batch
    try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
        stmt.set ...
        while (there are records to upsert) {
            stmt.executeUpdate();
            batchSize++;
            if (batchSize % commitSize == 0) {
                conn.commit();
            }
        }
        conn.commit(); // commit the last batch of records
        // query the record by the indexed field, but no records are found
    }
}
Query the record through sqlline.py or the JDBC API:
select * from test_table where md5='138fde2b1c2a5bb7c6f7effebeedab76'
No rows are found. My question is: does the local index not take effect immediately after an upsert in the same Java thread?
Environment:
phoenix 5.0.0-HBASE-2.0
HBASE 2.0.0

Related

duplicate key value violates unique constraint "pk_user_governments"

I am trying to insert a record with a many-to-many relationship into a Postgres table using EF Core.
Adding a simple record to Users works, but when I introduced a 1:N relationship with User_Governments,
it started giving me: duplicate key value violates unique constraint "pk_user_governments"
I have tried a few things:
SELECT MAX(user_governments_id) FROM user_governments;
SELECT nextval('users_gov_user_id_seq');
The sequence value keeps incrementing every time I run this in Postgres, but the error does not go away.
I am inserting the record as follows:
User user = new();
user.Organisation = organisation;
user.Name = userName;
user.Email = email;
user.IsSafetyDashboardUser = isSafetyFlag;
if (isSafetyFlag)
{
List<UserGovernment> userGovernments = new List<UserGovernment>();
foreach (var govId in lgas)
{
userGovernments.Add(new UserGovernment()
{
LocalGovId = govId,
StateId = 7
});
}
user.UserGovernments = userGovernments;
}
_context.Users.Add(user);
int rows_affected = _context.SaveChanges();
Table and column in db is as follows:
CREATE TABLE IF NOT EXISTS user_governments
(
user_government_id integer NOT NULL GENERATED BY DEFAULT AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 2147483647 CACHE 1 ),
user_id integer NOT NULL,
state_id integer NOT NULL,
local_gov_id integer NOT NULL,
CONSTRAINT pk_user_governments PRIMARY KEY (user_government_id),
CONSTRAINT fk_user_governments_local_govs_local_gov_id FOREIGN KEY (local_gov_id)
REFERENCES local_govs (local_gov_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE CASCADE,
CONSTRAINT fk_user_governments_states_state_id FOREIGN KEY (state_id)
REFERENCES states (state_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE CASCADE,
CONSTRAINT fk_user_governments_users_user_id FOREIGN KEY (user_id)
REFERENCES users (user_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE CASCADE
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
I have also tried running the following command, as suggested in this post:
SELECT SETVAL((SELECT PG_GET_SERIAL_SEQUENCE('user_governments', 'user_government_id')), (SELECT (MAX("user_government_id") + 1) FROM "user_governments"), FALSE);
but I get error:
ERROR: relation "user_governments" does not exist
IDENTITY is an auto-increment that is built into the table definition. PostgreSQL still backs it with an internal sequence, but you do not need PG_GET_SERIAL_SEQUENCE, which is mainly meant for serial columns whose sequence is managed separately from the table. So there is no need for a query like:
SELECT SETVAL((SELECT PG_GET_SERIAL_SEQUENCE('user_governments', 'user_government_id')),
(SELECT (MAX("user_government_id") + 1) FROM "user_governments"), FALSE)
If your goal is to reset the seed of an IDENTITY column, use ALTER TABLE ... ALTER COLUMN ... RESTART WITH. It accepts only a constant, not a subquery, so look up the current maximum first:
SELECT MAX(user_government_id) + 1 FROM user_governments;
ALTER TABLE user_governments
ALTER COLUMN user_government_id RESTART WITH 1000; -- 1000 is a placeholder: use the value returned by the query above
It turned out that I did not build the model correctly.
The user_government table had an incremental key, but I had defined the model as follows
modelBuilder.Entity<UserGovernment>()
.HasKey(bc => new { bc.UserId, bc.LocalGovId });
I replaced it with:
modelBuilder.Entity<UserGovernment>()
.HasKey(bc => new { bc.UserGovernmentId});
The Journey :)
Initially I found that once I commented out the following line:
_context.UserGovernments.AddRange(userGovernments);
the data was inserted with user_government_id set to 0.
Then I tried manually assigning a value to user_government_id, and that insert also succeeded. This led me to check my model builder code.

DB2 Update statement not working using JDBC

I have a few rows stored in a source table (as defined as $schema.$sourceTable in the UPDATE query below). This table has 3 columns: TABLE_NAME, PERMISSION_TAG_COL, PT_DEPLOYED
I have an update statement stored in a string like:
var update_PT_Deploy = s"UPDATE $schema.$sourceTable SET PT_DEPLOYED = 'Y' WHERE TABLE_NAME = '$tableName';"
My source table does have rows with TABLE_NAME as $tableName (parameter) as I inserted rows into this table using another function of my program. The default value of PT_DEPLOYED when I inserted the rows was specified as NULL.
I'm trying to execute update using JDBC in the following manner:
println(update_PT_Deploy)
val preparedStatement: PreparedStatement = connection.prepareStatement(update_PT_Deploy)
val row = preparedStatement.execute()
println(row)
println("row updated in table successfully")
preparedStatement.close()
The above piece of code does not throw any exception, but when I query my table in a tool like DBeaver, the NULL value of PT_DEPLOYED does not get updated to Y.
If I execute the same query stored in update_PT_Deploy inside DBeaver, it works and the table is updated. As far as I can tell, I am following the correct steps.

How to config hibernate isolation level for postgres

I have a table ErrorCase in a Postgres database. This table has a field case_id with data type text. Its value is generated in the format yymmdd_xxxx, where yymmdd is the date the record was inserted into the DB and xxxx is the sequence number of the record on that date.
For example, the third error case on 2019/08/01 will have case_id = 190801_0003. On 08/04, if there is one more case, its case_id will be 190804_0001, and so on.
I am already using a trigger in the database to generate the value for this field:
DECLARE
total integer;
BEGIN
SELECT (COUNT(*) + 1) INTO total FROM public.ErrorCase WHERE create_at = current_date;
IF (NEW.case_id is null) THEN
NEW.case_id = to_char(current_timestamp, 'YYMMDD_') || trim(to_char(total, '0000'));
END IF;
RETURN NEW;
END
In the Spring project, I configured the application properties for JPA/Hibernate:
datasource:
    type: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://localhost:5432/table_name
    username: postgres
    password: postgres
    hikari:
        poolName: Hikari
        auto-commit: false
jpa:
    database-platform: io.github.jhipster.domain.util.FixedPostgreSQL82Dialect
    database: POSTGRESQL
    show-sql: true
    properties:
        hibernate.id.new_generator_mappings: true
        hibernate.connection.provider_disables_autocommit: true
        hibernate.cache.use_second_level_cache: true
        hibernate.cache.use_query_cache: false
        hibernate.generate_statistics: true
Currently, it generates the case_id correctly.
However, when many records are inserted at nearly the same time, it generates the same case_id for two records. I guess the reason is the isolation level. While the first transaction has not yet committed, the second transaction runs the SELECT query to build its case_id. The result of that SELECT does not include the record from the first transaction (because it has not been committed yet), so the second case_id ends up the same as the first one.
Please suggest a solution for this problem. Which isolation level is appropriate for this case?
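The race described in the question can be sketched without a database. The following is a simplified model, not the actual trigger: `next_case_id` and `simulate_race` are hypothetical names, and the list `committed` stands in for the rows that a transaction can see under read committed.

```python
from datetime import date


def next_case_id(committed_rows, today):
    """Mimic the trigger: count the visible rows for today, then add one."""
    total = sum(1 for row in committed_rows if row["create_at"] == today) + 1
    return f"{today:%y%m%d}_{total:04d}"


def simulate_race():
    today = date(2019, 8, 1)
    committed = [
        {"case_id": "190801_0001", "create_at": today},
        {"case_id": "190801_0002", "create_at": today},
    ]
    # Both transactions run the trigger before either one commits,
    # so both count the same two committed rows.
    id_from_tx1 = next_case_id(committed, today)
    id_from_tx2 = next_case_id(committed, today)
    # Only now do the two inserts become visible to other transactions.
    committed.append({"case_id": id_from_tx1, "create_at": today})
    committed.append({"case_id": id_from_tx2, "create_at": today})
    return id_from_tx1, id_from_tx2


if __name__ == "__main__":
    print(simulate_race())
```

Both calls compute the id from the same snapshot, which is why the two inserts end up with an identical case_id.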
"yymmdd is the date when the record insert to DB, xxxx is the number of record in that date" - no offense but that is a horrible design.
You should have two separate columns, one date column and one integer column. If you want to increment the counter during an insert, make that date column the primary key and use insert on conflict. You can get rid of that horribly inefficient trigger and, more importantly, this will be safe for concurrent modifications even with read committed.
Something like:
create table error_case
(
error_date date not null primary key,
counter integer not null default 1
);
Then use the following to insert rows:
insert into error_case (error_date)
values (date '2019-08-01')
on conflict (error_date) do update
set counter = error_case.counter + 1;
No trigger needed and safe for concurrent inserts.
If you really need a text column as a "case ID", create a view that returns that format:
create view v_error_case
as
select concat(to_char(error_date, 'yymmdd'), '_', to_char(counter, 'FM0000')) as case_id,
... other columns
from error_case;
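The counter approach can be tried end to end with a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of PostgreSQL (SQLite accepts a similar ON CONFLICT ... DO UPDATE clause), stores the date as ISO text, and `next_case_id` and `demo` are hypothetical helper names.

```python
import sqlite3
from datetime import date


def next_case_id(conn, day):
    """Bump the per-day counter with an upsert and return the formatted id."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """
            INSERT INTO error_case (error_date) VALUES (?)
            ON CONFLICT (error_date) DO UPDATE SET counter = counter + 1
            """,
            (day.isoformat(),),
        )
        (counter,) = conn.execute(
            "SELECT counter FROM error_case WHERE error_date = ?",
            (day.isoformat(),),
        ).fetchone()
    return f"{day:%y%m%d}_{counter:04d}"


def demo():
    # SQLite stand-in for the error_case table from the answer.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE error_case ("
        " error_date TEXT NOT NULL PRIMARY KEY,"
        " counter INTEGER NOT NULL DEFAULT 1)"
    )
    ids = [next_case_id(conn, date(2019, 8, 1)) for _ in range(3)]
    ids.append(next_case_id(conn, date(2019, 8, 4)))
    conn.close()
    return ids


if __name__ == "__main__":
    print(demo())
```

In PostgreSQL itself, the separate SELECT could be replaced by adding RETURNING counter to the insert statement, so the new value comes back from the same statement that increments it.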

How can I ignore errors in an update query?

I am running an update query like
update datavalue
set categoryoptioncomboid = '21519'
where dataelementid = '577' and
categoryoptioncomboid = '471';
but it is giving an error
ERROR: duplicate key value violates unique constraint "datavalue_pkey"
DETAIL: Key (dataelementid, periodid, sourceid, categoryoptioncomboid, attributeoptioncomboid)=(577, 35538, 10299, 21519, 15) already exists.
Is there a way to make postgres continue updating and skip any errors? Is there a way without using procedure for loop?
I'd try something like this:
update datavalue
set categoryoptioncomboid = '21519'
where
dataelementid = '577' and categoryoptioncomboid = '471'
and not exists (
select 1
from datavalue dv
where dv.dataelementid=datavalue.dataelementid
and dv.periodid=datavalue.periodid
and dv.sourceid=datavalue.sourceid
and dv.categoryoptioncomboid='21519'
and dv.attributeoptioncomboid=datavalue.attributeoptioncomboid
);
Another idea is to insert with on conflict and then delete unneeded rows. But it requires knowledge of the full definition of datavalue table columns.
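The NOT EXISTS guard can be checked on a toy copy of the table. This sketch assumes integer columns and uses Python's built-in sqlite3 module rather than PostgreSQL; the sample rows and the `demo` function are made up for illustration.

```python
import sqlite3


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE datavalue (
            dataelementid INTEGER,
            periodid INTEGER,
            sourceid INTEGER,
            categoryoptioncomboid INTEGER,
            attributeoptioncomboid INTEGER,
            PRIMARY KEY (dataelementid, periodid, sourceid,
                         categoryoptioncomboid, attributeoptioncomboid)
        )
        """
    )
    conn.executemany(
        "INSERT INTO datavalue VALUES (?, ?, ?, ?, ?)",
        [
            (577, 1, 10299, 471, 15),    # no clash: can be updated
            (577, 2, 10299, 471, 15),    # would clash with the next row
            (577, 2, 10299, 21519, 15),  # target key is already taken
        ],
    )
    cur = conn.execute(
        """
        UPDATE datavalue
        SET categoryoptioncomboid = 21519
        WHERE dataelementid = 577 AND categoryoptioncomboid = 471
          AND NOT EXISTS (
            SELECT 1 FROM datavalue dv
            WHERE dv.dataelementid = datavalue.dataelementid
              AND dv.periodid = datavalue.periodid
              AND dv.sourceid = datavalue.sourceid
              AND dv.categoryoptioncomboid = 21519
              AND dv.attributeoptioncomboid = datavalue.attributeoptioncomboid
          )
        """
    )
    updated = cur.rowcount
    rows = conn.execute(
        "SELECT periodid, categoryoptioncomboid FROM datavalue"
        " ORDER BY periodid, categoryoptioncomboid"
    ).fetchall()
    conn.close()
    return updated, rows


if __name__ == "__main__":
    print(demo())
```

Only the row whose new key is still free gets updated; the row that would collide is left as it was, and no constraint error is raised.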

Go: How to get last insert id on Postgresql with NamedExec()

I use the jmoiron/sqlx library to communicate with my PostgreSQL server in my Go apps. Somewhere in my apps I have the following code:
sqlQuery := `
INSERT INTO table_to_insert (
code,
status,
create_time,
create_by
) VALUES (
'',
0,
CURRENT_TIMESTAMP,
0
) RETURNING id
`
datas, err := tx.NamedExec(sqlQuery, structToInsert)
Question: how can I get the last insert id from the value returned by tx.NamedExec()? I've tried datas.LastInsertId(), but it always returns 0.
Note: I'm sure the insert into Postgres succeeds.
This happens because PostgreSQL does not automatically return the last inserted id. A generated id exists only when you insert a row into a table whose id column is backed by a sequence.
When you insert a row into such a table, you have to use a RETURNING clause, something like: INSERT INTO table (name) VALUES ('val') RETURNING id.
I am not sure about your driver, but in pq you would do it in the following way:
lastInsertId := 0
err = db.QueryRow("INSERT INTO brands (name) VALUES($1) RETURNING id", name).Scan(&lastInsertId)
resp.LastInsertId() (typically) only works with MySQL, and only for integer IDs: https://golang.org/pkg/database/sql/#Result
Note that since you're using sqlx (as shown by the use of NamedExec), you'll want to use tx.Get instead to execute the query and capture the returned value:
// id should match the type of your ID
// e.g. int64 for a bigserial column, or string for a uuid
var id string
err := tx.Get(&id, query, v1, v2, v3)
See this relevant discussion on the sqlx GitHub repo: https://github.com/jmoiron/sqlx/issues/154#issuecomment-148216948