What is the method to cache results using JPA when queried by a non-primary key?

I have a JPA entity EntityA mapped to a table TABLE_A with columns like:
OBJ_ID | REF_ID | STATUS | ... where OBJ_ID is the primary key and REF_ID is a foreign key referencing another table, TABLE_B, which is not managed by JPA.
I need to be able to:
1. Return all rows which relate to a given value of a non-primary key, say REF_ID:
SELECT * FROM TABLE_A WHERE REF_ID = 'REF_ID_1'
2. Cache the results from (1), so I do not hit the database until a write has been performed.
3. During a write (UPDATE / DELETE) within a transaction, read back all rows, including uncommitted data, as in (1):
Code:
fetchAllByRefId('REF_ID_1');
// Fires: SELECT * FROM TABLE_A WHERE REF_ID = 'REF_ID_1'
// Let's assume this returns 1 row
Transaction {
    update('REF_ID_1', 'REF_ID_2');
    // Fires: UPDATE TABLE_A SET REF_ID = 'REF_ID_2' WHERE REF_ID = 'REF_ID_1'
    fetchAllByRefId('REF_ID_1');
    // Fires: SELECT * FROM TABLE_A WHERE REF_ID = 'REF_ID_1'
    // This should return 0 rows
}
fetchAllByRefId('REF_ID_1');
// Fires: SELECT * FROM TABLE_A WHERE REF_ID = 'REF_ID_1'
// If the transaction gets committed, this should return 0 rows
// If rolled back, should return 1 row
I tried to use EclipseLink's implementation of JPA to map the entity against TABLE_A.
I am using the object (L2) cache to cache by primary key, and using the query cache to cache results queried by a non-primary key field such as REF_ID or STATUS.
From the documentation of Query Results cache:
The query results cache does not pick up committed changes from the application as the object cache does. It should only be used to cache read-only objects, or should use an invalidation policy to avoid caching stale results. Committed changes to the objects in the result set will still be picked up, but changes that affect the results set (such as new or changed objects that should be added/removed from the result set) will not be picked up.
Further, if the query cache is not the correct thing to use for caching the results, what is the way to cache results fetched by a non-primary key field?
I was not able to find an equivalent to Spring Data's findAll(Example<S>) where I can specify a predicate, in the EntityManager API - everything seems to require the primary key. I have tried CriteriaQuery, but was unable to cache its results.

A query cache, as the documentation states, is a query result cache: it stores the rows the database returned. Those results are not entities, so they are not maintained within a UnitOfWork or EntityManager context; once read in, they are unchanged until they are cleared. That isn't to say you will get stale data back; there is a separate entity cache that still gets used, which you may need to configure (it is on by default). What that means is that if you execute a SELECT * FROM TABLE_A WHERE REF_ID = 'REF_ID_1' query, EclipseLink will cache and return the rows that matched instead of hitting the database, but it will use the entity cache to build the entity objects.
So it will return all TableA instances that had REF_ID = 'REF_ID_1' as of the time the query result was cached, but if those entities exist in the entity cache, they may show a different ref_id, as well as any other state that has since changed. This also means the result might exclude new instances or include deleted ones if the query result is very stale. As of EclipseLink 2.5, EclipseLink can invalidate query cache results based on the changes involved, which will cause your query to refresh from the database on its next execution.
Your issue, though, seems to be with how query caches must be used: EclipseLink only supports query caches through named queries. See the answer here for details on how to add named queries. In Spring Data, named queries are looked up by method name before Spring tries to generate a query.
So a "TableA.fetchAllByRefId" named query would get used on a TableARepository class method:
List<TableA> fetchAllByRefId(@Param("REF_ID") String refId);
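A minimal sketch of how that could be wired up, assuming a refId field on the entity and EclipseLink's query-results-cache hints (the field names and the expiry value are assumptions, not from the original post):

import java.util.List;
import javax.persistence.*;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.repository.query.Param;

@Entity
@Table(name = "TABLE_A")
@NamedQuery(
        name = "TableA.fetchAllByRefId",
        query = "SELECT a FROM TableA a WHERE a.refId = :REF_ID",
        hints = {
                // EclipseLink-specific hint that enables the query results cache for this named query
                @QueryHint(name = "eclipselink.query-results-cache", value = "true"),
                // optional: expire cached results after 10 minutes to limit staleness (assumed value)
                @QueryHint(name = "eclipselink.query-results-cache.expiry", value = "600000")
        })
public class TableA {
    @Id
    @Column(name = "OBJ_ID")
    private String objId;

    @Column(name = "REF_ID")
    private String refId;

    // getters/setters omitted
}

public interface TableARepository extends JpaRepository<TableA, String> {
    // Spring Data resolves the named query "TableA.fetchAllByRefId" by method name
    // before it falls back to deriving a query from the method name
    List<TableA> fetchAllByRefId(@Param("REF_ID") String refId);
}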

Related

PostgreSQL: Return auto-generated ids from COPY FROM insertion

I have a non-empty PostgreSQL table with a GENERATED ALWAYS AS IDENTITY column id. I do a bulk insert with the C++ binding pqxx::stream_to, which I'm assuming uses COPY FROM. My problem is that I want to know the ids of the newly created rows, but COPY FROM has no RETURNING clause. I see several possible solutions, but I'm not sure if any of them is good, or which one is the least bad:
1. Provide the ids manually through COPY FROM, taking care to give the values which the identity sequence would have provided, then afterwards synchronize the sequence with setval(...).
2. First stream the data to a temp table with a custom index column for ordering, then do something like:
INSERT INTO foo (col1, col2)
SELECT ttFoo.col1, ttFoo.col2 FROM ttFoo
ORDER BY ttFoo.idx
RETURNING foo.id
and depend on the fact that the identity sequence produces ascending numbers to correlate them with ttFoo.idx (I cannot do RETURNING ttFoo.idx too, because only the inserted row is available there, and it doesn't contain idx).
3. Query the current value of the identity sequence prior to insertion, then check afterwards which rows are new.
I would assume that this is a common situation, yet I don't see an obviously correct solution. What do you recommend?
You can find out which rows have been affected by your current transaction using the system columns. The xmin column contains the ID of the inserting transaction, so to return the id values you just copied, you could:
BEGIN;
COPY foo(col1,col2) FROM STDIN;
SELECT id FROM foo
WHERE xmin::text = (txid_current() % (2^32)::bigint)::text
ORDER BY id;
COMMIT;
The WHERE clause comes from this answer, which explains the reasoning behind it.
I don't think there's any way to optimise this with an index, so it might be too slow on a large table. If so, I think your second option would be the way to go, i.e. stream into a temp table and INSERT ... RETURNING.
Alternatively, you could create the id column with type uuid.
Generate random ids on the client side first, and then bulk insert them; that way you will not need to return the ids from the database.
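For illustration only (the idea shown in Java/JDBC rather than pqxx; the foo table and its columns are placeholders): generate the UUIDs on the client, remember them, and batch-insert the rows, so nothing needs to be read back.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class UuidBulkInsert {
    // each element of rows is {col1, col2}
    static List<UUID> insertAll(String jdbcUrl, List<String[]> rows) throws Exception {
        List<UUID> ids = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO foo (id, col1, col2) VALUES (?, ?, ?)")) {
            for (String[] row : rows) {
                UUID id = UUID.randomUUID();  // generated client-side, so no RETURNING is needed
                ids.add(id);
                ps.setObject(1, id);
                ps.setString(2, row[0]);
                ps.setString(3, row[1]);
                ps.addBatch();
            }
            ps.executeBatch();
        }
        return ids;  // the primary keys of the rows just inserted
    }
}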

Should I use FOR UPDATE in the SELECT subquery here: DELETE FROM table WHERE id = any(array(SELECT id FROM table WHERE ... LIMIT 100))

I know that if I use ctid I should use FOR UPDATE in the subquery, because the row can be updated by another transaction while my transaction tries to delete it; as a result, the row would not be deleted. The right way:
DELETE FROM table WHERE ctid = any(array(
SELECT ctid
FROM table
WHERE ...
LIMIT 100
FOR UPDATE));
If I use the primary key the same way, do I need to use FOR UPDATE in the SELECT subquery? If not, why not?
DELETE FROM table WHERE id = any(array(
SELECT id
FROM table
WHERE ...
LIMIT 100
FOR UPDATE));
The same could happen with the primary key, although I'd expect it to happen less often (primary keys should not change).
But you need that FOR UPDATE not only because the row could be modified: without it, the subquery would also see rows that are being deleted by a concurrent statement, which will prove non-existent when you try to delete them.
Finally, it would be a good thing to have an ORDER BY in the subquery that can use an index. Then all such queries will try to lock rows in the same order, which reduces the likelihood of deadlocks.

How to get the return value from an insert query defined in a @Query annotation

I have two tables, city and shape, where city has a composite primary key (city and country) and an auto-generated id value. This id is a foreign key for shape.
So after inserting data into the city table, I want the id that was generated so it can be used in the shape table. I am using Spring Boot with JPA and Postgres.
In CityRepository I have a custom save method which will do nothing on conflict.
I have tried the code below to get the returned value, but I get the error
SqlExceptionHelper : A result was returned when none was expected.
How to get the returning value from insert query?
@Modifying
@Query(value = "insert into public.city_info(city, country) values(:city, :country) on conflict do nothing returning city_id", nativeQuery = true)
@Transactional
Integer save(@Param("city") String city, @Param("country") String country);
I'm afraid that's not possible with JPA.
If you look at the JPA Query API:
int executeUpdate()
Execute an update or delete statement.
Returns: the number of entities updated or deleted
There is no other return value possible.
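For illustration, a Spring Data @Modifying query is executed through executeUpdate() under the hood, so the only thing you can get back is the affected row count. A rough sketch with EntityManager (not the original code):

// returns the number of affected rows, never the generated city_id
int rows = entityManager.createNativeQuery(
        "insert into public.city_info(city, country) values (:city, :country) on conflict do nothing")
    .setParameter("city", city)
    .setParameter("country", country)
    .executeUpdate();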

Insert after delete same transaction in Spring Data JPA

Using Spring Data JPA, I have the following flow inside the same transaction (REQUIRES_NEW):
Remove a set of the user's predictions with this Spring Data JPA repository method:
#Query(value = "DELETE FROM TRespuestaUsuarioPrediccion resp WHERE resp.idEvento.id = :eventId AND resp.idUsuario.id = :userId")
#Modifying
void deleteUserPredictions(#Param("userId") int userId, #Param("eventId") int eventId);
Insert the new user's predictions and save the master object (event).
eventRepository.save(event);
When this service finishes, the commit is made by AOP, but it only works on the first attempt, not on the following ones.
How can I manage this situation without iterating over the event's prediction entries and updating each one individually?
UPDATE
I tried the following and it doesn't work (the adapter re-inserts the objects I removed before):
@Transactional(propagation = Propagation.REQUIRES_NEW, rollbackFor = PlayTheGuruException.class)
private void updateUserPredictions(final TUsuario user, final TEvento event, final SubmitParticipationRequestDTO eventParticipationRequestDTO)
{
    eventRepository.deleteUserPredictions(user.getId(), event.getId());
    EventAdapter.predictionParticipationDto2Model(user, event, eventParticipationRequestDTO);
    eventRepository.save(event);
}
Hibernate changes the order of the commands. It works in the order below:
Execute all SQL and second-level cache updates, in a special order so that foreign-key constraints cannot be violated:
1. Inserts, in the order they were performed
2. Updates
3. Deletion of collection elements
4. Insertion of collection elements
5. Deletes, in the order they were performed
And that is exactly the case. When flushing, Hibernate executes all inserts before delete statements.
The possible options are:
1. Call entityManager.flush() explicitly just after the delete (see the sketch after this list).
2. Wherever possible, update existing rows and build a to-be-deleted list from the rest. This ensures that existing records are updated with new values and completely new records are saved.
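A sketch of option 1 applied to the method from the question, assuming an EntityManager injected with @PersistenceContext (not taken from the original code):

@PersistenceContext
private EntityManager entityManager;

...
eventRepository.deleteUserPredictions(user.getId(), event.getId());
entityManager.flush();  // forces the DELETE to hit the database before the inserts queued by save()
EventAdapter.predictionParticipationDto2Model(user, event, eventParticipationRequestDTO);
eventRepository.save(event);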
PostgreSQL (and maybe other databases as well) has the ability to defer constraint checking until commit, meaning that it accepts duplicates within the transaction but enforces the unique constraint when committing.
ALTER TABLE <table name> ADD CONSTRAINT <constraint name> UNIQUE(<column1>, <column2>, ...) DEFERRABLE INITIALLY DEFERRED;

SQL Merge Query - Executing Additional Query

I have written a working T-SQL MERGE statement. The premise is that Database A contains records about customer's support calls. If they are returning a product for repair, Database B is to be populated with certain data elements from Database A (e.g. customer name, address, product ID, serial number, etc.) So I will run an SQL Server job that executes an SSIS package every half hour or so, in which the MERGE will do one of the following:
1. If the support call in Database A requires a product return and it is not in Database B, INSERT it into Database B.
2. If the support call in Database A requires a product return and it is in Database B, but data has changed, UPDATE it in Database B.
3. If there is a product return in Database B but it is no longer indicated as a product return in Database A (yes, this can happen: a customer can change their mind at a later time/date and not want to pay for a replacement product), DELETE it from Database B.
My problem is that Database B has an additional table with a 1-to-many FK relationship with the table being populated in the MERGE. I do not know how, or even if, I can go about using a MERGE statement to first delete the records in the table with FK constraint before deleting the records as I am currently doing in my MERGE statement.
Obviously, one way would be to get rid of the DELETE in the MERGE and hack out writing IDs to delete in a temp table, then deleting from the FK table, then the PK table. But if I can somehow delete from both tables in WHEN NOT MATCHED BY SOURCE that would be cleaner code. Can this be done?
You can only UPDATE, DELETE, or INSERT into/from one table per query.
However, if you added an ON DELETE CASCADE to the FK relationship, the sub-table would be cleaned up as you delete from the primary table, and it would be handled in a single operation.