How to avoid a JPQL query executing too many SQL queries? - spring-data-jpa

I have two objects, Client and Procedure. A Client can have a list of Procedures, and a Procedure can only be linked to one Client.
In Client, I have a transient attribute nbProcedure that stores the number of procedures for a Client.
I use Spring Data JPA with the following query:
@Query("""SELECT new Client(c, count(p))
FROM Procedure p
INNER JOIN p.client c
WHERE c.userId = ?1
GROUP BY c.id""")
fun getByUserIdOrderByNameWithNbProcedure(userId: String): List<Client>
I see in the log that this query is executed, but after that one additional query is executed per row to select all the properties of the Client.
How can I avoid the per-row queries and keep only a single executed query?
I think I am either missing some configuration or misusing new Client(c, count(p)).

This may be caused by lazily fetched associations in the entity that have not been initialized. You can use JOIN FETCH instead of JOIN to avoid the extra queries.
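To make the pattern from the log concrete outside of JPA, here is a minimal sketch in Python with the standard sqlite3 module. It is not Spring Data or Hibernate code; the table and column names (clients, procedures, user_id, name) and the helper functions are hypothetical. It only contrasts "one grouped query plus one lookup per row" with a single query that returns the client columns and the count together.

```python
import sqlite3

# Hypothetical schema standing in for the Client and Procedure entities.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, user_id TEXT, name TEXT);
    CREATE TABLE procedures (
        id INTEGER PRIMARY KEY,
        client_id INTEGER REFERENCES clients(id)
    );
    INSERT INTO clients VALUES (1, 'u1', 'Alice'), (2, 'u1', 'Bob'), (3, 'u2', 'Carol');
    INSERT INTO procedures VALUES (10, 1), (11, 1), (12, 2), (13, 3);
""")

executed = []  # every SQL statement sent through run()


def run(sql, params=()):
    """Execute a statement and record it so the statements can be counted."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def counts_then_row_lookups(user_id):
    """One grouped query, then one lookup per returned client."""
    grouped = run(
        "SELECT c.id, COUNT(p.id) FROM procedures p "
        "JOIN clients c ON c.id = p.client_id "
        "WHERE c.user_id = ? GROUP BY c.id ORDER BY c.id",
        (user_id,),
    )
    result = []
    for client_id, nb_procedure in grouped:
        name = run("SELECT name FROM clients WHERE id = ?", (client_id,))[0][0]
        result.append((client_id, name, nb_procedure))
    return result


def single_query(user_id):
    """Client columns and the count fetched together in one statement."""
    return run(
        "SELECT c.id, c.name, COUNT(p.id) FROM procedures p "
        "JOIN clients c ON c.id = p.client_id "
        "WHERE c.user_id = ? GROUP BY c.id, c.name ORDER BY c.id",
        (user_id,),
    )


executed.clear()
slow = counts_then_row_lookups("u1")
statements_slow = len(executed)

executed.clear()
fast = single_query("u1")
statements_fast = len(executed)

print(statements_slow, statements_fast)
```

Both functions return the same rows; only the number of statements differs. Whether a given JPA provider produces the single-statement form for a constructor expression depends on the provider and the mapping, which this sketch does not cover.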

Related

SQL Tables with no direct relation

I have a PostgreSQL DB and I need to build a query to retrieve the information from a table that has no direct link with the main one.
The client is linked to the client_identity_document through its UUID, and client_identity_document to the identity_document through the uuid_identity_document. I know I have to make an inner join, but I just started with relational databases and I don't know the exact syntax to join tables that don't have a direct relation.
Try this:
select *
from client c
inner join client_identity_document cid on c.UUID = cid.UUID_Client
inner join identity_document id on cid.uuid_identity_document = id.UUID
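The two-step join can be tried out on a throwaway database. The following is a small sketch using Python's sqlite3 module rather than PostgreSQL; the name and doc_type columns and the sample rows are made up for illustration, while the join columns follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (uuid TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE identity_document (uuid TEXT PRIMARY KEY, doc_type TEXT);
    -- Link table: one row per (client, document) pair.
    CREATE TABLE client_identity_document (
        uuid_client TEXT REFERENCES client(uuid),
        uuid_identity_document TEXT REFERENCES identity_document(uuid)
    );
    INSERT INTO client VALUES ('c1', 'Alice'), ('c2', 'Bob');
    INSERT INTO identity_document VALUES ('d1', 'passport'), ('d2', 'driver licence');
    INSERT INTO client_identity_document VALUES ('c1', 'd1'), ('c1', 'd2');
""")

# Hop from client to the link table, then from the link table to the document.
rows = conn.execute("""
    SELECT c.name, idoc.doc_type
    FROM client c
    INNER JOIN client_identity_document cid ON c.uuid = cid.uuid_client
    INNER JOIN identity_document idoc ON cid.uuid_identity_document = idoc.uuid
    ORDER BY idoc.doc_type
""").fetchall()

for name, doc_type in rows:
    print(name, doc_type)
```

Because both joins are inner joins, a client without any linked document (Bob in this sample data) does not appear in the result.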

SSIS Exporting from one DB to another using parameters in the destination

I have data on table T1 in database DB1.
I need to move to table T2 in database DB2 only those rows of T1 whose field F1 equals certain values.
These values are in a table TC in database DB2. DB1 cannot refer to DB2, therefore in the Data Flow component of my dtsx I cannot write a query with a join between T1 and TC.
I see two possible paths:
1. I could first import all the rows from T1, then filter them in the dtsx before pouring them into T2.
2. Instead of having a SQL query to get the data from DB1, I could write a stored procedure in DB1 that accepts a table-valued parameter, and then I could try (I don't know how) to put my parameters (1,2,4 in the example) in the TVP and invoke the stored procedure with it.
I have to do this kind of import for dozens of tables, therefore solution 2 seems really too convoluted and complicated. Solution 1 seems to do too much useless work, first importing everything and then discarding part of what was imported.
Is there a best practice or smart trick in this case?
Thank you
You can use a Merge Join to join the data from T1 and TC on field F1. Use two data sources, one for T1 and one for TC; no parameters are applied yet. You need a Sort component to sort both result sets on the join field (F1) for the Merge Join to work. Then set the join type to inner in the Merge Join component. This is where the values from TC are applied to T1: the inner join keeps only the T1 rows whose F1 appears in TC. Finally, export the result to T2.
Another way is to import everything from T1 into a temp table on DB2, call it T2_temp; then you can use a query to join T2_temp with TC on F1 and insert the result into T2.
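The staging-table variant can be sketched outside of SSIS. The example below uses two separate in-memory SQLite databases in Python to stand in for DB1 and DB2; the payload column and the sample rows are assumptions, and the values 1, 2, 4 in TC are taken from the question. It shows the data flow only, not an SSIS package.

```python
import sqlite3

# Two independent connections: neither database can see the other's tables.
db1 = sqlite3.connect(":memory:")
db2 = sqlite3.connect(":memory:")

db1.executescript("""
    CREATE TABLE T1 (F1 INTEGER, payload TEXT);
    INSERT INTO T1 VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
""")
db2.executescript("""
    CREATE TABLE TC (F1 INTEGER);
    CREATE TABLE T2 (F1 INTEGER, payload TEXT);
    CREATE TABLE T2_temp (F1 INTEGER, payload TEXT);
    INSERT INTO TC VALUES (1), (2), (4);
""")

# Step 1: copy everything from T1 (DB1) into the staging table on DB2.
all_rows = db1.execute("SELECT F1, payload FROM T1").fetchall()
db2.executemany("INSERT INTO T2_temp VALUES (?, ?)", all_rows)

# Step 2: on DB2, keep only the rows whose F1 appears in TC.
db2.execute("""
    INSERT INTO T2 (F1, payload)
    SELECT t.F1, t.payload
    FROM T2_temp t
    INNER JOIN TC ON TC.F1 = t.F1
""")

# Step 3: empty the staging table so it can be reused for the next load.
db2.execute("DELETE FROM T2_temp")
db2.commit()

loaded = db2.execute("SELECT F1, payload FROM T2 ORDER BY F1").fetchall()
print(loaded)
```

The trade-off is the one raised in the question: every T1 row crosses over to DB2 before the filter is applied, in exchange for a filter that is a plain join and needs no parameters.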

IN clause with large list in OpenJpa causing too complex statement

I have to create a named query where I need to group my results by some fields and also use an IN clause to limit my results.
It looks something like this:
SELECT new MyDTO(e.objID) FROM Entity e WHERE e.objId IN (:listOfIDs) GROUP BY e.attr1, e.attr2
I'm using OpenJPA and IBM DB2. In some cases my list of IDs can be very large (> 80,000 IDs), and then the generated SQL statement becomes too complex for DB2, because the final generated statement prints out all IDs, like this:
SELECT new MyDTO(e.objID) FROM Entity e WHERE e.objId IN (1,2,3,4,5,6,7,...) GROUP BY e.attr1, e.attr2
Is there any good way to handle this kind of query? A possible workaround would be to write the IDs to a temporary table and then use the IN clause on this table.
You should put all of the values in a table and rewrite the query as a join. This will not only solve your query problem, it should be more efficient as well.
declare global temporary table ids (
objId int
) with replace on commit preserve rows;
--If this statement is too long, use a couple of insert statements.
insert into session.ids values
(1), (2), (3), (4), ....;
select new mydto(e.objID)
from entity e
join session.ids i on
e.objId = i.objId
group by e.attr1, e.attr2;
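As a rough illustration of the temp-table approach, here is a sketch in Python with sqlite3 instead of DB2 and OpenJPA. The entity table layout, the generated sample data and the COUNT(*) projection are assumptions made so the example runs on its own; the point is that the IDs travel as rows through a parameterized insert, so the statement text stays short no matter how many IDs there are.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (objId INTEGER PRIMARY KEY, attr1 TEXT, attr2 TEXT)")
conn.executemany(
    "INSERT INTO entity VALUES (?, ?, ?)",
    [(i, f"a{i % 3}", f"b{i % 2}") for i in range(1, 100001)],
)

# A large ID list that would otherwise be inlined into IN (...).
list_of_ids = list(range(1, 80001))

# Load the IDs into a temporary table, one row per ID.
conn.execute("CREATE TEMP TABLE ids (objId INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO ids VALUES (?)", ((i,) for i in list_of_ids))

# The join replaces the IN clause; the statement text is independent of the list size.
query = """
    SELECT e.attr1, e.attr2, COUNT(*)
    FROM entity e
    JOIN ids i ON e.objId = i.objId
    GROUP BY e.attr1, e.attr2
    ORDER BY e.attr1, e.attr2
"""
groups = conn.execute(query).fetchall()

for attr1, attr2, n in groups:
    print(attr1, attr2, n)
```

With a JPA provider, a statement like this would normally be issued as a native query, since the temporary table is not a mapped entity.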

DB2 Query Structure Using User-Defined Function as a Table

I'm a little new to DB2 and am having trouble developing a query. I have created a user-defined function that returns a table of data, which I then want to join and select from in a larger select statement. I'm working on a sensitive db, so the query below isn't what I'm literally running, but it's almost exactly like it (without the other 10 joins I have to do lol).
select
A.customerId,
A.firstname,
A.lastname,
B.orderId,
B.orderDate,
F.currentLocationDate,
F.currentLocation
from
customer A
INNER JOIN order B
on A.customerId = B.customerId
INNER JOIN table(getShippingHistory(B.customerId)) as F
on B.orderId = F.orderId
where B.orderId = 35
This works great if I run this query without the where clause (or some other where clause that doesn't check for an ID). When I include the where clause, I get the following error:
Error during Prepare 58004(-901)[IBM][CLI Driver][DB2/LINUXX8664]
SQL0901N The SQL statement failed because of a non-severe system
error. Subsequent SQL statements can be processed. (Reason "Bad Plan;
Unresolved QNC found".) SQLSTATE=58004
I have tracked the issue down to the fact that I'm using one of the join criteria as the parameter (B.customerId). I have validated this by replacing B.customerId with a valid customerId, and the query works great. Problem is, I don't know the customerId when calling this query. I know only the orderId (in this example).
Any thoughts on how to restructure this so I can make only 1 call to get all the info? I know the plan is the problem b/c the customerId isn't getting resolved before the function is called.
So if I understand correctly, the function getShippingHistory(customerId) returns a table.
And if you call it with a single customer Id that table gets joined in your query above no problem at all.
But the way you have the query written above, you are asking db2 to call the function for every row returned by your query (i.e. every b.customerId that matches your join and where conditions).
So I'm not sure what behaviour you are expecting, because what you're asking for is a table back for every row in your query, and neither db2 nor I can figure out what the result is supposed to look like.
So in terms of restructuring your query, think about how you can change the getShippingHistory logic when multiple customer Ids are involved.
I found the best solution (given the current query structure) is to use a LEFT join instead of an INNER join in order to force the left part of the join to happen first, which resolves the customerId to a value by the time it gets to the function call.
select
A.customerId,
A.firstname,
A.lastname,
B.orderId,
B.orderDate,
F.currentLocationDate,
F.currentLocation
from
customer A
INNER JOIN order B
on A.customerId = B.customerId
LEFT JOIN table(getShippingHistory(B.customerId)) as F
on B.orderId = F.orderId
where B.orderId = 35

Advanced queries in linq to entities or stored procedures

I'm currently building an application using Entity Framework. Normally I would use a stored procedure to get specific data from my database, but now I'm experimenting with Entity Framework.
Now I'm facing a small challenge. I have an incident log table with a primary key, an incident id, and some data fields. I need to get the newest row for each incident. The SQL is quite easy:
select * from incidentLog t
join (select incidentId,max(id) as id from incidentLog group by incidentId) tmp on t.id=tmp.id
How can I convert this to LINQ to Entities?
Can I do it in one operation at all or should I use a stored procedure instead?
This should do the trick:
var query =
(from i in context.IncidentLogs
group i by i.IncidentId into g
let maxID = g.Max(i => i.id)
select g.Where(i => i.id == maxID)).ToList();
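The intent of both the SQL and the LINQ query (group by incident, take the row with the highest id in each group) can be checked on sample data. The sketch below is in Python with sqlite3, not Entity Framework; the data column and the sample rows are invented. It runs the SQL from the question and compares it with the same group-then-max logic written by hand.

```python
import sqlite3

# (id, incidentId, data): several log rows per incident, newest has the highest id.
log_rows = [
    (1, 100, "opened"),
    (2, 100, "assigned"),
    (3, 200, "opened"),
    (4, 100, "closed"),
    (5, 200, "closed"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incidentLog (id INTEGER PRIMARY KEY, incidentId INTEGER, data TEXT)")
conn.executemany("INSERT INTO incidentLog VALUES (?, ?, ?)", log_rows)

# The join from the question: each row is kept only if its id is the max of its group.
newest_sql = conn.execute("""
    SELECT t.id, t.incidentId, t.data
    FROM incidentLog t
    JOIN (SELECT incidentId, MAX(id) AS id
          FROM incidentLog
          GROUP BY incidentId) tmp ON t.id = tmp.id
    ORDER BY t.id
""").fetchall()

# The same logic in plain Python: group by incidentId, keep the row with the max id.
newest_by_incident = {}
for row in log_rows:
    row_id, incident_id, _ = row
    current = newest_by_incident.get(incident_id)
    if current is None or row_id > current[0]:
        newest_by_incident[incident_id] = row
newest_py = sorted(newest_by_incident.values())

print(newest_sql)
print(newest_py)
```

Both approaches yield exactly one row per incident, which is the shape to expect from the translated query as well.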