JPQL In clause error - Statement too complex - jpa

The following code blows up when the list passed to the "IN" clause has many values. In my case there are 1,400 values. The customer table also holds a large number of records (around 100,000). The query runs against a Derby database.
public List<Customer> getCustomersNotIn(String custType, List<Integer> customersIDs) {
TypedQuery<Customer> query = em.createQuery("from Customer where type=:custType and customerId not in (:customersIDs)", Customer.class);
query.setParameter("custType", custType);
query.setParameter("customersIDs", customersIDs);
List<Customer> customerList = query.getResultList();
return customerList;
}
The method above executes perfectly when the list has fewer values (probably fewer than 1,000). When customersIDs has more values, the IN clause built from it fails with the error "Statement too complex".
Since I am new to JPA, can anyone please tell me how to write the above function in the way described below? Please read the comments in the code.
public List<Customer> getCustomersNotIn(String custType, List<Integer> customersIDs) {
// CREATE A IN-MEMORY TEMP TABLE HERE...
// INSERT ALL VALUES FROM customerIDs collection into temp table
// Change following query to get all customers EXCEPT THOSE IN TEMP TABLE
TypedQuery<Customer> query = em.createQuery("from Customer where type=:custType and customerId not in (:customersIDs)", Customer.class);
query.setParameter("custType", custType);
query.setParameter("customersIDs", customersIDs);
List<Customer> customerList = query.getResultList();
// REMOVE THE TEMP TABLE FROM MEMORY
return customerList;
}

The Derby IN clause support does have a limit on the number of values that can be supplied in the IN clause.
The limit is related to an underlying limitation in the size of a single function in the Java bytecode format; Derby currently implements IN clause execution by generating Java bytecode to evaluate the IN clause, and if the generated bytecode would exceed the JVM's basic limitations, Derby throws the "statement too complex" error.
There have been discussions about ways to fix this, for example see:
DERBY-6784
DERBY-6301, or
DERBY-216
But for now, your best approach is probably to find a way to express your query without generating such a large and complex IN clause.
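One way to avoid the large IN clause altogether, assuming the rows matching the type filter fit comfortably in memory, is to query by type only and drop the excluded IDs on the client. The following is a minimal sketch of that client-side step, not a definitive implementation. The `Customer` class is a hypothetical stand-in for the entity, and `byType` stands for the result of a query that filters on `type` alone.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NotInFilter {
    // Hypothetical stand-in for the Customer entity; only the id matters here.
    static final class Customer {
        final int customerId;

        Customer(int customerId) {
            this.customerId = customerId;
        }
    }

    // Keeps only the customers whose id is NOT in excludedIds.
    // Copying the ids into a HashSet makes each membership check O(1) on average.
    static List<Customer> excluding(List<Customer> byType, List<Integer> excludedIds) {
        Set<Integer> excluded = new HashSet<>(excludedIds);
        List<Customer> kept = new ArrayList<>();
        for (Customer c : byType) {
            if (!excluded.contains(c.customerId)) {
                kept.add(c);
            }
        }
        return kept;
    }
}
```

The trade-off is that more rows travel from the database to the application, so this only makes sense when the type filter is reasonably selective.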

OK, here is the solution that worked for me. I could not change the part that generates the customer ID list, so the solution had to live within this method. Bryan, your explanation was the best one; I am still confused about why the "in" clause worked perfectly with a table. Please see the solution below.
public List<Customer> getCustomersNotIn(String custType, List<Integer> customersIDs) {
    // INSERT customerIds INTO TEMP TABLE
    storeCustomerIdsIntoTempTable(customersIDs);
    // I AM NOT SURE HOW, BUT THE "not in" CLAUSE WORKED WITH A TABLE AND DIDN'T WORK WHEN PASSING LIST VALUES.
    // Customer.class is required for createQuery to return a TypedQuery<Customer>
    TypedQuery<Customer> query = em.createQuery("select c from Customer c where c.customerType=:custType and c.customerId not in (select customerId from TempCustomer)", Customer.class);
    query.setParameter("custType", custType);
    List<Customer> customerList = query.getResultList();
    // REMOVE THE DATA FROM TEMP TABLE
    deleteCustomerIdsFromTempTable();
    return customerList;
}
private void storeCustomerIdsIntoTempTable(List<Integer> customersIDs){
// I ENDED UP CREATING TEMP PHYSICAL TABLE, INSTEAD OF JUST IN MEMORY TABLE
TempCustomer tempCustomer = null;
try{
tempCustomerDao.deleteAll();
for (Integer customerId : customersIDs) {
tempCustomer = new TempCustomer();
tempCustomer.customerId=customerId;
tempCustomerDao.save(tempCustomer);
}
}catch(Exception e){
// Do logging here
}
}
private void deleteCustomerIdsFromTempTable(){
try{
// Delete all data from TempCustomer table to start fresh
int deletedCount= tempCustomerDao.deleteAll();
LOGGER.debug("{} customers deleted from temp table", deletedCount);
}catch(Exception e){
// Do logging here
}
}

JPA and the underlying Hibernate simply translate it into a normal JDBC-understood query. You wouldn't write a query with 1400 elements in the IN clause manually, would you? There is no magic. Whatever doesn't work in normal SQL, wouldn't in JPQL.
I am not sure how you get that list (most likely from another query) before you call that method. Your best option would be joining those tables on the criteria used to get those IDs. Generally you want to execute correlated filters like that in one query/transaction which means one method instead of passing long lists around.
I also noticed your customerId is double - a poor choice for a PK. Typically people use long (autoincremented/sequenced, etc.) And I don't get the "temp table" logic.

Related

Entity Framework: Left Join with List Result

I'm trying to optimize my EF queries. I have an entity called Employee. Each employee has a list of tools. Ultimately, I'm trying to get a list of employees with their tools that are NOT broken. When running my query, I can see that TWO calls are made to the server: one for the employee entities and one for the tool list. Again, I'm trying to optimize the query, so the server is hit for a query only once. How can I do this?
I've been exploring with LINQ's join and how to create a LEFT JOIN, but the query is still not optimized.
In my first code block here, the result is what I want, but -- again -- there are two hits to the server.
public class Employee
{
public int EmployeeId { get; set; }
public List<Tool> Tools { get; set; } = new List<Tool>();
...
}
public class Tool
{
public int ToolId { get; set; }
public bool IsBroken { get; set; } = false;
public Employee Employee { get; set; }
public int EmployeeId { get; set; }
...
}
var x = (from e in db.Employees.Include(e => e.Tools)
select new Employee()
{
EmployeeId = e.EmployeeId,
Tools = e.Tools.Where(t => !t.IsBroken).ToList()
}).ToList();
This second code block roughly mimics what I'm trying to accomplish. However, the GroupBy(...) is being evaluated locally on the client machine.
(from e in db.Employees
join t in db.Tools.GroupBy(tool => tool.EmployeeId) on e.EmployeeId equals t.Key into empTool
from et in empTool.DefaultIfEmpty()
select new Employee()
{
EmployeeId = e.EmployeeId,
Tools = et != null ? et.Where(t => !t.IsBroken).ToList() : null
}).ToList();
Is there any way to make ONE call to the server, without GroupBy() being evaluated locally, and have it return a list of employees, each with a tool list filtered to tools that are not broken? Thank you.
In short, it's not possible (and I don't think it ever will be).
If you really want to control the exact server calls, EF Core is simply not for you. While EF Core still has issues with some LINQ query translation which leads to N+1 query or client evaluation, one thing is by design: unlike EF6 which uses single huge union SQL query for producing the result, EF Core uses one SQL query for the main result set plus one SQL query per each correlated result set.
This is sort of explained in the How Queries Work EF Core documentation section:
1. The LINQ query is processed by Entity Framework Core to build a representation that is ready to be processed by the database provider
2. The result is cached so that this processing does not need to be done every time the query is executed
3. The result is passed to the database provider
4. The database provider identifies which parts of the query can be evaluated in the database
5. These parts of the query are translated to database specific query language (for example, SQL for a relational database)
6. One or more queries are sent to the database and the result set returned (results are values from the database, not entity instances)
Note the word more in the last bullet.
In your case, you have 1 main result set (Employee) + 1 correlated result set (Tool), hence the expected server queries are TWO (except if the first query returns empty set).
You can use this:
var x = from e in _context.Employees
select new
{
e,
Tools = from tool in e.Tools where !tool.IsBroken select tool
};
var result = x.AsEnumerable().Select(y => y.e);
Which will be finally translated to a SQL query like below depending on your provider:
SELECT
`Project1`.`EmployeeId`,
`Project1`.`Name`,
`Project1`.`C1`,
`Project1`.`ToolId`,
`Project1`.`IsBroken`,
`Project1`.`EmployeeId1`
FROM (SELECT
`Extent1`.`EmployeeId`,
`Extent1`.`Name`,
`Extent2`.`ToolId`,
`Extent2`.`IsBroken`,
`Extent2`.`EmployeeId` AS `EmployeeId1`,
CASE WHEN (`Extent2`.`ToolId` IS NOT NULL) THEN (1) ELSE (NULL) END AS `C1`
FROM `Employees` AS `Extent1` LEFT OUTER JOIN `Tools` AS `Extent2` ON (`Extent1`.`EmployeeId` = `Extent2`.`EmployeeId`) AND (`Extent2`.`IsBroken` != 1)) AS `Project1`
ORDER BY
`Project1`.`EmployeeId` ASC,
`Project1`.`C1` ASC
I changed my previous answer, which was wrong, thanks to the comments.

Doing subqueries in Mybatis, or query recursively the selected values

UPDATE:
I understood that the solution to my problem is to run subqueries, each applying a different filter to a progressively smaller result set. But I can't find a way to do that with MyBatis. Here is my query code:
List<IstanzaMetadato> res = null;
SqlSession sqlSession = ConnectionFactory.getSqlSessionFactory().openSession(true);
try {
IstanzaMetadatoMapper mapper = sqlSession.getMapper(IstanzaMetadatoMapper.class);
IstanzaMetadatoExample example = new IstanzaMetadatoExample();
Iterator<Map.Entry<Integer, String>> it = map.entrySet().iterator();
while (it.hasNext()) {
Map.Entry<Integer, String> entry = it.next();
example.createCriteria().andIdMetadatoEqualTo(entry.getKey()).andValoreEqualTo(entry.getValue());
}
example.setDistinct(true);
res = mapper.selectByExample(example);
I need to execute a new selectByExample inside the while loop, and it has to query the previously selected results.
Is there a solution?
ORIGINAL QUESTION:
I have this table structure
I have to select rows from the table with different filters, specified by the end user.
Each filter is specified by a pair (id_metadato, valore); for example, you can have id_metadato = 3 and valore = "pippo".
The user can specify 0 to n filters from the web page by typing values into the search boxes, each of which is tied to an id_metadato.
Obviously, the more filters the user specifies, the more restrictive the final query becomes.
For example, if the user fills in only the first search box, the query has a single filter and returns all the rows that match the pair (id_metadato, valore) specified by the user.
If the user fills in two search boxes, then the query has two filters, and it returns all the rows that satisfy the first condition AND the second one, applied after the "first subquery" is done.
I need to do this dynamically and as efficiently as possible. I can't simply add AND clauses to my query; each filter has to reduce the result set of the previous one.
I can't do 0-n subqueries (Select * from ... IN (select * from ....) ) efficiently.
Is there a more elegant way to do that? I'm reading tutorials on dynamic SQL queries with MyBatis, but I'm not sure that is the correct way. I'm still trying to figure out the logic of the solution; then I will try to implement it with MyBatis.
Thanks for the answers
MyBatis made nesting these subqueries much simpler: it was sufficient to chain the filter criteria and to add
An excerpt of the code follows:
try {
IstanzaMetadatoMapper mapper = sqlSession.getMapper(IstanzaMetadatoMapper.class);
IstanzaMetadatoExample example = new IstanzaMetadatoExample();
Iterator<Map.Entry<Integer, String>> it = map.entrySet().iterator();
while (it.hasNext()) {
Map.Entry<Integer, String> entry = it.next();
if (listaIdUd.isEmpty()) {
example.createCriteria().andIdMetadatoEqualTo(entry.getKey()).andValoreEqualTo(entry.getValue());
example.setDistinct(true);
listaIdUd = mapper.selectDynamicNested(example);
continue;
}
example.clear();
example.createCriteria().andIdMetadatoEqualTo(entry.getKey()).andValoreEqualTo(entry.getValue()).andIdUdIn(listaIdUd);
example.setDistinct(true);
listaIdUd = mapper.selectDynamicNested(example);
}
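The narrowing logic of the loop above can be sketched over an in-memory list, leaving MyBatis out entirely. This is only an illustration under stated assumptions: the `Row` class and its three fields are hypothetical stand-ins for the table, and `matchAll` is a made-up helper name. Unlike the loop above, the sketch uses a `null` marker for "no filter applied yet", so an intermediate result that is empty stays empty instead of restarting the search.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SuccessiveFilter {
    // Hypothetical in-memory row: (idUd, idMetadato, valore).
    static final class Row {
        final int idUd;
        final int idMetadato;
        final String valore;

        Row(int idUd, int idMetadato, String valore) {
            this.idUd = idUd;
            this.idMetadato = idMetadato;
            this.valore = valore;
        }
    }

    // Returns the idUd values that match every (idMetadato, valore) filter.
    // With no filters at all, this sketch returns an empty set.
    static Set<Integer> matchAll(List<Row> rows, Map<Integer, String> filters) {
        Set<Integer> current = null; // null means "no filter applied yet"
        for (Map.Entry<Integer, String> filter : filters.entrySet()) {
            Set<Integer> next = new LinkedHashSet<>();
            for (Row r : rows) {
                boolean matches = r.idMetadato == filter.getKey().intValue()
                        && r.valore.equals(filter.getValue());
                // Keep the id only if it also survived all previous filters.
                if (matches && (current == null || current.contains(r.idUd))) {
                    next.add(r.idUd);
                }
            }
            current = next;
        }
        return current == null ? new LinkedHashSet<Integer>() : current;
    }
}
```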

Having conditional multiple filters in Morphia query for Mongo database

Environment : MongoDb 3.2, Morphia 1.1.0
So let's say I have a collection of Employees, and the Employee entity has several fields. I need to apply multiple conditional filters and return a batch of 10 records per request.
Pseudocode is below.
@Entity("Employee")
Employee{
String firstname,
String lastName,
int salary,
int deptCode,
String nationality
}
and in my EmployeeFilterRequest I carry the request parameters to the DAO
EmployeeFilterRequest{
int salaryLessThen
int deptCode,
String nationality..
}
Pseudoclass
class EmployeeDao{
public List<Employee> returnList;
public getFilteredResponse(EmployeeFilterRequest request){
Datastore ds = getTheDatastore();
Query<Employee> query = ds.createQuery(Employee.class).disableValidation();
//conditional request #1
if(request.filterBySalary){
query.filter("salary >", request.salary);
}
//conditional request #2
if(request.filterBydeptCode){
query.filter("deptCode ==", request.deptCode);
}
//conditional request #3
if(request.filterByNationality){
query.filter("nationality ==", request.nationality);
}
returnList = query.batchSize(10).asList();
/******* **THIS IS RETURNING ME ALL THE RECORDS IN THE COLLECTION, EXPECTED ONLY 10** *****/
}
}
As explained in the code above, I want to perform conditional filtering on multiple fields. Even with batchSize set to 10, I am getting all the records in the collection.
How can I resolve this?
Regards
Punith
Blakes is right. You want to use limit() rather than batchSize(). The batch size only affects how many documents each trip to the server comes back with. This can be useful when pulling over a lot of really large documents but it doesn't affect the total number of documents fetched by the query.
As a side note, you should be careful using asList() as it will create objects out of every document returned by the query and could exhaust your VM's heap. Using fetch() will let you incrementally hydrate documents as you need each one. You might actually need them all as a List and with a size of 10 this is probably fine. It's just something to keep in mind as you work with other queries.
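The difference between the two settings can be illustrated with a toy model. This sketch does not use the MongoDB driver or Morphia at all; `fetch` is a hypothetical function that only counts how many documents come back and how many round trips that takes, assuming each trip carries at most `batchSize` documents.

```java
class BatchVsLimit {
    // Toy model of a cursor, not the real driver.
    // Returns {documentsReturned, roundTrips}; limit <= 0 means "no limit".
    static int[] fetch(int matchingDocs, int batchSize, int limit) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        // The limit caps the total; the batch size never does.
        int total = limit > 0 ? Math.min(limit, matchingDocs) : matchingDocs;
        int returned = 0;
        int roundTrips = 0;
        while (returned < total) {
            returned += Math.min(batchSize, total - returned);
            roundTrips++;
        }
        return new int[] { returned, roundTrips };
    }

    public static void main(String[] args) {
        int[] batchOnly = fetch(400, 10, 0);
        int[] limited = fetch(400, 10, 10);
        System.out.println(batchOnly[0] + " documents in " + batchOnly[1] + " round trips");
        System.out.println(limited[0] + " documents in " + limited[1] + " round trips");
    }
}
```

In this model a batch size of 10 alone still returns every matching document, just spread over more trips, while a limit of 10 caps the total.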

Count in jpa without getting result [duplicate]

I like the idea of Named Queries in JPA for static queries I'm going to do, but I often want to get the count result for the query as well as a result list from some subset of the query. I'd rather not write two nearly identical NamedQueries. Ideally, what I'd like to have is something like:
@NamedQuery(name = "getAccounts", query = "SELECT a FROM Account a")
.
.
Query q = em.createNamedQuery("getAccounts");
List r = q.setFirstResult(s).setMaxResults(m).getResultList();
int count = q.getCount();
So let's say m is 10, s is 0 and there are 400 rows in Account. I would expect r to have a list of 10 items in it, but I'd want to know there are 400 rows total. I could write a second @NamedQuery:
@NamedQuery(name = "getAccountCount", query = "SELECT COUNT(a) FROM Account a")
but it seems a DRY violation to do that if I'm always just going to want the count. In this simple case it is easy to keep the two in sync, but if the query changes, it seems less than ideal that I have to update both @NamedQueries to keep the values in line.
A common use case here would be fetching some subset of the items, but needing some way of indicating total count ("Displaying 1-10 of 400").
So the solution I ended up using was to create two @NamedQuery definitions, one for the result set and one for the count, but capturing the base query in a static string to maintain DRY and ensure that both queries remain consistent. So for the above, I'd have something like:
@NamedQuery(name = "getAccounts", query = "SELECT a" + accountQuery)
@NamedQuery(name = "getAccounts.count", query = "SELECT COUNT(a)" + accountQuery)
.
static final String accountQuery = " FROM Account a";
.
Query q = em.createNamedQuery("getAccounts");
List r = q.setFirstResult(s).setMaxResults(m).getResultList();
int count = ((Long)em.createNamedQuery("getAccounts.count").getSingleResult()).intValue();
Obviously, with this example, the query body is trivial and this is overkill. But with much more complex queries, you end up with a single definition of the query body and can ensure you have the two queries in sync. You also get the advantage that the queries are precompiled and at least with Eclipselink, you get validation at startup time instead of when you call the query.
By naming the two queries consistently, it is possible to wrap the code that runs both in a helper that takes just the base name of the query.
setFirstResult/setMaxResults do not return a subset of an existing result set. The query hasn't even been run when you call these methods; they affect the generated SELECT query that is executed when you call getResultList. If you want the total record count, you'll have to SELECT COUNT your entities in a separate query (typically before paginating).
For a complete example, check out Pagination of Data Sets in a Sample Application using JSF, Catalog Facade Stateless Session, and Java Persistence APIs.
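Once the count has been fetched by its own query, building the "Displaying 1-10 of 400" label is plain arithmetic. A minimal sketch follows; `PageLabel` and `label` are hypothetical names, `firstResult` is assumed to be the zero-based offset passed to setFirstResult, and `maxResults` the page size passed to setMaxResults.

```java
class PageLabel {
    // firstResult: zero-based offset of the page
    // maxResults:  page size
    // total:       result of the separate COUNT query
    static String label(int firstResult, int maxResults, long total) {
        if (total == 0 || firstResult >= total) {
            // Nothing to show on this page.
            return "Displaying 0 of " + total;
        }
        long from = firstResult + 1L;
        // The last page may be shorter than maxResults.
        long to = Math.min((long) firstResult + maxResults, total);
        return "Displaying " + from + "-" + to + " of " + total;
    }
}
```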
Well, you can use reflection to read the named query annotations, like this:
String getNamedQueryCode(Class<?> clazz, String namedQueryKey) {
    // Look for the query among the @NamedQueries declared directly on this class
    NamedQueries namedQueriesAnnotation = clazz.getAnnotation(NamedQueries.class);
    if (namedQueriesAnnotation != null) {
        for (NamedQuery namedQuery : namedQueriesAnnotation.value()) {
            if (namedQuery.name().equals(namedQueryKey)) {
                return namedQuery.query();
            }
        }
    }
    // Not found here: fall back to a @MappedSuperclass parent, if there is one
    Class<?> parent = clazz.getSuperclass();
    if (parent != null && parent.getAnnotation(MappedSuperclass.class) != null) {
        return getNamedQueryCode(parent, namedQueryKey);
    }
    // Not found anywhere
    return null;
}

TypedQuery<x> returns vector of Object[] instead of list of x-type object

I have a method:
public List<Timetable> getTimetableTableForRegion(String id) {
List<Timetable> timetables;
TypedQuery<Timetable> query = em_read.createQuery("SELECT ..stuff.. where R.id = :id", Timetable.class).setParameter("id", Long.parseLong(id));
timetables = query.getResultList();
return timetables;
}
which returns this:
so, what am I missing in order to return a list of Timetable's?
OK, so the ..stuff.. part of my JPQL contained an inner join to another table. Even though the SELECT listed only fields from one table, the one used as the type (Timetable), EclipseLink was unable to determine that these fields are part of that entity, and instead of returning a list of the defined entity it returned a list of Object[].
So in conclusion: use @OneToMany/@ManyToOne mappings (or a flat table design) and query just ONE table in your JPQL to get typed entities back.
Not sure this is what you are looking for, but I had a similar problem and converted the Vector to an ArrayList like this:
final ArrayList<YourClazz> results = new ArrayList<YourClazz>();
for ( YourClazz key : (Vector<YourClazz>) query.getResultList() )
{
results.add(key);
}
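If the rows really are Object[] arrays, as in the question, a cast to the entity type will not help, and each row has to be mapped by hand. The sketch below shows that mapping under loud assumptions: the `Timetable` class here is a hypothetical stand-in with just an id and a name, and the column order (id first, name second) is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class RowMapper {
    // Hypothetical stand-in for the Timetable entity.
    static final class Timetable {
        final long id;
        final String name;

        Timetable(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Converts raw result rows into Timetable objects.
    // Assumes row[0] is a numeric id and row[1] is the name.
    static List<Timetable> toTimetables(List<Object[]> rows) {
        List<Timetable> result = new ArrayList<>(rows.size());
        for (Object[] row : rows) {
            // Going through Number tolerates Integer, Long or BigDecimal ids.
            long id = ((Number) row[0]).longValue();
            String name = (String) row[1];
            result.add(new Timetable(id, name));
        }
        return result;
    }
}
```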
I faced the same problem, and my entity has no one-to-one or one-to-many relationship. Even so, JPQL was giving me the query result as a vector of objects. I changed my solution from a JPQL query to the criteria builder, and that worked for me.
The code snippet is below:
CriteriaBuilder builder = this.entityManager.getCriteriaBuilder();
CriteriaQuery<Timetable> criteria = builder.createQuery(Timetable.class);
Root<Timetable> root = criteria.from(Timetable.class);
criteria.where(builder.equal(root.get("id"), id));
List<Timetable> topics = this.entityManager.createQuery(criteria) .getResultList();
return topics;