Criteria API: Fetch of a list returns repeated main entity - jpa

I have the following Entities; Ticket contains a set of 0,N WorkOrder:
#Entity
public class Ticket {
...
#OneToMany(mappedBy="ticket", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
private List<WorkOrder> workOrders = null;
...
}
#Entity
public class WorkOrder {
...
#ManyToOne
#JoinColumn(nullable = false)
private Ticket ticket;
}
I am loading Tickets and fetching the attributes. All of the 0,1 attributes present no problem. For workOrders, I used this answer to get the following code.
CriteriaBuilder criteriaBuilder = this.entityManager.getCriteriaBuilder();
CriteriaQuery<Ticket> criteriaQuery = criteriaBuilder
.createQuery(Ticket.class);
Root<Ticket> rootTicket = criteriaQuery.from(Ticket.class);
ListAttribute<? super Ticket, WorkOrder> workOrders =
rootTicket.getModel().getList("workOrders", WorkOrder.class);
rootTicket.fetch(workOrders, JoinType.LEFT);
// WHERE logic
...
criteriaQuery.select(rootTicket);
TypedQuery<Ticket> query = this.entityManager.createQuery(criteriaQuery);
return query.getResultList();
The result is that, in a query that should return me 1 Ticket with 5 workOrders, I am retrieving the same Ticket 5 times.
If I just make the workOrders an Eager Fetch and delete the fetch code, it works as it should.
Can anyone help me? Thanks in advance.
UPDATE:
One explanation about why I am not just happy with JB Nizet's answer (even if in the end it works).
When I just make the relationship eager, JPA is examining exactly the same data that when I make it lazy and add the fetch clause to the Criteria / JPQL. The relationships between the various elements is also clear, as I define the ListAttribute for the Criteria query.
There is some reasonable explanaition for the reason that JPA does not return the same data in both cases?
UPDATE FOR BOUNTY: While JB Nizet's answer did solve the issue, I still find it meaningless that, given two operations with the same meaning ("Get Ticket and fetch all WorkOrder inside ticket.workOrders"), doing them by an eager loading needs no further changes while specifying a fetch requires a DISTINCT command

There is a difference between eager loading and fetch join. Eager loading doesn't mean that the data is loaded within the same query. It just means that it is loaded immediately, although by additional queries.
The criteria is always translated to an SQL query. If you specify joins, it will be join in SQL. By the nature of SQL, this multiplies the data of the root entity as well, which leads to the effect you got. (Note that you get the same instance multiple times, so the root entity is not multiplied in memory.)
There are several solutions to that:
use distinct(true)
Use the distinct root entity transformer (.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY)).
When you don't need to filter by child properties, avoid the join
When you need to filter by child properties, filter by a subquery (DetachedCriteria).
Optimize the N+1 problem by using batch-size

Have you tried calling distinct(true) on the CriteriaQuery?
The JPA 2 specification, page 161, says:
The DISTINCT keyword is used to specify that duplicate values must be
eliminated from the query result.
If DISTINCT is not specified, duplicate values are not eliminated.
The javadoc also says:
Specify whether duplicate query results will be eliminated.A true
value will cause duplicates to be eliminated. A false value will cause
duplicates to be retained. If distinct has not been specified,
duplicate results must be retained.
The reason why you don't need the distinct when the association is eagerly loaded is probably just that the association is not loaded using a fetch join, but using an additional query.

Related

Join database query Vs Handeling Joins in API

I am developing an API where I am confused as to what is the efficient way to handle join query.
I want to join 2 tables data and return the response. Either I can query the database with join query and fetch the result and then return the response OR I can fire two separate queries and then I would handle the join in the API on the fly and return the response. Which is the efficient and correct way ?
Databases are pretty much faster than querying and joining as class instances. Always do joins in the database and map them from the code. Also look for any lazy loading if possible. Cause in a situation like below:
#Entity
#Table(name = "USER")
public class UserLazy implements Serializable {
#Id
#GeneratedValue
#Column(name = "USER_ID")
private Long userId;
#OneToMany(fetch = FetchType.LAZY, mappedBy = "user")
private Set<OrderDetail> orderDetail = new HashSet();
// standard setters and getters
// also override equals and hashcode
}
you might not want order details when you want the initial results.
Usually it's more efficient to do the join in the database, but there are some corner cases, mostly due to the fact that application CPU time is cheaper than database CPU time. Here are a few examples that come to mind, with a query like "table A join table B":
B is a small table that rarely changes.
In this case it can be profitable to cache the contents of this table in the application, and not query it at all.
Rows in A are quite large, and many rows of B are selected for each row of A.
This will cause useless network traffic and load as rows from A are duplicated many times in each result row.
Rows in B are quite large, and there are few distinct b_id's in A
Same as above, except this time the same few rows from B are duplicated in the result set.
In the previous two examples, it could be useful to perform the query on table A, then gather a set of unique b_id's from the result, and SELECT FROM b WHERE b_id IN (list).
Data structure and ORMs
If each table contains a different object type, and they have a "belongs to" relationship (like category and product) and you use an ORM which will instantiate objects for each selected row, then perhaps you only want one instance of each category, and not one per selected product. In this case, you could select the products, gather a list of unique category_ids, and select the categories from there. The ORM may even do that for you behind the scene.
Complicated aggregates
Sometimes, you want some stuff, and some aggregates of other stuff related to the first stuff, but it just won't fit in a neat GROUP BY, or you may need several ones.
So basically, usually the join works better in the database, so that should be the default. If you do it in the application, then you should know why you're doing it, and decide it's a good reason. If it is, then fine. I gave a few reasons, based on performance, data model, and SQL constraints, these are only examples of course.

CriteriaBuilder and using alias for select cause

I would like to create a query with CriteriaBuilder for this kind of sql;
SELECT myDefinedAlias.id, myDefinedAlias.name, myDefinedAlias.aFieldForFK select from Person as myDefinedAlias where myDefinedAlias.name = ?1
How can i accomplish defining an alias for this?
I can create queries without aliases but i cannot define aliases...
CriteriaQuery<Person> cq = criteriBuilder.createQuery(Person.class);
Root<Person> person = cq.from(Person.class);
cq = cq.select(person);
cq = cq.where(criteriaBuilder.equal(person.get(Person_.name), "Chivas")))
I need this for QueryHints, batch fetch.
.setHint(QueryHints.BATCH, "myDefinedAlias.aFieldForFK.itsNestedAttribute");
I am stuck and couldn't find anything regarding my problem. Anyone?
Regards
Doing cq.select(person).alias("myDefinedAlias") assigns your alias which can subsequently be used in batch/fetch query hints. Eclipselink supports nested fetch joins as long as you do not transfer to-Many relations (collections).
I.e..setHint(QueryHints.BATCH, myDefinedAlias.toOneRelation.toManyRelation") works while .setHint(QueryHints.BATCH, .setHint(QueryHints.BATCH, "myDefinedAlias.toManyRelation.toOneRelation") shouldn't.
I think you are going about this the wrong way. JPA needs the sql-statement-aliases for itself to use when generating sql-statements.
For the nested query hints to work, the relationship needs to be specified in the entities.
For example, if your Person entity have a OneToMany mapping to a House entity - and the property name in the Person class is livedInHouses. The query hint would become:
.setHint(QueryHints.BATCH, "Person.livedInHouses"). Its damn near impossible to use FKs that exists in the database but are not annotated as relations on/in entities in JPA.

returning collections using linking tables and JPA

After struggling for days attempting to get back collections that are linked to a table via a foreign key, I just realized that the tables I am linking to are actually LINKING tables to other tables with the actual data (chock one up for normalized tables).
I am still struggling to get collections out of ManyToOne annotated variables with references to foreign keys, but is there a way I can pull the data back from the table that actually contains the information? Has anyone run into an instance of this?
UPDATE: AS per request I will be posting some code instances... This would be my named query in the entity that I will be calling...
#NamedQuery(name="getQuickLaunchWithCollections", query = "SELECT q FROM QuickLaunch q LEFT JOIN FETCH q.quickLaunchDistlistCollection LEFT JOIN FETCH q.quickLaunchPermCollection LEFT JOIN FETCH q.quickLaunchProviderCollection")})
These would be the collections that I am looking to fill...
#OneToMany(mappedBy="quickLaunchId", fetch=FetchType.EAGER)
private List<QuickLaunchPerm> quickLaunchPermCollection;
#OneToMany(mappedBy="quickLaunchId", fetch=FetchType.EAGER)
private List<QuickLaunchProvider> quickLaunchProviderCollection;
#OneToMany(mappedBy="quickLaunchId", fetch=FetchType.EAGER)
private List<QuickLaunchDistlist> quickLaunchDistlistCollection;
As you can see, I have the fetch type set to eager. So technically, I should be getting some data back? But in actuality those are just linking tables the data that I actually want to pull back. I will need to figure out how to get that data back eventually.
This is how I am calling that named query...
listQL = emf.createNamedQuery("getQuickLaunchWithCollections").getResultList();
Alright, it appears as though LEFT JOIN FETCH
is causing my runtime to throw an expception of some kind. It is pretty unclear as to what it is. But I feel as though I am getting no where with that technique. I am going to try something slightly different.
I would suggest simplifying example, to face the problem, since you are going worldwide now.
Specifying mappedBy="quickLaunchId" attribute, you are saying, that QuickLaunchPerm entity has QuickLaunch as its property named "quickLaunchId". Is this true?
If it is not, then you need to define it in QuickLaunchPerm:
#ManyToOne(fetch = FetchType.EAGER)
#JoinColumn(name = "QUICK_LAUNCH_ID")
private QuickLaunch quickLaunchId;
//getters setters

jpa lazy fetch entities over multiple levels with criteria api

I am using JPA2 with it's Criteria API to select my entities from the database. The implementation is OpenJPA on WebSphere Application Server. All my entities are modeled with Fetchtype=Lazy.
I select an entity with some criteria from the database and want to load all nested data from sub-tables at once.
If I have a datamodel where table A is joined oneToMany to table B, I can use a Fetch-clause in my criteria query:
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<A> cq = cb.createQuery(A.class);
Root<A> root = cq.from(A.class);
Fetch<A,B> fetch = root.fetch(A_.elementsOfB, JoinType.LEFT);
This works fine. I get an element A and all of its elements of B are filled correctly.
Now table B has a oneToMany-relationship to table C and I want to load them too. So I add the following statement to my query:
Fetch<B,C> fetch2 = fetch.fetch(B_.elementsOfC, JoinType.LEFT);
But this wont do anything.
Does anybody know how to fetch multi level entities in one query?
It does not work with JPQL and there is no way to make it work in CriteriaQueries either. Specification limits fetched entities to the ones in that are referenced directly from the returned entity:
About fetch join with CriteriaQuery:
An association or attribute referenced by the fetch method must be
referenced from an entity or embeddable that is returned as the result
of the query.
About fetch join in JPQL:
The association referenced by the right side of the FETCH JOIN clause
must be an association or ele ment collection that is referenced from
an entity or embeddable that is returned as a result of the query.
Same limitation is also told in OpenJPA documentation.
For what is worth. I do this all the time and it works just fine.
Several points:
I'm using jpa 2.1, but I'm almost sure it used to work in jpa 2.0 as well.
I'm using the criteria api, and I know some things work diferent in jpql. So don't think it works some way or doesn't work because that's what happens in jpql. Most often they do behave in the same way, but not always.
(Also i'm using plain criteria api, no querydsl or anything. Sometimes it makes a difference)
My associations tend to be SINGULAR_ATTRIBUTE. So maybe that's the problem here. Try a test with the joins in reverse "c.fetch(b).fetch(a)" and see if that works. I know it's not the same, but just to see if it gives you any hint. I'm almost sure I have done it with onetomany left fetch joins too, though.
Yep. I just checked and found it: root.fetch("targets", LEFT).fetch("destinations", LEFT).fetch("internal", LEFT)
This has been working without problems for months, maybe more than a year.
I just run a test and it generates this query:
select -- all fields from all tables
...
from agreement a
left outer join target t on a.id = t.agreement_id
left outer join destination d on t.id = d.target_id
left outer join internal i on d.id = i.destination_id
And returns all rows with all associations with all fields.
Maybe the problem is a different thing. You just say "it wont do anyhting". I don't know if it throws an exception or what, but maybe it executes the query properly but doesn't return the rows you expect because of some conditions or something like that.
You could design a view in the DB joining tables b and c, create the entity and fetchit insted of the original entity.

Select N+1 in next Entity Framework

One of the few valid complaints I hear about EF4 vis-a-vis NHibernate is that EF4 is poor at handling lazily loaded collections. For example, on a lazily-loaded collection, if I say:
if (MyAccount.Orders.Count() > 0) ;
EF will pull the whole collection down (if it's not already), while NH will be smart enough to issue a select count(*)
NH also has some nice batch fetching to help with the select n + 1 problem. As I understand it, the closest EF4 can come to this is with the Include method.
Has the EF team let slip any indication that this will be fixed in their next iteration? I know they're hard at work on POCO, but this seems like it would be a popular fix.
What you describe is not N+1 problem. The example of N+1 problem is here. N+1 means that you execute N+1 selects instead of one (or two). In your example it would most probably mean:
// Lazy loads all N Orders in single select
foreach(var order in MyAccount.Orders)
{
// Lazy loads all Items for single order => executed N times
foreach(var orderItem in order.Items)
{
...
}
}
This is easily solved by:
// Eager load all Orders and their items in single query
foreach(var order in context.Accounts.Include("Orders.Items").Where(...))
{
...
}
Your example looks valid to me. You have collection which exposes IEnumerable and you execute Count operation on it. Collection is lazy loaded and count is executed in memory. The ability for translation Linq query to SQL is available only on IQueryable with expression trees representing the query. But IQueryable represents query = each access means new execution in DB so for example checking Count in loop will execute a DB query in each iteration.
So it is more about implementation of dynamic proxy.
Counting related entities without loading them will is already possible in Code-first CTP5 (final release will be called EF 4.1) when using DbContext instead of ObjectContext but not by direct interaction with collection. You will have to use something like:
int count = context.Entry(myAccount).Collection(a => a.Orders).Query().Count();
Query method returns prepared IQueryable which is probably what EF runs if you use lazy loading but you can further modify query - here I used Count.