How do I use JPQL to delete entries from a join table?

I have a JPA object which has a many-to-many relationship like this:
@Entity
public class Role {
    //...
    @ManyToMany(fetch=FetchType.EAGER)
    @JoinTable(
        name="RolePrivilege",
        joinColumns=
            @JoinColumn(name="role", referencedColumnName="ID"),
        inverseJoinColumns=
            @JoinColumn(name="privilege", referencedColumnName="ID")
    )
    private Set<Privilege> privs;
}
There isn't a JPA object for RolePrivilege, so I'm not sure how to write a JPQL query to delete entries from the privs field of a role object. For instance, I've tried this, but it doesn't work. It complains that Role.privs is not mapped.
DELETE FROM Role.privs p WHERE p.id=:privId
I'm not sure what else to try. I could of course just write a native query which deletes from the join table RolePrivilege. But, I'm worried that doing so would interact badly with locally cached objects which wouldn't be updated by the native query.
Is it even possible to write JPQL to remove entries from a join table like this? If not I can just load all the Role objects and remove entries from the privs collection of each one and then persist each role. But it seems silly to do that if a simple JPQL query will do it all at once.

The JPQL update and delete statements need to refer to an Entity name, not a table name, so I think you're out of luck with the approach you've suggested.
Depending on your JPA provider, you could delete the entries from the JoinTable using a simple raw SQL statement (should be 100% portable) and then programmatically interact with your cache provider API to evict the data. For instance in Hibernate you can call evict() to expire all "Role.privs" collections from the 2nd level cache:
sessionFactory.evictCollection("Role.privs", roleId); //evict a particular collection of privs
sessionFactory.evictCollection("Role.privs"); //evict all privs collections
Unfortunately I don't work with the JPA APIs enough to know for sure what exactly is supported.
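For illustration, a rough sketch of that combination using plain JPA APIs (the table and column names come from the question's @JoinTable mapping; emf and privId are assumed to be in scope, and the eviction uses the JPA 2.0 Cache API, so whether it is sufficient depends on your provider):
EntityManager em = emf.createEntityManager();
em.getTransaction().begin();
// Delete straight from the join table; positional parameters are the
// portable choice for native queries.
em.createNativeQuery("DELETE FROM RolePrivilege WHERE privilege = ?1")
        .setParameter(1, privId)
        .executeUpdate();
em.getTransaction().commit();
em.close();
// Evict cached Role entities so their privs collections are re-read
// from the database on the next load.
emf.getCache().evict(Role.class);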

I was also looking for a JPQL approach to delete a many-to-many relation (contained in a join table). Obviously there's no concept of tables in an ORM, so we can't use DELETE operations... but I wish there were a special operator for these cases, something like LINK and UNLINK. Maybe a future feature?
Anyway, what I've been doing to achieve this is to work with the collections in the entity beans, the ones which map the many-to-many relations.
For example, say I have a class Student which has a many-to-many relation with Courses:
@ManyToMany
private Collection<Courses> collCourses;
I build in the entity a getter and setter for that collection. Then, for example from an EJB, I retrieve the collection using the getter, add or remove the desired Course, and finally, use the setter to assign the new collection. And it's done. It works perfectly.
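A minimal sketch of that pattern, assuming the getter/setter pair is named getCollCourses()/setCollCourses() and that Courses exposes a getId():
// Inside a transaction (e.g. a container-managed EJB method).
Student student = em.find(Student.class, studentId);
Collection<Courses> courses = student.getCollCourses();
courses.removeIf(c -> c.getId().equals(courseId)); // unlink one course
student.setCollCourses(courses);                   // reassign via the setter
// On commit, the provider deletes the matching join-table row.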
However, my main worry is performance. An ORM is supposed to keep a large cache of objects (if I'm not mistaken), but even with that cache speeding things up, I wonder whether retrieving each and every element of a collection is really efficient...
To me it seems as inefficient as fetching all the rows of a table and post-filtering them in pure Java, instead of using a query language that works directly or indirectly with the underlying database engine (SQL, JPQL...).

Java is an object-oriented language and the whole point of ORM is to hide details of join tables and such from you. Consequently even contemplating deletion from a join table without considering the consequences would be strange.
No, you can't do it with JPQL; that is for entities.
Retrieve the entities at either end and clear out their collections. This will remove the join table entries. Object-oriented.
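Using the mapping from the question, that could look roughly like this (getPrivs() is an assumed accessor for the privs field):
// Remove one Privilege from every Role that holds it; the provider
// deletes the corresponding RolePrivilege rows at flush/commit time.
List<Role> roles = em.createQuery("SELECT r FROM Role r", Role.class)
        .getResultList();
for (Role role : roles) {
    role.getPrivs().removeIf(p -> p.getId().equals(privId));
}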

Related

JPA: EntityManager's find method vs JPQL JOIN

If I have an entity A with a one-to-many relationship to entity B, and I have to fetch A along with its associated Bs, I can either use EntityManager's find method OR write a JPQL query using JOIN.
Which approach is better in terms of efficiency and minimum DB calls?
I'm assuming that the EntityManager.find(class, primary key, ...) invocation will load all associated B's thanks to proper usage of the @JoinTable and @OneToMany annotations, thus the find and the JPQL JOIN produce the same result in a single invocation.
If that is true, then in my experience with Hibernate and JPA (2.0/2.1) there is no difference, especially if you use a second-level cache, which I do. Do whatever is convenient.
Having said that, the only way to know for sure is to do the timing yourself. There are differences not only between JPA implementations such as Hibernate and EclipseLink, but also between different versions of the same implementation.
Additionally, your JPQL performance will vary depending on whether the query is precompiled (I always do this) or built dynamically via the Criteria API. The various second-level caches also affect performance greatly.
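For reference, a sketch of the two load paths being compared (A and B come from the question; bs is an assumed name for A's collection of B):
// Path 1: find by primary key; with an eager mapping the Bs come along.
A viaFind = em.find(A.class, id);

// Path 2: explicit JPQL with JOIN FETCH, forcing the Bs into one query.
A viaJpql = em.createQuery(
        "SELECT DISTINCT a FROM A a JOIN FETCH a.bs WHERE a.id = :id", A.class)
    .setParameter("id", id)
    .getSingleResult();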

Efficient querying when using DTOs in Breeze

We are using DTOs server side, and have configured a dbcontext using fluent api in order to give breeze the metadata it needs. We map 1:1 to our real database entities, and each DTO contains a simple subset of the real database entity properties.
This works great, but now I need to make queries efficient: if the Breeze client queries for a single item, I don't want to have to create a whole set of DTO objects before I can filter. In other words, I want to execute the filter/sort on the actual entities, but still return DTO objects.
I guess I need to figure out a way to intercept the query execution in order to query my real database entities and return a DTO instead of the real database entity.
Any ideas for how to best approach this?
Turns out that if you use a projection in a LINQ statement, e.g.
From PossibleCustomer As Customer
In Customers
Select New CustomerDto With {.Id = PossibleCustomer.Id,
                             .Name = PossibleCustomer.Name,
                             .Email = PossibleCustomer.Email}
... then LINQ is smart enough to still optimize any queries to the database - i.e. if I query on the LINQ statement above to filter for a single item by Id, the database is hit with that query for just a single item and a single DTO is created. Pretty clever stuff. This only works if you do a direct projection in the LINQ statement - if you call off to a function to create your DTO, this won't work.
Just in case others are facing the same scenario, you might want to look at AutoMapper - it can create these projections for you using a model you create just once, avoiding all those huge LINQ statements that are hard to read and validate. The AutoMapper projections (assuming you stick to the simple stuff) still allow the LINQ to Entities magic that ensures you don't have to do table scans when you create your DTOs.

Aggregate Root support in Entity Framework

How can we tell Entity Framework about Aggregates?
when saving an aggregate, save entities within the aggregate
when deleting an aggregate, delete entities within the aggregate
raise a concurrency error when two different users attempt to modify two different entities within the same aggregate
when loading an aggregate, provide a consistent point-in-time view of the aggregate even if there is some time delay before we access all entities within the aggregate
(Entity Framework 4.3.1 Code First)
EF provides features which allow you to define your aggregates and use them:
This is the most painful part. EF works with entity graphs. If you have an entity like Invoice, and this entity has a collection of related InvoiceLine entities, you can treat it as an aggregate. In an attached scenario everything works as expected, but in a detached scenario (either the aggregate is not loaded by EF, or it is loaded by a different context instance) you must attach the aggregate to the context instance and tell it exactly what you changed = set the state for every entity and independent association in the object graph.
This is handled by cascade delete - if you have the related entities loaded, EF will delete them, but if you don't, you must have cascade delete configured on the relation in the database.
This is handled by concurrency tokens in the database - most commonly timestamp or rowversion columns.
You must either use eager loading and load all data together at the beginning (= a consistent point-in-time view), or use lazy loading, in which case you will not have a consistent view, because lazy loading loads the current state of relations but does not update the parts of the aggregate you have already loaded (and I consider it a performance killer to try to implement such refreshing with EF).
I wrote GraphDiff specifically for this purpose. It allows you to define an 'aggregate boundary' on update by providing a fluent mapping. I have used it in cases where I needed to pass detached entity graphs back and forth.
For example:
// Update method of repository
public void Update(Order order)
{
    context.UpdateGraph(order, map => map
        .OwnedCollection(p => p.OrderItems));
}
The above would tell the Entity Framework to update the order entity and also merge the collection of OrderItems. Mapping in this fashion allows us to ensure that the Entity Framework only manages the graph within the bounds that we define on the aggregate and ignores all other properties. It supports optimistic concurrency checking of all entities. It handles much more complicated scenarios and can also handle updating references in many to many scenarios (via AssociatedCollections).
Hope this can be of use.

Entity Framework Table Per Type Performance

So it turns out that I am the last person to discover the fundamental flaw that exists in Microsoft's Entity Framework when implementing TPT (Table Per Type) inheritance.
Having built a prototype with 3 subclasses, the base table/class consisting of 20+ columns and the child tables consisting of ~10 columns, everything worked beautifully, and I continued to work on the rest of the application, having proved the concept. Now the time has come to add the other 20 subtypes and, OMG, I've just started looking at the SQL being generated on a simple select, even though I'm only interested in accessing the fields on the base class.
This page has a wonderful description of the problem.
Has anyone gone into production using TPT and EF? Are there any workarounds that will mean that I won't have to:
a) Convert the schema to TPH (which goes against everything I try to achieve with my DB design - urrrgghh!)?
b) Rewrite with another ORM?
The way I see it, I should be able to add a reference to a stored procedure from within EF (probably using EFExtensions) that has the T-SQL that selects only the fields I need. Even using the code generated by EF for the monster UNION/JOIN inside the SP would prevent the SQL being generated every time a call is made - not something I would intend to do, but you get the idea.
The killer I've found is this: when I'm selecting a list of entities linked to the base table (but the entity I'm selecting is not a subclass table), and I want to filter by the PK of the base table, I use .Include("BaseClassTableName") to let me filter using x => x.BaseClass.PK == 1 and access other properties - and it performs the monster SQL generation here too.
I can't use EF4, as I'm limited to the .NET 2.0 runtime with 3.5 SP1 installed.
Has anyone got any experience of getting out of this mess?
This seems a bit confused. You're talking about TPH, but when you say:
The way I see it, I should be able to add a reference to a stored procedure from within EF (probably using EFExtensions) that has the T-SQL that selects only the fields I need. Even using the code generated by EF for the monster UNION/JOIN inside the SP would prevent the SQL being generated every time a call is made - not something I would intend to do, but you get the idea.
Well, that's Table per Concrete Class mapping (using a proc rather than a table, but still, the mapping is TPC...). The EF supports TPC, but the designer doesn't. You can do it in code-first if you get the CTP.
Your preferred solution of using a proc will cause performance problems if you restrict queries, like this:
var q = from c in Context.SomeChild
        where c.SomeAssociation.Foo == foo
        select c;
The DB optimizer can't see through the proc implementation, so you get a full scan of the results.
So before you tell yourself that this will fix your results, double-check that assumption.
Note that you can always specify custom SQL for any mapping strategy with ObjectContext.ExecuteStoreQuery.
However, before you do any of this, consider that, as RPM1984 points out, your design seems to overuse inheritance. I like this quote from NHibernate in Action:
[A]sk yourself whether it might be better to remodel inheritance as delegation in the object model. Complex inheritance is often best avoided for all sorts of reasons unrelated to persistence or ORM. [Your ORM] acts as a buffer between the object and relational models, but that doesn't mean you can completely ignore persistence concerns when designing your object model.
We've hit this same problem and are considering porting our DAL from EF4 to LLBLGen because of this.
In the meantime, we've used compiled queries to alleviate some of the pain:
Compiled Queries (LINQ to Entities)
This strategy doesn't prevent the mammoth queries, but the query generation (which can take a long time) only happens once.
You can use compiled queries with Include() like this:
static readonly Func<AdventureWorksEntities, int, Subcomponent> subcomponentWithDetailsCompiledQuery =
    CompiledQuery.Compile<AdventureWorksEntities, int, Subcomponent>(
        (ctx, id) => ctx.Subcomponents
            .Include("SubcomponentType")
            .Include("A.B.C.D")
            .FirstOrDefault(s => s.Id == id));

public Subcomponent GetSubcomponentWithDetails(int id)
{
    return subcomponentWithDetailsCompiledQuery.Invoke(ObjectContext, id);
}

Entity to DTO conversion with JPA

I'm using DataNucleus as a JPA implementation to store my classes in my web application. I use a set of converters which all have toDTO() and fromDTO().
My issue is that I want to avoid the whole DB being sent over the wire:
If I lazy load, the converter will try to access ALL the fields and load them (resulting in very eager loading).
If I don't lazy load, I'll get a huge part of the DB, since user contains groups, and groups contains users, and so on.
Is there a way to explicitly load some fields and leave the others as NULL in my loaded class?
I've tried the DataNucleus docs with no luck.
Your DTOs are probably too fine-grained, i.e. don't plan to have a DTO per JPA entity. If you have to use DTOs, make them more coarse-grained and construct them manually.
Recently we have had the whole "to DTO or not to DTO, that is the question" discussion AGAIN. The requirement for them (especially in the context of a JPA app) is often no longer there, but one of the arguments FOR DTOs tends to be that the view has coarser data requirements.
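If you do construct coarse-grained DTOs manually, a JPQL constructor expression keeps the select narrow without materializing full entities (UserDto and the selected fields here are illustrative; the class needs a matching constructor):
List<UserDto> dtos = em.createQuery(
        "SELECT NEW com.example.UserDto(u.id, u.name) FROM User u",
        UserDto.class)
    .getResultList();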
To load only the data you really require, you would need to use a custom select clause containing only those elements that you are about to use for your DTOs. I know how painful this is, especially when it involves joins, which is why I created Blaze-Persistence Entity Views, which takes care of making the query efficient.
You define your DTO as an interface with mappings to the entity, using the attribute name as the default mapping. This looks very simple and a lot like a subset of the entity, though it doesn't have to: you can use any JPQL expression as the mapping for your DTO attributes.
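As a hedged illustration, an entity view might look roughly like this (annotation names follow the Blaze-Persistence API; the User entity and its attributes are made up for the example):
@EntityView(User.class)
public interface UserDto {
    @IdMapping
    Long getId();
    String getName(); // mapped by attribute name by default
    @Mapping("UPPER(email)") // any JPQL expression works as a mapping
    String getUpperEmail();
}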