Can you just clear specific entity types from JPA Entity Manager cache - jpa

In the raw JPA specification (so not a specific implementation like Hibernate or EclipseLink), is there an option to call em.clear() for a specific type of entity but not for others?
The reason I'm asking is, I have a process which creates and stores tens of thousands of entities; but each is not needed again by the entity manager after it's been persisted. However, each entity has a child entity which is one of about 30 records in the database. So I want to be able to call something like em.clear(MainEntity.class), meaning I don't clear the cache of the child entity which is reused, but also don't build up a history of tens of thousands of unneeded records.
Here's a simplified example of what I am currently doing:
List<Child> children = (List<Child>) em.createQuery("Select c from " + Child.entityName + " c").getResultList();
for (int i = 0; i < 700000; i++)
{
    Parent p = createParent(i, getRandomChild(children));
    em.persist(p);
    if (i % 1000 == 0)
    {
        em.flush(); // push to db
        em.clear(); // remove old items from cache - but unfortunately removes children too
        // as I've cleared the cache, I need to repopulate this to reattach the children again
        children = (List<Child>) em.createQuery("Select c from " + Child.entityName + " c").getResultList();
    }
}

You can use
void detach(java.lang.Object entity)
like this
em.detach(p);
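to remove only the parent from the persistence context. Call em.flush() before detaching: changes to a detached entity that have not been flushed are not synchronized to the database.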
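To illustrate the flush-then-detach pattern without a JPA provider, here is a minimal sketch using a toy stand-in for the persistence context. ToyContext, DetachBatchDemo, Parent and Child are hypothetical names invented for this illustration; the toy only mimics the bookkeeping (what is managed, what has been written) and is not the real EntityManager API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy stand-in for a persistence context. This is NOT the JPA API;
// it only tracks which objects are managed and which were "written".
final class ToyContext {
    private final Set<Object> managed = Collections.newSetFromMap(new IdentityHashMap<>());
    private final List<Object> pending = new ArrayList<>();
    private int rowsWritten = 0;

    void persist(Object entity) {
        if (managed.add(entity)) {
            pending.add(entity);
        }
    }

    void flush() {
        rowsWritten += pending.size();
        pending.clear();
    }

    // Like detach: the object stops being managed and unflushed work on it is dropped.
    void detach(Object entity) {
        managed.remove(entity);
        pending.removeIf(p -> p == entity);
    }

    int managedCount() { return managed.size(); }
    int rowsWritten() { return rowsWritten; }
}

final class DetachBatchDemo {
    static final class Child { }
    static final class Parent {
        final Child child;
        Parent(Child child) { this.child = child; }
    }

    // Returns {managed objects left at the end, rows written}.
    static int[] run(int parentCount, int childCount, int batchSize) {
        ToyContext em = new ToyContext();
        List<Child> children = new ArrayList<>();
        for (int c = 0; c < childCount; c++) {
            Child child = new Child();
            children.add(child);
            em.persist(child);
        }
        em.flush();

        List<Parent> batch = new ArrayList<>();
        for (int i = 1; i <= parentCount; i++) {
            Parent p = new Parent(children.get(i % childCount));
            em.persist(p);
            batch.add(p);
            if (i % batchSize == 0) {
                em.flush();                 // write first, so nothing is lost
                batch.forEach(em::detach);  // then drop only the parents
                batch.clear();
            }
        }
        em.flush();
        batch.forEach(em::detach);
        return new int[] { em.managedCount(), em.rowsWritten() };
    }

    public static void main(String[] args) {
        int[] result = run(5000, 30, 1000);
        System.out.println(result[0] + " managed, " + result[1] + " rows written");
    }
}
```

In this toy run only the 30 children stay managed, while every parent is written and then released. With a real EntityManager the same shape would apply: keep the parents of the current batch in a list, flush, then detach each of them.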

Related

OneToMany relationship of entity in persistence context is not updated

There are 3 entities in an MxN relationship, B being the association entity. We create them in a single transaction, persist all of them, and fetch the entity with the OneToMany association. This association is not initialized after the fetch.
Source: https://github.com/alfonz19/springboot222demo/commits/what
@Transactional
@Test
void contextLoads() {
    // for(int i = 0; i < 3; i++) {
    UUID aId = UUID.randomUUID();
    AEntity aEntity = aRepository.save(new AEntity().setId(aId));
    UUID bId = UUID.randomUUID();
    CEntity cEntity = cRepository.save(new CEntity().setId(bId));
    em.flush();
    bRepository.save(new BEntity().setAEntity(aEntity).setCEntity(cEntity));
    // }
    em.flush();
    // em.clear();
    Iterable<CEntity> centities = cRepository.findAll();
    List<BEntity> bEntities =
        iterableToStream(centities).flatMap(e -> e.getBEntities().stream()).collect(Collectors.toList());
    Assert.assertThat(centities, Matchers.iterableWithSize(1));
    Assert.assertThat(bRepository.findAll(), Matchers.iterableWithSize(1));
    Assert.assertThat(bEntities.size(), CoreMatchers.is(1));
    ...
}
OK, I understand that when creating the BEntity I do not update AEntity and CEntity, leaving them in an inconsistent state. Calling cRepository.findAll() then does issue a select to the database to get all Cs (even without any evict/flush/clear) but leaves the @OneToMany uninitialized. I don't get it. I would understand if there were no call to the database at all, but if the Cs are fetched anyway, why not refresh the association as well? Why is that?
Even more surprisingly, when aRepository.save(new AEntity().setId(aId)) does em.merge (the entity has an assigned id), Hibernate loads the whole MxN structure using 2 left outer joins, even though the @OneToMany is lazy. Why is that? EDIT: OK, that's not surprising at all; it's an implication of cascade merge. Completely OK.
I'm a little surprised by this behavior, as there are selects issued where they shouldn't be (IIUC), and none where they easily could be.
And to keep the best for the end: with a small change, uncommenting the for loop and the clear, I get fully nondeterministic behavior.
source: https://github.com/alfonz19/springboot222demo/tree/nondeterministic
The tests will either pass or produce exceptions like:
array out of bounds
collection with cascade="all-delete-orphan" was no longer referenced by the owning entity instance:
java.lang.NullPointerException
but if I put a breakpoint on the bEntities variable declaration, the cEntities are always correctly created and the test then passes. I have no idea what can cause this.
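On the first part of the question: a query that returns an entity already present in the persistence context hands back the managed instance as it is, so an inverse-side collection that was never updated in memory stays empty. A common way to avoid this is a helper that updates both sides of the association together. Below is a minimal sketch in plain Java with no JPA annotations; CSide, BSide and addBEntity are hypothetical stand-ins for the entities in the question, not code from the linked repository.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-ins for the entities in the question; no JPA annotations here.
final class CSide {
    private final List<BSide> bEntities = new ArrayList<>();

    List<BSide> getBEntities() { return bEntities; }

    // Hypothetical helper: updates the inverse collection and the owning reference together.
    void addBEntity(BSide b) {
        bEntities.add(b);
        b.setCEntity(this);
    }
}

final class BSide {
    private CSide cEntity;

    CSide getCEntity() { return cEntity; }

    BSide setCEntity(CSide c) {
        this.cEntity = c;
        return this;
    }
}

final class BothSidesDemo {
    // Sets only the owning side, as the test in the question does.
    static int sizeAfterOwningSideOnly() {
        CSide c = new CSide();
        new BSide().setCEntity(c);
        return c.getBEntities().size();
    }

    // Uses the helper, so the in-memory collection reflects the new link.
    static int sizeAfterHelper() {
        CSide c = new CSide();
        c.addBEntity(new BSide());
        return c.getBEntities().size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterOwningSideOnly() + " vs " + sizeAfterHelper());
    }
}
```

With only the owning side set, the collection on the C side stays empty in memory until the entity is reloaded (for example after em.clear() or em.refresh()); with the helper it is correct immediately.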
I have an answer to the bonus question about the nondeterministic behavior.
One more randomly occurring exception to add to the list is org.springframework.orm.jpa.JpaSystemException: Found shared references to a collection, and all this behavior disappears once the flatMap is removed, i.e. replace:
List<BEntity> bEntities =
StreamSupport.stream(centities.spliterator(), true).flatMap(e -> e.getBEntities().stream()).collect(Collectors.toList());
with
List<BEntity> bEntities = new LinkedList<>();
centities.forEach(e->bEntities.addAll(e.getBEntities()));
and the test in the (no longer) "nondeterministic" branch will pass 100% of the time. I'm not sure why; however, it seems that the stream API is not that safe with Hibernate-managed collections.
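One plausible explanation, which the post itself does not confirm: the second argument true passed to StreamSupport.stream requests a parallel stream, so getBEntities() may be called from several threads at once, and a Hibernate session with its lazy collections is not thread-safe. Under that assumption, keeping the stream but making it sequential should behave like the forEach version. A minimal sketch on plain lists (SequentialFlatten and flatten are hypothetical names):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

final class SequentialFlatten {
    // Flattens nested lists on the calling thread only.
    static <T> List<T> flatten(Iterable<List<T>> groups) {
        return StreamSupport.stream(groups.spliterator(), false) // false = sequential
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> groups = Arrays.asList(
                Arrays.asList("b1", "b2"),
                Arrays.asList("b3"));
        System.out.println(flatten(groups));
    }
}
```

Applied to the test, this would mean passing false instead of true in the iterableToStream-style call, so every lazy collection is initialized on the thread that owns the transaction.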

Entity Framework - "Attach()" is slow

I'm using EF5 and attaching a disconnected graph of POCO entities to my context, something like this:-
using (var context = new MyEntities())
{
    context.Configuration.AutoDetectChangesEnabled = false;
    context.MyEntities.Attach(myEntity);
    // Code to walk the entity graph and set each entity's state
    // using ObjectStateManager omitted for clarity ..
    context.SaveChanges();
}
The entity "myEntity" is a large graph of entities, with many child collections, which in turn have their own child collections, and so on. The entire graph contains in the order of 10000 entities, but only a small number are usually changed.
The code to set the entity states and the actual SaveChanges() is fairly quick (<200ms). It's the Attach() that's the problem here, and takes 2.5 seconds, so I was wondering if this could be improved. I've seen articles that tell you to set AutoDetectChangesEnabled = false, which I'm doing above, but it makes no difference in my scenario. Why is this?
I am afraid that 2.5 sec for attaching an object graph with 10000 entities is "normal". It's probably the entity snapshot creation, which takes place when you attach the graph, that takes this time.
If "only a small number are usually changed" - say 100 - you could consider loading the original entities from the database and changing their properties instead of attaching the whole graph, for example:
using (var context = new MyEntities())
{
    // try with and without this line
    // context.Configuration.AutoDetectChangesEnabled = false;
    foreach (var child in myEntity.Children)
    {
        if (child.IsModified)
        {
            var childInDb = context.Children.Find(child.Id);
            context.Entry(childInDb).CurrentValues.SetValues(child);
        }
        //... etc.
    }
    //... etc.
    context.SaveChanges();
}
Although this will create a lot of single database queries, only "flat" entities without navigation properties will be loaded and attaching (that occurs when calling Find) won't consume much time. To reduce the number of queries you could also try to load entities of the same type as a "batch" using a Contains query:
var modifiedChildIds = myEntity.Children
    .Where(c => c.IsModified).Select(c => c.Id);
// one DB query
context.Children.Where(c => modifiedChildIds.Contains(c.Id)).Load();
foreach (var child in myEntity.Children)
{
    if (child.IsModified)
    {
        // no DB query because the children are already loaded
        var childInDb = context.Children.Find(child.Id);
        context.Entry(childInDb).CurrentValues.SetValues(child);
    }
}
It's just a simplified example under the assumption that you only have to change scalar properties of the entities. It can become arbitrarily more complex if modifications of relationships (children have been added to and/or deleted from the collections, etc.) are involved.

Having a hard time with Entity Framework detached POCO objects

I want to use EF DbContext/POCO entities in a detached manner, i.e. retrieve a hierarchy of entities from my business tier, make some changes, then send the entire hierarchy back to the business tier to persist back to the database. Each BLL call uses a different instance of the DbContext. To test this I wrote some code to simulate such an environment.
First I retrieve a Customer plus related Orders and OrderLines:-
Customer customer;
using (var context = new TestContext())
{
    customer = context.Customers.Include("Orders.OrderLines").SingleOrDefault(o => o.Id == 1);
}
Next I add a new Order with two OrderLines:-
var newOrder = new Order { OrderDate = DateTime.Now, OrderDescription = "Test" };
newOrder.OrderLines.Add(new OrderLine { ProductName = "foo", Order = newOrder, OrderId = newOrder.Id });
newOrder.OrderLines.Add(new OrderLine { ProductName = "bar", Order = newOrder, OrderId = newOrder.Id });
customer.Orders.Add(newOrder);
newOrder.Customer = customer;
newOrder.CustomerId = customer.Id;
Finally I persist the changes (using a new context):-
using (var context = new TestContext())
{
    context.Customers.Attach(customer);
    context.SaveChanges();
}
I realise this last part is incomplete, as no doubt I'll need to change the state of the new entities before calling SaveChanges(). Do I Add or Attach the customer? Which entities' states will I have to change?
Before I can get to this stage, running the above code throws an Exception:
An object with the same key already exists in the ObjectStateManager.
It seems to stem from not explicitly setting the ID of the two OrderLine entities, so both default to 0. I thought it was fine to do this as EF would handle things automatically. Am I doing something wrong?
Also, working in this "detached" manner, there seems to be a lot of work required to set up the relationships - I have to add the new order entity to the customer.Orders collection, set the new order's Customer property, and its CustomerId property. Is this the correct approach or is there a simpler way?
Would I be better off looking at self-tracking entities? I'd read somewhere that they are being deprecated, or at least being discouraged in favour of POCOs.
You basically have 2 options:
A) Optimistic.
You can proceed pretty close to the way you're proceeding now, and just attach everything as Modified and hope. The code you're looking for instead of .Attach() is:
context.Entry(customer).State = EntityState.Modified;
Definitely not intuitive. This weird looking call attaches the detached (or newly constructed by you) object, as Modified. Source: http://blogs.msdn.com/b/adonet/archive/2011/01/29/using-dbcontext-in-ef-feature-ctp5-part-4-add-attach-and-entity-states.aspx
If you're unsure whether an object has been added or modified you can use the last segment's example:
context.Entry(customer).State = customer.Id == 0 ?
EntityState.Added :
EntityState.Modified;
You need to take these actions on all of the objects being added/modified, so if this object is complex and has other objects that need to be updated in the DB via FK relationships, you need to set their EntityState as well.
Depending on your scenario you can make these kinds of don't-care writes cheaper by using a different Context variation:
public class MyDb : DbContext
{
    . . .
    public static MyDb CheapWrites()
    {
        var db = new MyDb();
        db.Configuration.AutoDetectChangesEnabled = false;
        db.Configuration.ValidateOnSaveEnabled = false;
        return db;
    }
}
using(var db = MyDb.CheapWrites())
{
    db.Entry(customer).State = customer.Id == 0 ?
        EntityState.Added :
        EntityState.Modified;
    db.SaveChanges();
}
You're basically just disabling some extra calls EF makes on your behalf that you're ignoring the results of anyway.
B) Pessimistic. You can actually query the DB to verify the data hasn't changed/been added since you last picked it up, then update it if it's safe.
var existing = db.Customers.Find(customer.Id);
// Some logic here to decide whether updating is a good idea, like
// verifying selected values haven't changed, then
db.Entry(existing).CurrentValues.SetValues(customer);

EntityManager doesn't refresh the data after querying

My current project uses HSQLDB2.0 and JPA2.0 .
The scenario is: I query the DB to get the list of contactDetails of a person. I delete a single contactInfo in the UI but do not save that change (I cancel the save).
When I run the same query again, the result list is one shorter than the previous result because I deleted one contactInfo in the UI. But that contactInfo is still present in the DB if I cross-check.
But if I call entityManager.clear() before running the query, I get correct results every time.
I don't understand this behaviour. Could anyone explain it to me?
Rather than querying again, try this:
entityManager.refresh(person);
A more complete example:
EntityManagerFactory factory = Persistence.createEntityManagerFactory("...");
EntityManager em = factory.createEntityManager();
em.getTransaction().begin();
Person p = (Person) em.find(Person.class, 1);
assertEquals(10, p.getContactDetails().size()); // let's pretend p has 10 contact details
p.getContactDetails().remove(0);
assertEquals(9, p.getContactDetails().size());
Person p2 = (Person) em.find(Person.class, 1);
assertTrue(p == p2); // We're in the same persistence context so p == p2
assertEquals(9, p.getContactDetails().size());
// In order to reload the actual state from the database, refresh the entity
em.refresh(p);
assertTrue(p == p2);
assertEquals(10, p.getContactDetails().size());
assertEquals(10, p2.getContactDetails().size());
em.getTransaction().commit();
em.close();
factory.close();
The behaviour of clear() is explained in its javadoc:
Clear the persistence context, causing all managed entities to become detached. Changes made to entities that have not been flushed to the database will not be persisted.
That is, removal of contactInfo is not persisted.
ContactInfo is not getting removed from the database because you remove the relationship between ContactDetails and ContactInfo, but not ContactInfo itself. If you want to remove it, you need to either do it explicitly with remove() or specify orphanRemoval = true on the relationship.

Convince entity context (EF1) to populate entity references

I have an entity with self reference (generated by Entity Designer):
public class MyEntity : EntityObject
{
    // only relevant stuff here
    public int Id { get...; set...; }
    public MyEntity Parent { get...; set...; }
    public EntityCollection<MyEntity> Children { get...; set...; }
    ...
}
I've written a stored procedure that returns a subtree of nodes (not just immediate children) from the table and returns a list of MyEntity objects. I'm using a stored proc to avoid lazy loading of an arbitrary deep tree. This way I get relevant subtree nodes back from the DB in a single call.
List<MyEntity> nodes = context.GetSubtree(rootId).ToList();
All fine. But when I check nodes[0].Children, its Count equals 0. But if I debug and check the context.MyEntities.Results view, the Children enumerations get populated. Checking my result then reveals children under nodes[0].
How can I programmatically force my entity context to do in-memory magic and put correct references on the Parent and Children properties?
UPDATE 1
I've tried calling
context.Refresh(RefreshMode.ClientWins, nodes);
after my GetSubtree() call which does set relations properly, but fetches same nodes again from the DB. It's still just a workaround. But better than getting the whole set with context.MyEntities().ToList().
UPDATE 2
I've reliably solved this by using EF Extensions project. Check my answer below.
You need to assign one end of the relationship. First, divide the collection:
var root = nodes.Where(n => n.Id == rootId).First();
var children = nodes.Where(n => n.Id != rootId);
Now, fix up the relationship.
In your case, you'd do either:
foreach (var c in children)
{
    c.Parent = root;
}
...or:
foreach (var c in children)
{
    root.Children.Add(c);
}
It doesn't matter which.
Note that this marks the entities as modified. You'll need to change that if you intend to call SaveChanges on the context and don't want this saved.
The REAL solution
Based on this article (read text under The problem), navigation properties are obviously not populated/updated when one uses stored procedures to return data.
But there's a nice manual solution to this. Use the EF Extensions project and write your own entity Materializer<EntityType>, where you can correctly set navigation properties like this:
...
ParentReference = {
    EntityKey = new EntityKey(
        "EntityContextName.ParentEntitySetname",
        new[] {
            new EntityKeyMember(
                "ParentEntityIdPropertyName",
                reader.Field<int>("FKNameFromSP")
            )
        })
}
...
And that's it. Calling the stored procedure will return correct data, and entity object instances will be correctly related to each other. I advise you to check the EF Extensions samples, where you will find lots of nice things.