JPQL can not prevent cache for JPQL query - jpa

Two (JSF + JPA + EclipseLink + MySQL) applications share the same database. One application runs a scheduled task where the other one creates tasks for schedules. The tasks created by the first application is collected by queries in the second one without any issue. The second application updates fields in the task, but the changes done by the second application is not refreshed when queried by JPQL.
I have added QueryHints.CACHE_USAGE as CacheUsage.DoNotCheckCache, still, the latest updates are not reflected in the query results.
The code is given below.
How can I get the latest updates done to the database from a JPQL query?
public List<T> findByJpql(String jpql, Map<String, Object> parameters, boolean withoutCache) {
TypedQuery<T> qry = getEntityManager().createQuery(jpql, entityClass);
Set s = parameters.entrySet();
Iterator it = s.iterator();
while (it.hasNext()) {
Map.Entry m = (Map.Entry) it.next();
String pPara = (String) m.getKey();
if (m.getValue() instanceof Date) {
Date pVal = (Date) m.getValue();
qry.setParameter(pPara, pVal, TemporalType.DATE);
} else {
Object pVal = (Object) m.getValue();
qry.setParameter(pPara, pVal);
}
}
if(withoutCache){
qry.setHint(QueryHints.CACHE_USAGE, CacheUsage.DoNotCheckCache);
}
return qry.getResultList();
}

The CacheUsage settings affect what EclipseLink can query using what is in memory, but not what happens after it goes to the database for results.
It seems you don't want to out right avoid the cache, but refresh it I assume so the latest changes can be visible. This is a very common situation when multiple apps and levels of caching are involved, so there are many different solutions you might want to look into such as manual invalidation or even if both apps are JPA based, cache coordination (so one app can send an invalidation even to the other). Or you can control this on specific queries with the "eclipselink.refresh" query hint, which will force the query to reload the data within the cached object with what is returned from the database. Please take care with it, as if used in a local EntityManager, any modified entities that would be returned by the query will also be refreshed and changes lost
References for caching:
https://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Basic_JPA_Development/Caching
https://www.eclipse.org/eclipselink/documentation/2.6/concepts/cache010.htm

Make the Entity not to depend on cache by adding the following lines.
#Cache(
type=CacheType.NONE, // Cache nothing
expiry=0,
alwaysRefresh=true
)

Related

Is it possible to improve EF6 WarmUp time?

I have an application in which I verify the following behavior: the first requests after a long period of inactivity take a long time, and timeout sometimes.
Is it possible to control how the entity framework manages dispose of the objects? Is it possible mark some Entities to never be disposed?
...in order to avoid/improve the warmup time?
Regards,
The reasons that similar queries will have an improved response time are manifold.
Most Database Management Systems cache parts of the fetched data, so that similar queries in the near future will be faster. If you do query Teachers with their Students, then the Teachers table will be joined with the Students table. This join result is quite often cached for a while. The next query for Teachers with their Students will reuse this join result and thus become faster
DbContext caches queried object. If you select a Single teacher, or Find one, it is kept in local memory. This is to be able to detect which items are changed when you call SaveChanges. If you Find the same Teacher again, this query will be faster. I'm not sure if the same happens if you query 1000 Teachers.
When you create a DbContext object, the initializer is checked to see if the model has been changed or not.
So it might seem wise not to Dispose() a created DbContext, yet you see that most people keep the DbContext alive for a fairly short time:
using (var dbContext = new MyDbContext(...))
{
var fetchedTeacher = dbContext.Teachers
.Where(teacher => teacher.Id = ...)
.Select(teacher => new
{
Id = teacher.Id,
Name = teacher.Name,
Students = teacher.Students.ToList(),
})
.FirstOrDefault();
return fetchedTeacher;
}
// DbContext is Disposed()
At first glance it would seem that it would be better to keep the DbContext alive. If someone asks for the same Teacher, the DbContext wouldn't have to ask the database for it, it could return the local Teacher..
However, keeping a DbContext alive might cause that you get the wrong data. If someone else changes the Teacher between your first and second query for this Teacher, you would get the old Teacher data.
Hence it is wise to keep the life time of a DbContext as short as possible.
Is there nothing I can do to improve the speed of the first query?
Yes you can!
One of the first things you could do is to set the initialize of your database such that it doesn't check the existence and model of the database. Of course you can only do this when you are fairly sure that your database exists and hasn't changed.
// constructor; disables initializer
public SchoolDBContext() : base(...)
{
//Disable initializer
Database.SetInitializer<SchoolDBContext>(null);
}
Another thing could be, if you already have fetched your object to update the database, and you are sure that no one else changed the object, you can Attach it, instead of fetching it again, as is shown in this question
Normal usage:
// update the name of the teacher with teacherId
void ChangeTeacherName(int teacherId, string name)
{
using (var dbContext = new SchoolContext(...))
{
// fetch the teacher, change the name and save
Teacher fetchedTeacher = dbContext.Teachers.Find(teacherId);
fetchedTeader.Name = name;
dbContext.SaveChanges();
}
}
Using Attach to update an earlier fetched Teacher:
void ChangeTeacherName (Teacher teacher, string name)
{
using (var dbContext = new SchoolContext(...))
{
dbContext.Teachers.Attach(teacher);
dbContext.Entry(teacher).Property(t => t.Name).IsModified = true;
dbContext.SaveChanges();
}
}
Using this method doesn't require to fetch the Teacher again. During SaveChanges the value of IsModified of all properties of all Attached items is checked. If needed they will be updated.

Entity Framework 6: is there a way to iterate through a table without holding each row in memory

I would like to be able to iterate through every row in an entity table without holding every row in memory. This is a read only operation and every row can be discarded after being processed.
If there is a way to discard the row after processing that would be fine. I know that this can be achieved using a DataReader (which is outside the scope of EF), but can it be achieved within EF?
Or is there a way to obtain a DataReader from within EF without directly using SQL?
More detailed example:
Using EF I can code:
foreach (Quote in context.Quotes)
sw.WriteLine(sw.QuoteId.ToString()+","+sw.Quotation);
but to achieve the same result with a DataReader I need to code:
// get the connection to the database
SqlConnection connection = context.Database.Connection as SqlConnection;
// open a new connection to the database
connection.Open();
// get a DataReader for our table
SqlCommand command = new SqlCommand(context.Quotes.ToString(), connection);
SqlDataReader dr = command.ExecuteReader();
// get a recipient for our database fields
object[] L = new object[dr.FieldCount];
while (dr.Read())
{
dr.GetValues(L);
sw.WriteLine(((int)L[0]).ToString() + "," + (string)L[1]);
}
The difference is that the former runs out of memory (because it is pulling in the entire table in the client memory) and the later runs to completion (and is much faster) because it only retains a single row in memory at any one time.
But equally importantly the latter example loses the Strong Typing of EF and should the database change, errors can be introduced.
Hence, my question: can we get a similar result with strongly typed rows coming back in EF?
Based on your last comment, I'm still confused. Take a look at both of below code.
EF
using (var ctx = new AppContext())
{
foreach (var order in ctx.Orders)
{
Console.WriteLine(order.Date);
}
}
Data Reader
var constr = ConfigurationManager.ConnectionStrings["AppContext"].ConnectionString;
using (var con = new SqlConnection(constr))
{
con.Open();
var cmd = new SqlCommand("select * from dbo.Orders", con);
var reader = cmd.ExecuteReader();
while (reader.Read())
{
Console.WriteLine(reader["Date"]);
}
}
Even though EF has few initial query, both of them execute similar query that can be seen from profiler..
I haven't tested it, but try foreach (Quote L in context.Quotes.AsNoTracking()) {...}. .AsNoTracking() should not put entities in cache so I assume they will be consumed by GC when they out of the scope.
Another option is to use context.Entry(quote).State = EntityState.Detached; in the foreach loop. Should have the similar effect as the option 1.
Third option (should definitely work, but require more coding) would be to implement batch processing (select top N entities, process, select next top N). In this case make sure that you dispose and create new context every iteration (so GC can eat it:)) and use proper OrderBy() in the query.
You need to use an EntityDataReader, which behaves in a way similar to a traditional ADO.NET DataReader.
The problem is that, to do so, you need to use ObjectContext instead of DbContext, which makes things harder.
See this SO answer, not the acepted one: How can I return a datareader when using Entity Framework 4?
Even if this referes to EF4, in EF6 things work in the same way. Usually an ORM is not intended for streaming data, that's why this functionality is so hidden.
You can also look at this project: Entity Framework (Linq to Entities) to IDataReader Adapter
I have done this by pages.
And cleaning the Context after each page load.
Sample:
Load first 50 rows
Iterate over them
Clean the Context or create a new one.
Load second 50 rows
...
Clean the Context = Set all its Entries as Detached.

Using of bulk upate via Query.executeUpdate() vs. object's set methods for multiple entities

Sometime, in he case when we have a list of entity IDs, I have to update a set of fields of a collection of related of passed IDs entities and I am wandering in which cases is better to use the standard way via 1-st: loading of all entities and 2-nd: calling related set() methods :
List<Long> ids = ....;
String par = "example";
List<User> users = USER_DAO.getUsers(ids);
for(User user : users) {
user.setField(par);
}
or the other way is via executing of a bulk update operations like :
Query query = em.createQuery("UPDATE User user SET user.field =:1? WHERE user.id IN : 2?");
int rowCount = query.executeUpdate();
Do you have some knowledge and/or investigations of that issue ?
Or the cases, when first way is better than the second way or vs. ?
All recommendations are welcome,
Thanks,
SImeon
The second way will perform better as there will be less sql's executed on the database. In your first case two sql's will be executed for each user(one for the select and one for the update). In your first case you will also take a performance hit while hibernate maps the sql to the object.

JPA eclipselink cache refresh automatically

Currently using eclipselink JPA provider for accessing backend entities. I'm using namedqueries for accessing data and using the below options on query caching.
#NamedQueries({
#NamedQuery(name = Supplier.FIND_ALL, query = "select o from Supplier o",hints={
#QueryHint(name=QueryHints.READ_ONLY, value=HintValues.TRUE),
#QueryHint(name = QueryHints.QUERY_RESULTS_CACHE, value = HintValues.TRUE),
#QueryHint(name = QueryHints.CACHE_STATMENT, value = HintValues.TRUE),
#QueryHint(name = QueryHints.CACHE_STORE_MODE, value = "REFRESH"),
#QueryHint(name = QueryHints.CACHE_RETRIEVE_MODE,value=CacheUsage.CheckCacheThenDatabase),
}),
})
Also i'm using the below Cache options on the entity as well.
#Cache(refreshOnlyIfNewer=true,
coordinationType=CacheCoordinationType.SEND_OBJECT_CHANGES,alwaysRefresh=true)
The query seems to be taking sometime for the first time, but is pretty fast for further retrievals ( b'cozs of QueryHints.QUERY_RESULTS_CACHE, value = HintValues.TRUE ). But it seems any changes to the database subsequently, are not getting reflected in the output. It seems the cache is not getting refreshed and updated changes to the database are not reflected in the output.
Require help on the same.
Thanks,
Krishna
You settings don't make a lot of sense.
QueryHints.QUERY_RESULTS_CACHE is the correct way to enable the query cache. But you should not set the others,
QueryHints.CACHE_STATMENT - this is JDBC statement caching, very odd to be setting this on a query, normally this is configured for all statements in the DatabaseSource, or persistence unit config if using EclipseLink connection pooling.
QueryHints.CACHE_STORE_MODE - I'm not sure this makes sense, you cannot refresh and cache the query.
QueryHints.CACHE_RETRIEVE_MODE,value=CacheUsage.CheckCacheThenDatabase, this makes no sense, CacheUsage is not for this property, and CheckCacheThenDatabase makes no sense for a query that is using a query cache.
EclipseLink has two types of caches, the "object cache" (by Id) and the "query cache" (by query name and parameters). When you enable a query cache, the query results will be cached until they expire. By default there is no expiry, but you can configure it, or manually clear the query cache. The query cache will not be updated with database changes.
See,
http://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Basic_JPA_Development/Caching/Query_Cache

MVC 1.0 + EF: Does db.EntitySet.where(something) still return all rows in table?

In a repository, I do this:
public AgenciesDonor FindPrimary(Guid donorId) {
return db.AgenciesDonorSet.Include("DonorPanels").Include("PriceAdjustments").Include("Donors").First(x => x.Donors.DonorId == donorId && x.IsPrimary);
}
then down in another method in the same repository, this:
AgenciesDonor oldPrimary = this.FindPrimary(donorId);
In the debugger, the resultsview shows all records in that table, but:
oldPrimary.Count();
is 1 (which it should be).
Why am I seeing all table entries retrieved, and not just 1? I thought row filtering was done in the DB.
If db.EntitySet really does fetch everything to the client, what's the right way to keep the client data-lite using EF? Fetching all rows won't scale for what I'm doing.
You will see everything if you hover over the AgenciesDonorSet because LINQ to Entities (or SQL) uses delayed execution. When the query is actually executed, it is just retrieving the count.
If you want to view the SQL being generated for any query, you can add this bit of code:
var query = queryObj as ObjectQuery; //assign your query to queryObj rather than returning it immediately
if (query != null)
{
System.Diagnostics.Trace.WriteLine(context);
System.Diagnostics.Trace.WriteLine(query.ToTraceString());
}
Entity Set does not implement IQueryable, so the extension methods that you're using are IEnumerable extension methods. See here:
http://social.msdn.microsoft.com/forums/en-US/linqprojectgeneral/thread/121ec4e8-ce40-49e0-b715-75a5bd0063dc/
I agree that this is stupid, and I'm surprised that more people haven't complained about it. The official reason:
The design reason for not making
EntitySet IQueryable is because
there's not a clean way to reconcile
Add\Remove on EntitySet with
IQueryable's filtering and
transformation ability.