Using of bulk upate via Query.executeUpdate() vs. object's set methods for multiple entities - jpa

Sometime, in he case when we have a list of entity IDs, I have to update a set of fields of a collection of related of passed IDs entities and I am wandering in which cases is better to use the standard way via 1-st: loading of all entities and 2-nd: calling related set() methods :
List<Long> ids = ....;
String par = "example";
List<User> users = USER_DAO.getUsers(ids);
for(User user : users) {
user.setField(par);
}
or the other way is via executing of a bulk update operations like :
Query query = em.createQuery("UPDATE User user SET user.field =:1? WHERE user.id IN : 2?");
int rowCount = query.executeUpdate();
Do you have some knowledge and/or investigations of that issue ?
Or the cases, when first way is better than the second way or vs. ?
All recommendations are welcome,
Thanks,
SImeon

The second way will perform better as there will be less sql's executed on the database. In your first case two sql's will be executed for each user(one for the select and one for the update). In your first case you will also take a performance hit while hibernate maps the sql to the object.

Related

JPQL can not prevent cache for JPQL query

Two (JSF + JPA + EclipseLink + MySQL) applications share the same database. One application runs a scheduled task where the other one creates tasks for schedules. The tasks created by the first application is collected by queries in the second one without any issue. The second application updates fields in the task, but the changes done by the second application is not refreshed when queried by JPQL.
I have added QueryHints.CACHE_USAGE as CacheUsage.DoNotCheckCache, still, the latest updates are not reflected in the query results.
The code is given below.
How can I get the latest updates done to the database from a JPQL query?
public List<T> findByJpql(String jpql, Map<String, Object> parameters, boolean withoutCache) {
TypedQuery<T> qry = getEntityManager().createQuery(jpql, entityClass);
Set s = parameters.entrySet();
Iterator it = s.iterator();
while (it.hasNext()) {
Map.Entry m = (Map.Entry) it.next();
String pPara = (String) m.getKey();
if (m.getValue() instanceof Date) {
Date pVal = (Date) m.getValue();
qry.setParameter(pPara, pVal, TemporalType.DATE);
} else {
Object pVal = (Object) m.getValue();
qry.setParameter(pPara, pVal);
}
}
if(withoutCache){
qry.setHint(QueryHints.CACHE_USAGE, CacheUsage.DoNotCheckCache);
}
return qry.getResultList();
}
The CacheUsage settings affect what EclipseLink can query using what is in memory, but not what happens after it goes to the database for results.
It seems you don't want to out right avoid the cache, but refresh it I assume so the latest changes can be visible. This is a very common situation when multiple apps and levels of caching are involved, so there are many different solutions you might want to look into such as manual invalidation or even if both apps are JPA based, cache coordination (so one app can send an invalidation even to the other). Or you can control this on specific queries with the "eclipselink.refresh" query hint, which will force the query to reload the data within the cached object with what is returned from the database. Please take care with it, as if used in a local EntityManager, any modified entities that would be returned by the query will also be refreshed and changes lost
References for caching:
https://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Basic_JPA_Development/Caching
https://www.eclipse.org/eclipselink/documentation/2.6/concepts/cache010.htm
Make the Entity not to depend on cache by adding the following lines.
#Cache(
type=CacheType.NONE, // Cache nothing
expiry=0,
alwaysRefresh=true
)

Spring JPA Specification API with custom Query and custom response Object. Is this possible?

I have researched this for a few days but can't seem to find the right information.
Here is what I need, I have a Database, with multiple tables, I need to join a few tables together to make a sort of "search" API. I have to implement the ability to dynamically search fields (from various tables in the query), sortable, with pagination.
I have found that I cannot combine the #Query annotation with Specification API, and I looked into using the specification API to do the joins I needed but, the problem is the root must be one table/repository.
For example:
If I have a users table that has to join on addresses, phone_numbers, and preferences
the base repository will be UserResposiory and it will return the User entity model, but I need it to return a custom DTO
AccountUserDTO which contains fields from the User, Address, PhoneNumber, and Preference entities.
Would anyone know if this is possible at all??
I am at wits end here and I really want to build this the correct way.
Cheers!
You may do this way:
Build hql query as an string, depend on how the filter condition is requested, you can build the corresponding query, eg:
if (hasParam(searchName)) {
queryString = queryString + " myEntity.name = :queryName"
}
Query query = session.createQuery(queryString);
and the parameter providing
if (hasParam(searchName)) {
query.setParameter("queryName", searchName);
}
...
and execute it.
To create a customized object, the easiest way is treating the object as an array of field:
Query query = session.createQuery("select m.f1, m.f2, m.f3 from myTable m");
List managers = query.list();
Object[] manager = (Object[]) managers.get(0); //first row
System.out.println(manager[0]) //f1
System.out.println(manager[1]) //f2
System.out.println(manager[2]) //f3
There is also some other solution to select, such as
String query = "select new mypackage.myclass(m.f1, m.f2, m.f3) from myTable m";
-> And when execute the above query, it will return a list of object.
Or to be simpler, make your own view in db and map it to one entity.

Long Running Query in Entity Framework with multiple table joins

I have a query that joins about 10 tables some that are self referencing tables. I use an "IN" statement for the conditional on the ID column (indexed) of the top most table.
var aryOrderId = DetermineOrdersToGet(); //Logic to determine what orderids to get
var result = dbContext.Orders.Where(o=>aryOrderId.Contains(o.id)
.Include(o=>o.Customer)
.Include(o=>o.Items.Select(oi=>oi.ItemAttributes))
.Include(o=>o.Items.Select(oi=>oi.Dimensions))
.Include(o=>o.CustomOptions.Select(oc => oc.CustomOptions1))
.....A Bunch more.....
.ToList();
I would like to figure out a way to speed this up without redesigning my tables and flattening out the structure. Currently 50-200 records take 10-20 seconds.
This data can be read only. I don't need to update these records.
Can I convert this to a stored procedure?
How hard is this to do?
Will I be able to get noticeable performance gains?
One of the slower parts of the database query is the transport of the selected data from the DBMS to your local process. Hence it is wise to select only the properties you actually plan to use.
For example, it seems that an Order has zero or more ItemAttributes. Evey ItemAttribute belongs to exactly one Order, using a foreign key OrderId.
If you fetch all Orders with Id in ArryOrderId, each order with its thousand ItemAttributes, you know that every ItemAttribute will have a foreign key OrderId with the same value as the Id of the Order that it belongs to. It is a waste to send 1000 times the same value.
When querying data using entity framework, always use Select. Select only the properties yo actually plan to use. Only use Include if you intend to change the fetched objects.
var result = dbContext.Orders
.Where(order=>aryOrderId.Contains(order.id)
.Select(order => new
{ // select only the properties you plan to use:
Id = order.Id,
...
Customer = order.Customer.Select(customer => new
{ // again: only the properties you plan to use
Id = order.Customer.Id,
Name = order.Customer.Name,
...
},
ItemAttributes = order.ItemAttributes.Select(itemAttribute => new
{
...
})
.ToList(),
Dimensions = order.Dimensions.Select(dimension => new
{
...
})
.ToList(),
....A Bunch more.....
})
.ToList();
If after selecting only the properties that you actually plan to use, the query still takes too long, think again: do I really need all these properties.
Another solution to limit the execution time is fetching the date 'per page', using Skip / Take. The danger is of course that when you are viewing page 10, the data of page 1 might be changed in a way that page 10 should be interpreted differently.
As jtate mentions, if you don't need everything from the joined tables, don't include them. Instead, utilize .Select() to retrieve just the data you want from the entity and it's associated relationships.
I.e.
var query = dbContext.Orders
.Where(x => aryOrderId.Contains(x => x.OrderId))
.Select(x => new
{
x.OrderId,
x.OrderNumber,
OrderItems = x.Items.Select(i => new
{
i.ItemId,
Attributes = i.Attributes.Select(a => a.AttributeName).ToList(),
Dimensions = i.Dimensions.Select(d => new {d.DimensionId, d.Name}).ToList(),
}).ToList(),
// ...
}).ToList();
You can structure the query, or queries however you like to find an optimal result.
Alternatively you can consider utilizing a view on the database and binding an entity to the view. This option works well for read-only views of data. Provided you include the relevant IDs you can always retrieve the applicable "real" entities at any time to load a details page or perform an action/update against the entity.
Answering your 3 questions. Yes, you can use a stored procedure and that's what I would do in this situation. It is not hard at all; EF makes it quite simple. You can either have it return a new complex type or you can map it to an entity. Since you're saying the data can be readonly, you probably are okay with a basic function import returning a complex type (EF's default behavior). Either way, you will have noticeable performance gains.
For db-first, see http://www.entityframeworktutorial.net/stored-procedure-in-entity-framework.aspx
Basically, you'll follow these steps.
Create the stored procedure on your database
Update the model from the database. When it asks which objects to include, you should be able to select your stored procedure.
Click Finish. EF will generate a complex type that has all the properties returned by your stored procedure, and it will add a signature to your context for executing the stored procedure, so it can be called like this: var results = myContext.myProcedure(param1, param2); There are screenshots of this at the link above.
You can also go in and modify the model to customize the details, such as the name of the complex type and the name of the function (by default the function will match the name of the SP and will return an ObjectResult<T> where T is your complex type, which will be the name of the procedure with "_Result" as a suffix).

When to use EF navigation properties over manual indexed queries?

I'm wondering if there's any time when performing a query like this:
var documentId = report.DocumentId;
return db.Documents.Find(documentId);
Would be better or worse than setting up a virtual/nav property and doing this:
return report.Document
Where DocumentId is the PK on that table.
Are they the same thing?
I'd say if it's the document related to the report that you're after then there's no reason not to do report.Document. Both your options should perform the same query. What you could do which may provide some benefit is to use Include:
db.Reports.Include(x => x.Document).First(report => report.Id == id)
This would allow you to retrieve the report and its related document in a single query. i.e. the document wouldn't be lazy loaded whenever your code reached report.Document.

Attempting to use EF/Linq to Entities for dynamic querying and CRUD operations

(as advised re-posting this question here... originally posted in msdn forum)
I am striving to write a "generic" routine for some simple CRUD operations using EF/Linq to Entities. I'm working in ASP.NET (C# or VB).
I have looked at:
Getting a reference to a dynamically selected table with "GetObjectByKey" (But I don't want anything from cache. I want data from database. Seems like not what this function is intended for).
CRM Dynamic Entities (here you can pass a tablename string to query) looked like the approach I am looking for but I don't get the idea that this CRM effort is necessarily staying current (?) and/or has much assurance for the future??
I looked at various ways of drilling thru Namespaces/Objects to get to where I could pass a TableName parameter into the oft used query syntax var query = (from c in context.C_Contacts select c); (for example) where somehow I could swap out the "C_Contacts" TEntity depending on which table I want to work with. But not finding a way to do this ??
Slightly over-simplyfing, I just want to be able to pass a tablename parameter and in some cases some associated fieldnames and values (perhaps in a generic object?) to my routine and then let that routine dynamically plug into LINQ to Entity data context/model and do some standard "select all" operations for parameter table or do a delete to parameter table based on a generic record id. I'm trying to avoid calling the various different automatically generated L2E methods based on tablename etc...instead just trying to drill into the data context and ultimately the L2E query syntax for dynamically passed table/field names.
Has anyone found any successful/efficient approaches for doing this? Any ideas, links, examples?
The DbContext object has a generic Set() method. This will give you
from c in context.Set<Contact>() select c
Here's method when starting from a string:
public void Test()
{
dynamic entity = null;
Type type = Type.GetType("Contract");
entity = Activator.CreateInstance(type);
ProcessType(entity);
}
public void ProcessType<TEntity>(TEntity instance)
where TEntity : class
{
var result =
from item in this.Set<TEntity>()
select item;
//do stuff with the result
//passing back to the caller can get more complicated
//but passing it on will be fine ...
}