How to avoid N+1 queries with Spring Data REST Projections?

How to avoid N+1 queries with Spring Data REST Projections? - spring-data-jpa

I have been prototyping my new application in Spring Data REST backed by Spring Data JPA & Hibernate, which has been a fantastic productivity booster for my team, but as the data model becomes more complex the performance is going down the tubes. Looking at the executed SQL, I see two separate but related problems:
When using a Projection with only a few properties to reduce the size of my payload, SDR is still loading the entire entity graph, with all the overhead that incurs. EDIT: filed DATAREST-1089
There seems to be no way to specify eager loading using JPA, as SDR auto-generates the repository methods so I can't add #EntityGraph to them. (and per DATAREST-905 below, even that doesn't work) EDIT: addressed in Cepr0's answer below, though this can only be applied to each finder method once. See DATAJPA-749
I have one key model that I use several different projections of depending on the context (list page, view page, autocomplete, related item page, etc), so implementing one custom ResourceProcessor doesn't seem like a solution.)
Has anyone found a way around these problems? Otherwise anyone with a non-trivial object graph will see performance deteriorate drastically as their model grows.
My research:
How do I avoid n+1 queries with Spring Data Rest? (from 2013)
https://jira.spring.io/browse/DATAJPA-466
(
Add support for lazy loading configuration via JPA 2.1 fetch-/loadgraph.)
https://jira.spring.io/browse/DATAREST-905 (
No way to avoid loading all child relations in spring-data-rest?) (2016, unanswered)

To fight with 1+N issue I use the following two approaches:
#EntityGraph
I use '#EntityGraph' annotation in Repository for findAll method. Just override it:
#Override
#EntityGraph(attributePaths = {"author", "publisher"})
Page<Book> findAll(Pageable pageable);
This approach is suitable for all "reading" methods of Repository.
Cache
I use cache to reduce the impact of 1+N issue for complex Projections.
Suppose we have Book entity to store the book data and Reading entity to store the information about the number of readings of a specific Book and its reader rating. To get this data we can make a Projection like this:
#Projection(name = "bookRating", types = Book.class)
public interface WithRatings {
String getTitle();
String getIsbn();
#Value("#{#readingRepo.getBookRatings(target)}")
Ratings getRatings();
}
Where readingRepo.getBookRatings is the method of ReadingRepository:
#RestResource(exported = false)
#Query("select avg(r.rating) as rating, count(r) as readings from Reading r where r.book = ?1")
Ratings getBookRatings(Book book);
It also return a projection that store "rating" info:
#JsonSerialize(as = Ratings.class)
public interface Ratings {
#JsonProperty("rating")
Float getRating();
#JsonProperty("readings")
Integer getReadings();
}
The request of /books?projection=bookRating will cause the invocation of readingRepo.getBookRatings for every Book which will lead to redundant N queries.
To reduce the impact of this we can use the cache:
Preparing the cache in the SpringBootApplication class:
#SpringBootApplication
#EnableCaching
public class Application {
//...
#Bean
public CacheManager cacheManager() {
Cache bookRatings = new ConcurrentMapCache("bookRatings");
SimpleCacheManager manager = new SimpleCacheManager();
manager.setCaches(Collections.singletonList(bookRatings));
return manager;
}
}
Then adding a corresponding annotation to readingRepo.getBookRatings method:
#Cacheable(value = "bookRatings", key = "#a0.id")
#RestResource(exported = false)
#Query("select avg(r.rating) as rating, count(r) as readings from Reading r where r.book = ?1")
Ratings getBookRatings(Book book);
And implementing the cache eviction when Book data is updated:
#RepositoryEventHandler(Reading.class)
public class ReadingEventHandler {
private final #NonNull CacheManager cacheManager;
#HandleAfterCreate
#HandleAfterSave
#HandleAfterDelete
public void evictCaches(Reading reading) {
Book book = reading.getBook();
cacheManager.getCache("bookRatings").evict(book.getId());
}
}
Now all subsequent requests of /books?projection=bookRating will get rating data from our cache and will not cause redundant requests to the database.
More info and working example is here.

Related

Simple Tagging Implementation with Spring Data JPA/Rest

I am trying to come up with a way of implementing tags for my entity that works well for me and need some help in the process. Let me write down some requirements I have in mind:
Firstly, I would like tags to show in entities as a list of strings like this:
{
"tags": ["foo", "bar"]
}
Secondly, I need to be able to retrieve a set of available tags across all entities so that users can easily choose from existing tags.
The 2nd requirement could be achieved by creating a Tag entity with the value of the Tag as the #Id. But that would make the tags property in my entity a relation that requires an extra GET operation to fetch. I could work with a getter method that resolves all the Tags and returns only a list of strings, but I see two disadvantages in that: 1. The representation as a list of strings suggests you could store tags by POSTing them in that way which is not the case. 2. The process of creating an entity requires to create all the Tags via a /tags endpoint first. That seem rather complicated for such a simple thing.
Also, I think I read somewhere that you shouldn't create a repository for an entity that isn't standalone. Would I create a Tag and only a Tag at any point in time? Nope.
I could store the tags as an #ElementCollection in my entity. In this case I don't know how to fulfill the 2nd requirement, though.
#ElementCollection
private Set<String> tags;
I made a simple test via EntityManager but it looks like I cannot query things that are not an #Entity in a result set.
#RestController
#RequestMapping("/tagList")
#RequiredArgsConstructor(onConstructor = #__(#Autowired))
public class TagListController implements RepresentationModelProcessor<RepositoryLinksResource> {
#PersistenceContext
private final #NonNull EntityManager entityManager;
#RequestMapping(method = RequestMethod.GET)
public ResponseEntity<EntityModel<TagList>> get() {
System.out.println(entityManager.createQuery("SELECT t.tags FROM Training t").getFirstResult());
EntityModel<TagList> model = EntityModel.of(new TagList(Set.of("foo", "bar")));
model.add(linkTo(methodOn(TagListController.class).get()).withSelfRel());
return ResponseEntity.ok(model);
}
}
org.hibernate.QueryException: not an entity
Does anyone know a smart way?

The representation as a list of strings suggests you could store tags by POSTing them in that way which is not the case
This is precisely the issue with using entities as REST resource representations. They work fine until it turns out the internal representation (entity) does not match the external representation (the missing DTO).
However, it would probably make most sense performance-wise to simply use an #ElementCollection like you mentioned, because you then don't have the double join with a join table for the many-to-many association (you could also use a one-to-many association where the parent entity and the tag value are both part of the #Id to avoid a join table, but I'm not sure it's convenient to work with. Probably better to just put a UNIQUE(parent_id, TAG) constraint on the collection table, if you need it). Regarding the not an entity error, you would need to use a native query. Assuming that you have #ElementCollection #CollectionTable(name = "TAGS") #Column(name = "TAG") on tags, then SELECT DISTINCT(TAG) FROM TAGS should do the job.
(as a side note, the DISTINCT part of the query will surely introduce some performance penalty, but I would assume the result of that query is a good candidate for caching)

JPQL can not prevent cache for JPQL query

Two (JSF + JPA + EclipseLink + MySQL) applications share the same database. One application runs a scheduled task where the other one creates tasks for schedules. The tasks created by the first application is collected by queries in the second one without any issue. The second application updates fields in the task, but the changes done by the second application is not refreshed when queried by JPQL.
I have added QueryHints.CACHE_USAGE as CacheUsage.DoNotCheckCache, still, the latest updates are not reflected in the query results.
The code is given below.
How can I get the latest updates done to the database from a JPQL query?
public List<T> findByJpql(String jpql, Map<String, Object> parameters, boolean withoutCache) {
TypedQuery<T> qry = getEntityManager().createQuery(jpql, entityClass);
Set s = parameters.entrySet();
Iterator it = s.iterator();
while (it.hasNext()) {
Map.Entry m = (Map.Entry) it.next();
String pPara = (String) m.getKey();
if (m.getValue() instanceof Date) {
Date pVal = (Date) m.getValue();
qry.setParameter(pPara, pVal, TemporalType.DATE);
} else {
Object pVal = (Object) m.getValue();
qry.setParameter(pPara, pVal);
}
}
if(withoutCache){
qry.setHint(QueryHints.CACHE_USAGE, CacheUsage.DoNotCheckCache);
}
return qry.getResultList();
}

The CacheUsage settings affect what EclipseLink can query using what is in memory, but not what happens after it goes to the database for results.
It seems you don't want to out right avoid the cache, but refresh it I assume so the latest changes can be visible. This is a very common situation when multiple apps and levels of caching are involved, so there are many different solutions you might want to look into such as manual invalidation or even if both apps are JPA based, cache coordination (so one app can send an invalidation even to the other). Or you can control this on specific queries with the "eclipselink.refresh" query hint, which will force the query to reload the data within the cached object with what is returned from the database. Please take care with it, as if used in a local EntityManager, any modified entities that would be returned by the query will also be refreshed and changes lost
References for caching:
https://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Basic_JPA_Development/Caching
https://www.eclipse.org/eclipselink/documentation/2.6/concepts/cache010.htm

Make the Entity not to depend on cache by adding the following lines.
#Cache(
type=CacheType.NONE, // Cache nothing
expiry=0,
alwaysRefresh=true
)

Spring Data JPA finder for dynamic fields as Map

My requirement is to have few custom fields in the domain objects. These fields may vary as per the clients.
We are using Spring Data JPA to execute finders. Spring data implicitly provides finders for static fields of the domain and can also handle finders for the fields in the object graph.
I want to know if there is a way to find data on the custom fields? Can someone suggest me a strategy to achieve the same. Below is the sample of my domain class.
public class Employee{
private String name;
private String age;
private Map customeFields; (May vary as per client)
}
I was thinking of overriding QueryLookupStrategy and create my CustomJpaQuery on lines of PartTreeJpaQuery to achieve it. Is there any better approach? Does spring data jpa provides an easy mechanism to override query creation mechanism?

If you are using hibernate (not sure about other JPA implementations) you may add methods with #Query annotations like this:
#Query("select e from Employee as e where e.customeFields[:key] = :value")
List<Employee> findSomeHow(#Param("key") String key, #Param("value") String value)

Entity Framework, performance tuning

I have entity "Ideas", which has child entity collection "ChildIdeas". I need to load list of ideas and count of "ChildIdeas" (only count!).
I can do:
eager loading
from i in _dataContext.Ideas.Include("ChildIdeas") ...
advantages : all necessary data got by one request;
disadvantages : load unnecessary data. I need only count of ChildIdeas, not full ChildIdeas list
Explicit loading
from i in _dataContext.Ideas ...
idea.ChildIdeas.Loading()
advantages : none;
disadvantages : many requests (ideas.count + 1) instead of one, load unnecessary data
Independent requests
from i in _dataContext.Ideas ...
_repository.GetCountChildIdeas(idea.ID);
advantages : load only necessary data;
disadvantages : many requests (ideas.count + 1) instead of one
all 3 types have disadvantages. Maybe is exist any way to load only necessary data? If yes - what is it, if no - which way is the best for this case?
[ADDED]
after load testing (for 1 user) I got Page Load (in sec):
eager Child Ideas - 1.31 sec
explicit Child Ideas - 1.19 sec
external requests - 1.14 sec
so, eager way for my case is the worst... Why even explicit way is better?

You should use projection. Count of child ideas is not part of persisted Idea entity so create new non-mapped type containing all properties from Idea entity and Count property.
public class IdeaProjection
{
public int Id { get; set; }
// Other properties
public int Count { get; set; }
}
Now you can use simple projection query to get everything with single request without loading any additional data:
var query = from x in context.Ideas
where ...
select new IdeaProjection
{
Id = x.Id,
// Mapped other properties
Count = x.ChildIdeas.Count()
};
Disadvantage is that IdeaProjection is not entity and if you want to use it for updates as well you must transform it back to Idea and tell EF about changes. From performance perspective it is best you can get from EF without reverting back to SQL or stored procedures.

How to prepare data for display on a silverlight chart using WCF RIA Services + Entity Framework

I've used WCF RIA services with Entity Framework to build a simple application which can display and updates data about school courses. This was done by following the Microsoft tutorials. Now I would like to have a chart which shows a count for how many courses are on a key stage.
Example:
Key Stage 3 - 20 courses
Key Stage 4 - 32 courses
Key Stage 5 - 12 courses
Displayed on any form of chart. I have no problem binding data to the chart in XAML. My problem is that I do not know how to correct way of getting the data into that format. The generated CRUD methods are basic.
I have a few thoughts about possible ways, but don't know which is correct, they are:
Create a View in SQL server and map this to a separate Entity in the Entity Data Model. Generating new CRUD methods for this automatically.
Customise the read method in the existing DomainService using .Select() .Distinct() etc. Don't know this syntax very well labda expressions/LINQ??? what is it? Any good quickstarts on it?
Create a new class to store only the data required and create a read method for it. Tried this but didn't know how to make it work without a matching entity in the entity model.
Something I am not aware of.
I'm very new to this and struggling with the concepts so if there are useful blogs or documentation I've missed feel free to point me towards them. But I'm unsure of the terminology to use in my searches at the moment.

One way to is to build a model class. A model is a class that represents the data you wish to display. For example i might have a table with 10 fields but i only need to display 2. Create a model with these two properties and return that from your data layer.
you can use entity framework to pump data into a new class like so
Model Class:
public class Kitteh
{
public string Name { get; set; }
public int Age { get; set; }
}
Entity Query:
public Iqueryable<Kitteh> getKittehz
{
var result = from x in Data.TblCats
select new Kitteh
{
Name = x.Name,
Age = x.Age
}
return result;
}
If you are interested in the best practices approach to building silverlight applications I would suggest you research the MVVM pattern.
http://www.silverlight.net/learn/videos/silverlight-4-videos/mvvm-introduction/
http://www.silverlight.net/learn/tutorials/silverlight-4/using-the-mvvm-pattern-in-silverlight-applications/

I am attempting a similar piece of work.
I will tell you the approach I am going to use and maybe that can help you.
I am going to create a class in the silverlight project to describe the chartItem: It will have 2 string properties : Key and Value.
Then create a collection object...In your case, this could be a class that has one property of type Dictionary<string,string> myCollection... or ObservableCollection<ChartItem> myCollection
The next step is to do a ForEach loop on the data coming back from the server and Add to your Collection.
myCollection.Add(new chartItem{ Key= "Key Stage 3", Value = "20 Courses" });
myCollection.Add(new chartItem{ Key= "Key Stage 4", Value = "60 Courses" });
myCollection.Add(new chartItem{ Key= "Key Stage 5", Value = "10 Courses" });
... more to follow if you are still looking for an answer

There is no easy way to include Views in Entity Framework as it does not allow any table/view to be included without "Key" (PrimaryKey) which will cause more efforts as you will have to map view manually in EDMX and then map keys etc.
Now we have found out an alternative approach,
Create View called ChartItems in your DB
Create LinqToSQL file ViewDB
Drag View ChartItems in ViewDB
Create ChartItem[] GetChartItems method in your RIA Domain Service Class as follow
public ChartItem[] GetChartItems(..parameters...){
ViewDB db = new ViewDB();
return db.ChartItems.Where(...query mapping...).ToArray();
}
RIA Domain Service Class can contain any arbitrary method that you can directly invoke from client with parameters. It is as simple as calling a web service. And you have to return an array because IQueryable may or may not work in some cases, but we prefer Array. You can try IQueryable but it may not work correctly against linq to SQL.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

How to avoid N+1 queries with Spring Data REST Projections? - spring-data-jpa

Related

Simple Tagging Implementation with Spring Data JPA/Rest

JPQL can not prevent cache for JPQL query

Spring Data JPA finder for dynamic fields as Map

Entity Framework, performance tuning

How to prepare data for display on a silverlight chart using WCF RIA Services + Entity Framework

Categories

Resources