How to achieve pagination without writing Query in JPA? - jpa

I have a Question object which has a List of Comment objects with a @OneToMany mapping. The Question object has a fetchComments(int offset, int pageSize) method to fetch comments for a given question.
I want to paginate the comments by fetching a limited amount of them at a time.
If I write a Query object then I can set the record offset and the maximum number of records to fetch with Query.setFirstResult(int offset) and Query.setMaxResults(int numberOfResults). But my question is how (if possible) I can achieve the same result without having to write a Query, i.e. with a simple annotation or property. More clearly, I need to know if there is something like
@OneToMany(cascade = CascadeType.ALL)
@Paginate(offset = x, maxresult = y) // is this kind of annotation available?
private List<Comment> comments;
I have read that @Basic(fetch = FetchType.LAZY) only loads the records needed at runtime, but I won't have control over the number of records fetched there.
I'm new to JPA. So please consider if I've missed something really simple.

No, JPA has no such functionality. The concept itself is also a bit confusing: in your example the offset (and maxresult as well) would be a compile-time constant, which does not serve pagination well. In general, JPA annotations on entities define structure, not context-dependent results (that is what queries are for).
If fetching entities only when they are accessed in the list is enough, and you are using Hibernate, then the closest you can get is an extra-lazy collection via @LazyCollection:
@org.hibernate.annotations.LazyCollection(LazyCollectionOption.EXTRA)

Related

how to read complex data from multiple related database tables in spring batch

I intend to implement the batch to read data from various DB tables to populate the complex domain below, then perform calculation in processor and load the data into DB via writer.
public class A{
private String id;
private String name;
private ArrayList list1;
private ArrayList list2;
......
}
Now, I am stuck on the design of the reader. The idea is to query a DB table to get a list of ids, then query the other fields, including list1 and list2, based on each id. It seems the existing readers cannot fulfill this requirement. Do I need to create a custom reader to achieve the goal? I think I would take the chunk approach, but I have no clue how to implement it.
Code example is much appreciated.
You can use the driving query pattern. The reader reads only the IDs and then the processor can query the details of each object based on the ID.
This is a common pattern and you can find more details about it in the common batch patterns section of the documentation here: https://docs.spring.io/spring-batch/4.0.x/reference/html/common-patterns.html#drivingQueryBasedItemReaders
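The pattern can be sketched in plain Java, without any Spring Batch types. This is only an illustration under assumptions: the class name DrivingQueryJob and its three collaborators are hypothetical stand-ins for an ItemReader that returns ids, an ItemProcessor that loads the full object for one id, and an ItemWriter that receives chunks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Plain-Java sketch of the driving query pattern (hypothetical class, no Spring Batch types). */
final class DrivingQueryJob<K, T> {
    private final Iterator<K> idReader;            // "reader": yields only the keys
    private final Function<K, T> detailProcessor;  // "processor": loads the full item for one key
    private final Consumer<List<T>> chunkWriter;   // "writer": receives one chunk at a time
    private final int chunkSize;

    DrivingQueryJob(Iterator<K> idReader, Function<K, T> detailProcessor,
                    Consumer<List<T>> chunkWriter, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.idReader = idReader;
        this.detailProcessor = detailProcessor;
        this.chunkWriter = chunkWriter;
        this.chunkSize = chunkSize;
    }

    /** Runs the job and returns the number of items handed to the writer. */
    int run() {
        int written = 0;
        List<T> chunk = new ArrayList<>(chunkSize);
        while (idReader.hasNext()) {
            T item = detailProcessor.apply(idReader.next());
            if (item != null) {          // a null result means "skip this id"
                chunk.add(item);
            }
            if (chunk.size() == chunkSize) {
                chunkWriter.accept(chunk);
                written += chunk.size();
                chunk = new ArrayList<>(chunkSize);
            }
        }
        if (!chunk.isEmpty()) {          // flush the last, possibly short, chunk
            chunkWriter.accept(chunk);
            written += chunk.size();
        }
        return written;
    }
}
```

In a real job the id reader would presumably be one of the stock database readers selecting only the id column, and the processor would run the detail queries that fill list1 and list2 for that id.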

JPA Lazy Fetch Custom Query

I am using JPA/JFreeChart to display data I collected with a microcontroller, however, I measure 14 sensors every 10 seconds. I have been measuring for over 2 months and I have over 7000000 sets of data.
Now to my actual problem, since I don't want to load 7000000 rows every time I start my program, I only want to use average values by minutes/hours. I have thought of using a NamedQuery however I don't know how to keep the relationship within it and make JPA use it since up until now the loading of the data has been done by JPA itself. Maybe I can just solve this by adding more annotations to this?
@OneToMany(mappedBy="sensor")
@OrderBy("timestamp ASC")
public List<Value> getValues() {
return this.values;
}
Thanks in advance!
Best Regards
Straight JPA does not allow filtering results, since this means that the entity's relationship no longer reflects exactly what is in the database, and it would have to standardize behavior on what is done when adding an entity to the relationship that isn't in the collection, but already exists in the database.
The easiest way to handle this mapping, though, would be to mark the attribute as @Transient. You can then use the get method to read the values from the database when needed, and cache them in the entity if you want.
Many providers do allow adding filters to the queries used to bring in mappings. For instance, EclipseLink allows setting @AdditionalCriteria on the mapping as described here: http://wiki.eclipse.org/EclipseLink/Development/AdditionalCriteria Or you can modify the mapping directly as shown here: http://wiki.eclipse.org/EclipseLink/Examples/JPA/MappingSelectionCriteria

set batch size for a lazy loaded collection in spring-data-jpa

I need to generate a CSV file containing a database export. Since the data can be quite big, I want to use lazy loading with a specific batch size, so that when I iterate through the collection returned by the DAO/Repository, I will only have a batch loaded at one point. I want this to be done automatically by the collection (e.g. otherwise I could just load page after page, using Pageable as a parameter).
Here is some code to hopefully make things clearer. My controller looks something like this:
public ModelAndView generateCsv(Status status) {
//can return a large number of items.
Collection<Item> items = itemRepository.findByStatus(status);
return new ModelAndView("csv", "items", items);
}
As you can see, I'm passing that collection to the view (through the ModelAndView object), and the view will just iterate through it and generate the CSV.
I want the collection to know how to load the next batch internally, which is what a lazy loaded collection should generally do.
Is there a way to do this with Spring-data or just plain JPA?
I know of ScrollableResults from Hibernate, but I don't like it for two reasons: it's not JPA (I'd have to make my code depend on Hibernate), and it doesn't use the collections API, so I'd have to make my view know about ScrollableResults. If it at least implemented Iterable, that would have made it nicer, but it doesn't.
So what I'm looking for is a way to specify that a collection is to be lazy loaded, using a specific batch size. Maybe something like:
@Query("SELECT o FROM Item o WHERE o.status = ?1")
@Fetch(type = FetchType.LAZY, size = 100)
Page<Item> findByStatus(Item.Status status);
If something like this is not possible using Spring Data, do you know if it can be done with QueryDsl? The fact that QueryDsl repositories return Iterator objects makes me think it might lazy load those, though I can't find documentation on that.
Thanks,
Stef.
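One way to get the behaviour described above without a provider-specific API is to wrap page-by-page loading in an Iterable, so the view can iterate normally while batches are fetched behind the scenes. The following is a minimal sketch, not a Spring Data feature: PagedIterable and its pageFetcher parameter are hypothetical names, and the fetcher is assumed to return at most limit rows starting at offset, in a stable order.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

/** Iterable that pulls its elements batch by batch from a caller-supplied fetcher (hypothetical helper). */
final class PagedIterable<T> implements Iterable<T> {
    private final BiFunction<Integer, Integer, List<T>> pageFetcher; // (offset, limit) -> one batch
    private final int batchSize;

    PagedIterable(BiFunction<Integer, Integer, List<T>> pageFetcher, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.pageFetcher = pageFetcher;
        this.batchSize = batchSize;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private Iterator<T> current = Collections.emptyIterator();
            private int offset = 0;
            private boolean exhausted = false;

            @Override
            public boolean hasNext() {
                // Fetch the next batch only when the current one has been consumed.
                while (!current.hasNext() && !exhausted) {
                    List<T> page = pageFetcher.apply(offset, batchSize);
                    offset += page.size();
                    exhausted = page.size() < batchSize; // a short batch means no more rows
                    current = page.iterator();
                }
                return current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }
}
```

With Spring Data, the fetcher could delegate to a repository method that accepts a Pageable. Only one batch is referenced by the iterator at a time, but the query needs a deterministic ORDER BY, otherwise offset-based batches can skip or repeat rows.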

Refactoring application: Direct database access -> access through REST

We have a huge database application which must be refactored (there are many reasons for this; the biggest one is security).
What we already have:
MySQL Database
JPA2 (Eclipselink) classes for over 100 tables
Client application that accesses the database directly
What needs to be there:
REST interface
Login/Logout with roles via database
What I've done so far:
Set up Spring MVC 3.2.1 with Spring Security 3.1.1
Using a custom UserDetailsService (contains just static data for testing atm)
Created a few Controllers for testing (simply receiving/providing data)
Design Problems:
We have maaaaany @OneToMany and @ManyToMany relations in our database
1.: (important)
If I'd send the whole object tree with all child objects as a response, I could probably send the whole database at once.
So I need a way to request for example 'all Articles'. But it should omit all the child objects. I've tried this yesterday and the objects I received were tons of megabytes:
@PersistenceContext
private EntityManager em;
@RequestMapping(method=RequestMethod.GET)
public @ResponseBody List<Article> index() {
List<Article> a = em.createQuery("SELECT a FROM Article a", Article.class).getResultList();
return a;
}
2.: (important)
If the client receives an Article, at the moment we can simply call article.getAuthor() and JPA will do a SELECT a FROM Author a JOIN Article ar WHERE ar.author_id = ?.
With REST we could make a request to /authors/{id}. But: This way we can't use our old JPA models on the client side, because the model contains Author author and not Long author_id.
Do we have to rewrite every model or is there a simpler approach?
3.: (less important)
Authentication: Make it stateless or not? I've never worked with stateless auth so far, but Spring seems to have some kind of support for it. When I look at some sample implementations on the web I have security concerns: With every request they send username and password. This can't be the right way.
If someone knows a nice solution for that, please tell me. Else I'd just go with standard HTTP Sessions.
4.:
What's the best way to design the client side model?
public class Book {
int id;
List<Author> authors; //option1
List<Integer> authorIds; //option2
Map<Integer, Author> idAuthorMap; //option3
}
(This is a Book which has multiple authors). All three options have different pros and cons:
1. I could directly access the corresponding Author model, but if I request a Book model via REST, I maybe don't want the model now, but later. So option 2 would be better:
2. I could request a Book model directly via REST, and use the authorIds to fetch the corresponding author(s) afterwards. But now I can't simply use myBook.getAuthors().
3. This is a mixture of 1. and 2.: If I just request the Books with only the Author ids included, I could do something like: idAuthorMap.put(authorId, null).
But maybe there's a Java library that handles all the stuff for me?!
That's it for now. Thank you guys :)
The maybe solution(s):
Problem: Select only the data I need. This means more or less ignoring every @ManyToMany, @OneToMany and @ManyToOne relation.
Solution: Use @JsonIgnore and/or @JsonIgnoreProperties.
Problem: Every ignored relation should get fetched easily without modifying the data model.
Solution: Example models:
class Book {
int bId;
Author author; // has @ManyToOne
}
class Author {
int aId;
List<Book> books; // has @OneToMany
}
Now I can fetch a book via REST: GET /books/4 and the result will look like this ('cause I ignore all relations via @JsonIgnore): {"bId":4}
Then I have to create another route to receive the related author: GET /books/4/author. Will return: {"aId":6}.
Backwards: GET /authors/6/books -> [{"bId":4},{"bId":42}].
There will be a route for every @ManyToMany, @OneToMany, @ManyToOne, but nothing more. So this will not exist: GET /authors/6/books/42. The client should use GET /books/42.
First, you will want to control how the JPA layer handles your relationships. What I mean is using lazy loading vs. eager loading. This can easily be controlled via the "fetch" option on the annotation, like this:
@OneToMany(fetch=FetchType.LAZY)
What this tells JPA is that, for this related object, it should only load it when some code requests it. Behind the scenes, a dynamic "proxy" object is created. When you try to access this proxy, it's smart enough to go out and run another SQL query to gather the needed data. In the case of a Collection, it's even smart enough to grab the underlying objects in batches as you iterate over the items in the Collection. But be warned: access to these proxies has to happen within the same general Session. Otherwise the underlying ORM framework (I don't know how EclipseLink works... I am a Hibernate user) will not know how to associate the sub-requests with the proper domain object. This has a bigger effect when you use transportation frameworks like Flex BlazeDS, which tries to marshal objects using bytecode instead of the interface, and usually gets tripped up when it sees these proxy objects.
You may also want to set your cascade policy, which can be done via the "cascade" option like
@OneToMany(cascade=CascadeType.ALL)
Or you can give it a list like:
@OneToMany(cascade={CascadeType.MERGE, CascadeType.REMOVE})
Once you control what is getting pulled from your database, you need to look at how you are marshalling your domain objects. Are you sending this via JSON, XML, or a mixture depending on the request? What frameworks are you using (Jackson, FlexJSON, XStream, something else)? The problem is that, even if you set the fetch type to lazy, these frameworks will still go after the related objects, thus negating all the work you did telling JPA to load lazily. This is where things get more specific to the marshalling/serializing scheme: you will need to figure out how to tell your framework what to marshal and what not to marshal. Again, this will be highly dependent on whatever framework is in use.
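A framework-independent alternative, swapped in here for illustration and not something the answer above prescribes, is to never hand entities to the serializer at all and instead copy them into flat view objects that carry ids in place of relations. A minimal sketch follows; Book and Author are plain stand-ins for the JPA entities from the question, and BookView is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/** Stand-in for the Author entity from the question (JPA annotations omitted). */
final class Author {
    final int aId;
    final List<Book> books = new ArrayList<>();

    Author(int aId) {
        this.aId = aId;
    }
}

/** Stand-in for the Book entity from the question (JPA annotations omitted). */
final class Book {
    final int bId;
    final Author author;

    Book(int bId, Author author) {
        this.bId = bId;
        this.author = author;
        author.books.add(this); // keep both sides of the relation in sync
    }
}

/** Flat view of a Book: the relation is reduced to an id, so no object graph gets serialized. */
final class BookView {
    final int bId;
    final int authorId;

    private BookView(int bId, int authorId) {
        this.bId = bId;
        this.authorId = authorId;
    }

    static BookView of(Book book) {
        return new BookView(book.bId, book.author.aId);
    }

    static List<BookView> ofAll(List<Book> books) {
        return books.stream().map(BookView::of).collect(Collectors.toList());
    }
}
```

A controller would then return BookView objects, and the client can follow authorId with a second request such as GET /authors/{id}. The cost is one extra class per resource; the benefit is that lazy proxies never reach the marshalling layer.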

JPA Equivalent of Oracle TopLink's addBatchReadAttribute

We're using JPA, and when a collection of objects returns from a query, a separate query is executed for each "child" object related through a foreign key.
For example, in our Authorization entity class, we have the following Client object mapped:
@JoinColumn(name = "clientId", referencedColumnName = "clientId")
@ManyToOne(fetch = FetchType.LAZY)
@NotNull(groups = Default.class)
private Client client;
When 10 Authorizations are returned, 10 Client queries are executed. In TopLink, I was able to bring this number to one with the ReadAllQuery class's addBatchReadAttribute() method. According to the TopLink docs, "when any of the batched parts is accessed, the parts will all be read in a single query, this allows all of the data required for the parts to be read in a single query instead of (n) queries."
This worked perfectly, giving us a single query using an IN clause with 10 ids.
What I have read about JPA pointed me toward a batch join, or something like:
hints = {@QueryHint(name = "eclipselink.batch", value = "p.client"), ...
This strategy helps reduce the number of queries, but it also gave me more joins, possibly slowing things down (?) on some queries, and it didn't seem to help as drastically as the TopLink call.
Is there a way to get the strategy that uses a single query with IN in the WHERE clause?
Thanks in advance.
Dave
Internally, the query hint "eclipselink.batch" is translated to addBatchReadAttribute(), so the behaviour you see should be identical. Does the JPQL you have created produce the same query as the native TopLink API? Is it possible that you have fetches or additional joins in the JPQL?
EclipseLink supports several types of batch fetching.
See,
http://java-persistence-performance.blogspot.com/2010/08/batch-fetching-optimizing-object-graph.html
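To my knowledge, the IN-based variant is selected in EclipseLink with the additional query hint "eclipselink.batch.type" set to IN. What that strategy does can be illustrated in plain Java: collect the distinct foreign keys of all returned children, then resolve them with one bulk lookup. This is only a sketch of the idea, not EclipseLink's implementation; BatchLoader and loadByIds are hypothetical names, and loadByIds stands in for a single "WHERE clientId IN (...)" query.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper illustrating IN-style batch reading of a many-to-one relation. */
final class BatchLoader {
    private BatchLoader() {
    }

    /**
     * Resolves the parents of all children with one bulk lookup instead of one lookup per child.
     *
     * @param children   the objects returned by the main query (e.g. Authorizations)
     * @param foreignKey extracts the parent id from a child (e.g. its clientId)
     * @param loadByIds  loads all parents for a set of ids in a single call
     */
    static <C, K, P> Map<K, P> loadParents(Collection<C> children,
                                           Function<C, K> foreignKey,
                                           Function<Set<K>, Map<K, P>> loadByIds) {
        Set<K> ids = new LinkedHashSet<>(); // distinct ids, in first-seen order
        for (C child : children) {
            K id = foreignKey.apply(child);
            if (id != null) {
                ids.add(id);
            }
        }
        if (ids.isEmpty()) {
            return Collections.emptyMap(); // nothing to load, so no query is issued
        }
        return loadByIds.apply(ids);
    }
}
```

For 10 Authorizations this yields one parent lookup with at most 10 ids, regardless of how many of them share a Client.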