Trying to Write JPQL For Complex Postgres Query - postgresql

I'm working on an Admin UI for an authorization server. One of the features is to display a list of who has logged in, and we are doing that by querying a database table where the currently issued refresh tokens are stored. A user can log in from multiple devices to the same application, generating multiple tokens. At the moment the requirements are NOT to break down this view by device; instead, if a user has logged in at all, they are to be shown in the list. If we revoke access (one of the other requirements from this UI), then all devices will have their refresh tokens revoked.
Anyway, the main thing tripping me up is the query. I'm writing the query to pull back all the tokens for the specified user, but for each client only the most recent one is retrieved, i.e., if there are 5 tokens for a given user/client combination, only the one with the most recent timestamp will be returned to the UI. I'm trying to do this entirely with JPQL in my Spring Boot/Hibernate backend, which is communicating with a Postgres database.
I can write this in SQL several different ways. Here are two forms of the query that return the same results:
select r1.*
from dev.refresh_tokens r1
join (
    select r2.client_id, max(r2.timestamp) as timestamp
    from dev.refresh_tokens r2
    group by r2.client_id
) r3 on r1.client_id = r3.client_id and r1.timestamp = r3.timestamp
where r1.user_id = 1;
select r1.*
from dev.refresh_tokens r1
where r1.user_id = 1
and (r1.client_id, r1.timestamp) in (
    select r2.client_id, max(r2.timestamp) as timestamp
    from dev.refresh_tokens r2
    group by r2.client_id
);
The reason I've figured out multiple ways to do the query is because I'm trying to also figure out how to translate it into JPQL. I avoid doing native queries in Hibernate as much as possible, instead relying on the DB-agnostic JPQL syntax. However, I just can't figure out how to translate this to JPQL.
I know native queries and/or putting filter logic into my Java code are both options. However, I'm hoping this is possible with a standard JPQL query.
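For comparison, the "filter logic in Java" fallback mentioned above could look roughly like the sketch below. It is only an illustration: TokenRow and LatestTokenFilter are hypothetical names, and the record stands in for whatever the real entity exposes (user id, client id, timestamp).

```java
import java.time.Instant;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

// Hypothetical stand-in for one row of the refresh_tokens table.
record TokenRow(long id, long userId, String clientId, Instant timestamp) {}

final class LatestTokenFilter {

    // Keeps, for the given user, only the newest token of each client.
    static Collection<TokenRow> latestPerClient(List<TokenRow> rows, long userId) {
        Map<String, TokenRow> newestByClient = rows.stream()
                .filter(r -> r.userId() == userId)
                .collect(Collectors.toMap(
                        TokenRow::clientId,
                        r -> r,
                        // On a key collision keep the row with the later timestamp.
                        BinaryOperator.maxBy(Comparator.comparing(TokenRow::timestamp))));
        return newestByClient.values();
    }

    public static void main(String[] args) {
        List<TokenRow> rows = List.of(
                new TokenRow(1, 1, "web", Instant.parse("2020-01-01T10:00:00Z")),
                new TokenRow(2, 1, "web", Instant.parse("2020-01-02T10:00:00Z")),
                new TokenRow(3, 1, "mobile", Instant.parse("2020-01-01T09:00:00Z")),
                new TokenRow(4, 2, "web", Instant.parse("2020-01-03T10:00:00Z")));
        latestPerClient(rows, 1).forEach(System.out::println);
    }
}
```

This loads every token for the user into memory first, so it is only reasonable while the number of tokens per user stays small.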

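You can use a correlated subquery. It has to be correlated on the client as well as the user, otherwise only the user's single most recent token comes back. This assumes RefreshToken maps user and client as associations; adjust the paths to your mapping:
select r1
from RefreshToken r1
where r1.user.id = 1
and r1.timestamp = (
    select max(r2.timestamp)
    from RefreshToken r2
    where r2.user.id = r1.user.id
    and r2.client.id = r1.client.id
)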
Depending on your exact use case, I think Blaze-Persistence Entity Views could come in handy here.
I created the library to allow easy mapping between JPA models and custom interface or abstract class defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure (domain model) the way you like and map attributes (getters) via JPQL expressions to the entity model.
A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:
@EntityView(User.class)
public interface UserDto {
    @IdMapping
    Long getId();
    String getName();
    @Limit(limit = "1", order = "timestamp DESC")
    @Mapping("tokens")
    RefreshTokenDto getLatestToken();
    @EntityView(RefreshToken.class)
    interface RefreshTokenDto {
        @IdMapping
        Long getId();
        String getToken();
    }
}
Querying is a matter of applying the entity view to a query, the simplest being just a query by id.
UserDto a = entityViewManager.find(entityManager, UserDto.class, id);
The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features
It will use a lateral join query behind the scenes which is the most efficient on PostgreSQL:
select u1.id, u1.name, r1.id, r1.token
from dev.user u1
left join lateral (
    select *
    from dev.refresh_tokens r
    where r.user_id = u1.id
    order by r.timestamp desc
    limit 1
) r1 on true
where u1.id = 1

Related

Looking for the best option to limit JPA query volume

I have an entity model where I need to dig down two collections to get the data required on the screen. In Hibernate, my JPA join fetch query first threw an exception until I changed the collections to sets. It is still a cartesian product, though. I am now trying to write a JpaRepository method that executes the following query, but only with specific fields, because the resulting query returns more than 100 fields, of which I need fewer than 10.
SELECT p FROM Person p
JOIN FETCH p.responsibilities r
JOIN FETCH r.configurations c
JOIN FETCH c.type
WHERE p.id = :id
My JPA repository method is already using projection interfaces, but I think that because I am using the @Query annotation it does not optimize the query. When I tried it as a plain, typed findById() method with projection interfaces, it executed a load of subqueries, which cost much more time.
Any recommendations?

Join database query Vs Handling Joins in API

I am developing an API and I am unsure what the most efficient way to handle a join query is.
I want to join the data of two tables and return the response. Either I can query the database with a join query, fetch the result, and return the response, OR I can fire two separate queries, handle the join in the API on the fly, and return the response. Which is the efficient and correct way?
Databases are generally much faster at joining than application code that queries the tables separately and joins class instances. Always do joins in the database and map the results in code. Also consider lazy loading where possible, because in a situation like the one below:
@Entity
@Table(name = "USER")
public class UserLazy implements Serializable {
    @Id
    @GeneratedValue
    @Column(name = "USER_ID")
    private Long userId;
    // Loaded only when first accessed, not with the initial query
    @OneToMany(fetch = FetchType.LAZY, mappedBy = "user")
    private Set<OrderDetail> orderDetail = new HashSet<>();
    // standard setters and getters
    // also override equals and hashCode
}
you might not want order details when you want the initial results.
Usually it's more efficient to do the join in the database, but there are some corner cases, mostly due to the fact that application CPU time is cheaper than database CPU time. Here are a few examples that come to mind, with a query like "table A join table B":
B is a small table that rarely changes.
In this case it can be profitable to cache the contents of this table in the application, and not query it at all.
Rows in A are quite large, and many rows of B are selected for each row of A.
This will cause useless network traffic and load as rows from A are duplicated many times in each result row.
Rows in B are quite large, and there are few distinct b_id's in A.
Same as above, except this time the same few rows from B are duplicated in the result set.
In the previous two examples, it could be useful to perform the query on table A, then gather a set of unique b_id's from the result, and SELECT FROM b WHERE b_id IN (list).
Data structure and ORMs
If each table contains a different object type, and they have a "belongs to" relationship (like category and product) and you use an ORM which will instantiate objects for each selected row, then perhaps you only want one instance of each category, and not one per selected product. In this case, you could select the products, gather a list of unique category_ids, and select the categories from there. The ORM may even do that for you behind the scene.
Complicated aggregates
Sometimes you want some rows plus aggregates over other rows related to them, but the query just won't fit in a neat GROUP BY, or you may need several of them.
So basically, the join usually works better in the database, so that should be the default. If you do it in the application, then you should know why you're doing it and decide it's a good reason. If it is, then fine. I gave a few reasons based on performance, data model, and SQL constraints; these are only examples, of course.
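As a rough illustration of the application-side join described above (query A, gather the unique b_id's, then fetch B with a single IN list), here is a minimal in-memory sketch. All names are hypothetical: ProductRow plays table A, CategoryRow plays table B, and categoryLoader stands in for the real "SELECT ... FROM b WHERE b_id IN (list)" call.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical row types: each ProductRow points at one CategoryRow through categoryId.
record ProductRow(long id, String name, long categoryId) {}
record CategoryRow(long id, String label) {}
record ProductWithCategory(ProductRow product, CategoryRow category) {}

final class AppSideJoin {

    // categoryLoader receives the distinct ids and returns the matching rows keyed by id.
    static List<ProductWithCategory> join(List<ProductRow> products,
            Function<Set<Long>, Map<Long, CategoryRow>> categoryLoader) {
        // Step 1: collect each referenced id once, keeping first-seen order.
        Set<Long> ids = new LinkedHashSet<>();
        for (ProductRow p : products) {
            ids.add(p.categoryId());
        }
        // Step 2: one lookup for all ids instead of one per product.
        Map<Long, CategoryRow> byId = categoryLoader.apply(ids);
        // Step 3: stitch the rows together; products of one category share one instance.
        List<ProductWithCategory> joined = new ArrayList<>();
        for (ProductRow p : products) {
            joined.add(new ProductWithCategory(p, byId.get(p.categoryId())));
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<Long, CategoryRow> table = Map.of(10L, new CategoryRow(10, "Books"));
        List<ProductRow> products = List.of(
                new ProductRow(1, "Novel", 10), new ProductRow(2, "Atlas", 10));
        // The loader here simply returns the whole fake table.
        join(products, ids -> table).forEach(System.out::println);
    }
}
```

Each large B row crosses the network once rather than once per matching A row, which is the saving the two "large rows" cases are after.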

Spring JPA Specification API with custom Query and custom response Object. Is this possible?

I have researched this for a few days but can't seem to find the right information.
Here is what I need: I have a database with multiple tables, and I need to join a few of them together to make a sort of "search" API. I have to implement the ability to dynamically search fields (from various tables in the query), with sorting and pagination.
I have found that I cannot combine the @Query annotation with the Specification API. I looked into using the Specification API to do the joins I needed, but the problem is that the root must be one table/repository.
For example:
If I have a users table that has to join on addresses, phone_numbers, and preferences, the base repository will be UserRepository and it will return the User entity model, but I need it to return a custom DTO, AccountUserDTO, which contains fields from the User, Address, PhoneNumber, and Preference entities.
Would anyone know if this is possible at all?
I am at my wits' end here and I really want to build this the correct way.
Cheers!
You can do it this way:
Build the HQL query as a string. Depending on which filter conditions are requested, append the corresponding clauses, e.g.:
if (hasParam(searchName)) {
    // assumes queryString already ends with "where" or "and"
    queryString = queryString + " myEntity.name = :queryName";
}
Query query = session.createQuery(queryString);
Then bind the parameters under the same conditions:
if (hasParam(searchName)) {
    query.setParameter("queryName", searchName);
}
...
and execute it.
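A minimal sketch of that idea, keeping the query text and its parameters together so the two "if" blocks cannot drift apart. Everything here is hypothetical (MyEntity, its name and city attributes, SearchQueryBuilder); it only builds the string and the parameter map and does not touch Hibernate.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Result of the build step: the query text plus the named parameters to bind.
record DynamicQuery(String text, Map<String, Object> parameters) {}

final class SearchQueryBuilder {

    // MyEntity and its name/city attributes are made up for the example.
    static DynamicQuery build(String searchName, String searchCity) {
        List<String> conditions = new ArrayList<>();
        Map<String, Object> parameters = new LinkedHashMap<>();
        if (hasParam(searchName)) {
            conditions.add("e.name = :queryName");
            parameters.put("queryName", searchName);
        }
        if (hasParam(searchCity)) {
            conditions.add("e.city = :queryCity");
            parameters.put("queryCity", searchCity);
        }
        String text = "select e from MyEntity e";
        if (!conditions.isEmpty()) {
            // Joining the list avoids dangling "where"/"and" keywords.
            text += " where " + String.join(" and ", conditions);
        }
        return new DynamicQuery(text, parameters);
    }

    static boolean hasParam(String value) {
        return value != null && !value.isBlank();
    }

    public static void main(String[] args) {
        DynamicQuery q = build("Smith", null);
        System.out.println(q.text());
        System.out.println(q.parameters());
    }
}
```

The text would then go to session.createQuery(...) and each map entry to query.setParameter(name, value). Only fixed fragments are concatenated; user input always travels as a bound parameter.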
To create a customized object, the easiest way is to treat each result row as an array of fields:
Query query = session.createQuery("select m.f1, m.f2, m.f3 from myTable m");
List managers = query.list();
Object[] manager = (Object[]) managers.get(0); // first row
System.out.println(manager[0]); // f1
System.out.println(manager[1]); // f2
System.out.println(manager[2]); // f3
There are also other ways to select, such as a constructor expression:
String query = "select new mypackage.myclass(m.f1, m.f2, m.f3) from myTable m";
When you execute the above query, it returns a list of mypackage.myclass objects.
Or, to keep it simpler, create your own view in the database and map it to one entity.
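If you stay with the Object[] rows, a small mapping step turns them into typed objects. The sketch below is only an illustration: ManagerDto is a hypothetical DTO, and the column types (Long, String, String) are assumptions about f1, f2, and f3.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical DTO for the three selected fields; the types are assumed.
record ManagerDto(Long f1, String f2, String f3) {}

final class RowMapper {

    // Converts each Object[] row (f1, f2, f3 in select order) into a typed DTO.
    static List<ManagerDto> toDtos(List<Object[]> rows) {
        return rows.stream()
                .map(row -> new ManagerDto((Long) row[0], (String) row[1], (String) row[2]))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Object[]> rows = List.of(
                new Object[] {1L, "Ann", "Sales"},
                new Object[] {2L, "Bob", "Support"});
        toDtos(rows).forEach(System.out::println);
    }
}
```

The casts fail at runtime if the select list changes order, which is one reason the constructor expression is often the more robust choice.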

Do canonical LINQ queries guard against N+1

With lazy loading used by default, I know that you should call .Include() on your Entity Framework entities to pull in the associated entities you want in your queries, reducing the number of calls to the db when you call LINQ methods on your entities. If you don't, you run the risk of repeated database calls for each row (the N+1 problem).
Can someone confirm that if I write a canonical LINQ query, with the joins defined explicitly, that we guard against N+1?
from x in _context.tblOrder
join y in _context.tblCustomer on x.customerId equals y.id
select x
Is there any way N+1 could creep in when we're loading in all the required entities with joins?
EDIT
As background, someone asked how junior developers could guard against N+1. I mentioned that the simplest way would be to write out your queries and define your joins. I want confirmation that what I indicated was 100% accurate.
If what you are really asking is
Will this query hit the database once?
Then the answer is yes. LINQ to EF translates your expression to raw SQL and sends it to the database only when you evaluate the query (e.g. ToList(), foreach, for). Once that query has been sent, nothing else is sent unless you explicitly tell it otherwise.
Your LINQ statement could be simplified using method syntax with Include, e.g.
_context.tblOrder.Include("Customer").ToList();
This would give you all the order details, including all related customer details, in a single database trip.
Just because you specify tables in a join doesn't mean that you can't run into an N+1 issue when you iterate over the values. Consider the following extension to your query:
var query = from o in Orders
            join c in Customers on o.CustomerID equals c.CustomerID
            select o;
foreach (var o in query)
{
    Console.WriteLine(String.Format("{0}: {1}", o.OrderDate, o.Employee.FirstName));
}
In this case, each time you navigate through the order's Employee object, the employee is fetched from the database for that order. If you wanted to avoid the issue, you could project the values you want in the select clause:
var query = from o in Orders
            join c in Customers on o.CustomerID equals c.CustomerID
            select new {o.OrderDate, o.Employee.FirstName};
foreach (var o in query)
{
    Console.WriteLine(String.Format("{0}: {1}", o.OrderDate, o.FirstName));
}
Note, in this case, you don't even need the join as you can just use the navigation properties instead. Of course, if you don't allow navigation properties in your entities and rely only on joins, you can avoid the n+1 situation, but that is not a very OOP way of solving the problem.
I think you would be safe guaranteeing against n+1 if you only return anonymous types from your queries, but that would be rather restrictive as well.
The best option is to make sure to profile your application's generated SQL and know precisely when and why you are hitting the database. I discuss some of the profilers available at http://www.thinqlinq.com/Post.aspx/Title/LINQ-to-Database-Performance-hints.
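To make the query counting concrete, here is a deliberately naive, language-neutral model written in Java rather than C#. It is not Entity Framework: CountingDb is a hypothetical in-memory stand-in that just counts round trips, with no caching or change tracking, so it only illustrates the shape of the problem (1 + N calls versus a fixed number).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical in-memory "database" that counts how many calls it receives.
final class CountingDb {
    record Order(int id, int employeeId) {}

    private final Map<Integer, String> employeeNames = Map.of(1, "Ann", 2, "Bob");
    private final List<Order> orders =
            List.of(new Order(10, 1), new Order(11, 2), new Order(12, 1));
    int queries = 0;

    List<Order> loadOrders() {
        queries++;
        return orders;
    }

    String loadEmployeeName(int employeeId) {
        queries++;
        return employeeNames.get(employeeId);
    }

    Map<Integer, String> loadEmployeeNames(Set<Integer> ids) {
        queries++;
        Map<Integer, String> result = new HashMap<>();
        for (Integer id : ids) {
            result.put(id, employeeNames.get(id));
        }
        return result;
    }
}

final class NPlusOneDemo {

    // Lazy navigation: one call for the orders, then one more per order.
    static int lazyStyle(CountingDb db) {
        for (CountingDb.Order o : db.loadOrders()) {
            db.loadEmployeeName(o.employeeId());
        }
        return db.queries;
    }

    // Up-front loading: one call for the orders, one for all needed employees.
    static int batchedStyle(CountingDb db) {
        List<CountingDb.Order> orders = db.loadOrders();
        Set<Integer> ids = orders.stream()
                .map(CountingDb.Order::employeeId)
                .collect(Collectors.toSet());
        db.loadEmployeeNames(ids);
        return db.queries;
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + lazyStyle(new CountingDb()));       // lazy: 4
        System.out.println("batched: " + batchedStyle(new CountingDb())); // batched: 2
    }
}
```

With three orders the lazy loop makes 4 calls and the batched version 2; the gap grows linearly with the number of rows, which is what a profiler makes visible.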

jpa lazy fetch entities over multiple levels with criteria api

I am using JPA2 with its Criteria API to select my entities from the database. The implementation is OpenJPA on WebSphere Application Server. All my entities are modeled with FetchType.LAZY.
I select an entity with some criteria from the database and want to load all nested data from sub-tables at once.
If I have a datamodel where table A is joined oneToMany to table B, I can use a Fetch-clause in my criteria query:
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<A> cq = cb.createQuery(A.class);
Root<A> root = cq.from(A.class);
Fetch<A,B> fetch = root.fetch(A_.elementsOfB, JoinType.LEFT);
This works fine. I get an element A and all of its elements of B are filled correctly.
Now table B has a oneToMany-relationship to table C and I want to load them too. So I add the following statement to my query:
Fetch<B,C> fetch2 = fetch.fetch(B_.elementsOfC, JoinType.LEFT);
But this won't do anything.
Does anybody know how to fetch multi-level entities in one query?
It does not work with JPQL, and there is no way to make it work in CriteriaQueries either. The specification limits fetched entities to the ones that are referenced directly from the returned entity.
About fetch join with CriteriaQuery:
An association or attribute referenced by the fetch method must be
referenced from an entity or embeddable that is returned as the result
of the query.
About fetch join in JPQL:
The association referenced by the right side of the FETCH JOIN clause
must be an association or element collection that is referenced from
an entity or embeddable that is returned as a result of the query.
The same limitation is also stated in the OpenJPA documentation.
For what it's worth, I do this all the time and it works just fine.
Several points:
I'm using JPA 2.1, but I'm almost sure it used to work in JPA 2.0 as well.
I'm using the Criteria API, and I know some things work differently in JPQL. So don't assume it works or doesn't work a certain way just because that's what happens in JPQL. Most often they behave the same way, but not always.
(Also, I'm using the plain Criteria API, no Querydsl or anything. Sometimes it makes a difference.)
My associations tend to be SINGULAR_ATTRIBUTE, so maybe that's the problem here. Try a test with the joins in reverse, "c.fetch(b).fetch(a)", and see if that works. I know it's not the same, but it might give you a hint. I'm almost sure I have done it with one-to-many left fetch joins too, though.
Yep. I just checked and found it: root.fetch("targets", LEFT).fetch("destinations", LEFT).fetch("internal", LEFT)
This has been working without problems for months, maybe more than a year.
I just ran a test and it generates this query:
select -- all fields from all tables
...
from agreement a
left outer join target t on a.id = t.agreement_id
left outer join destination d on t.id = d.target_id
left outer join internal i on d.id = i.destination_id
And it returns all rows with all associations and all fields.
Maybe the problem is something different. You just say "it won't do anything". I don't know whether it throws an exception or not, but maybe it executes the query properly and doesn't return the rows you expect because of some conditions or something like that.
You could design a view in the DB joining tables b and c, create an entity for it, and fetch it instead of the original entity.