Get max number of elements in a collection with jpql - jpa

I have the following two entities (1:N):
@Entity
public class Job {
    @ManyToOne
    private User user;
}
and
@Entity
public class User {
    @OneToMany
    private Collection<Job> jobs;
}
Now I want to write a named query in JPQL that returns the user(s) with the most jobs.
With the following query on the Job entity I can retrieve the number of jobs for each user, but I still have to compare each count with the highest count among all users:
@NamedQuery(query="SELECT j.user, COUNT(j) FROM Job j GROUP BY j.user")
My second idea is to write the named query on the User entity:
@NamedQuery(query="SELECT u.username FROM User u WHERE SIZE(u.jobs) = MAX ??????")
Here, too, I don't know how to get the maximum number of assigned jobs.
Can somebody help me out?

To list the users ordered by their number of jobs, you can use the following query (HQL syntax):
from User order by jobs.size desc
With Hibernate and HSQLDB, this generates the following SQL:
select
user0_.id as id0_
from
User user0_
order by
(select
count(jobs1_.user_id)
from
Job jobs1_
where
user0_.id=jobs1_.user_id) desc
To find the user with the most jobs, limit the result to the first row with TypedQuery.setMaxResults(1). On HSQLDB this generates the following SQL:
select
user0_.id as id0_
from
User user0_
order by
(select
count(jobs1_.user_id)
from
Job jobs1_
where
user0_.id=jobs1_.user_id) desc limit ?
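The shape of the generated SQL can be tried out without a JPA provider. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for HSQLDB; the table names (`users`, `jobs`), the `username` column, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for HSQLDB; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO jobs (user_id) VALUES (1), (2), (2), (2), (3), (3);
""")

# Same shape as the generated SQL: order by a correlated COUNT subquery,
# then keep only the first row (the effect of setMaxResults(1)).
top_user = conn.execute("""
    SELECT u.username
    FROM users u
    ORDER BY (SELECT COUNT(j.user_id) FROM jobs j WHERE j.user_id = u.id) DESC
    LIMIT ?
""", (1,)).fetchall()

print(top_user)
```

Note that a row limit returns a single user even when several users are tied for the highest count, so it does not cover the "User(s)" case from the question.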

I found that a query similar to the following gave me the list of Users in descending order of the number of jobs they have:
SELECT u FROM User u LEFT JOIN FETCH u.jobs
ORDER BY u.jobs.size DESC
If you take only the first result, you get a User with the maximum number of jobs. Is that what you wanted?
Alternatively, to get ALL the Users that have the maximum number of jobs, you could use:
SELECT u FROM User u LEFT JOIN FETCH u.jobs
WHERE u.jobs.size = (SELECT max(jobs.size) FROM User)
You probably don't need the LEFT JOIN FETCH part.
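The "all users with the maximum" variant can be checked in plain SQL. This is a minimal sketch, assuming an in-memory SQLite database with hypothetical `users` and `jobs` tables and made-up rows; it is the relational equivalent of comparing each user's job count with the highest count, not the SQL that any particular JPA provider generates.

```python
import sqlite3

# Hypothetical schema and data; alice and bob are tied with two jobs each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO jobs (user_id) VALUES (1), (1), (2), (2), (3);
""")

# Keep every user whose job count equals the maximum per-user job count.
users_with_most_jobs = conn.execute("""
    SELECT u.username
    FROM users u
    JOIN jobs j ON j.user_id = u.id
    GROUP BY u.id, u.username
    HAVING COUNT(*) = (
        SELECT MAX(per_user.cnt)
        FROM (SELECT COUNT(*) AS cnt FROM jobs GROUP BY user_id) AS per_user
    )
    ORDER BY u.username
""").fetchall()

print(users_with_most_jobs)
```

Unlike a row limit, this returns every tied user.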


PostgreSQL: How to check if a list is contained in another list?

I'm working with PostgreSQL 13.
I have two tables like this:
permission_table
name    permission
Ann     Read Invoice
Ann     Write Invoice
Ann     Execute Payments
Bob     Read Staff data
Bob     Modify Staff data
Bob     Execute Payroll
Carl    Read Invoice
Carl    Write Invoice
risk_table
risk_id    permission
Risk1      Read Invoice
Risk1      Write Invoice
Risk1      Execute Payments
Risk2      Read Staff data
Risk2      Modify Staff data
Risk2      Execute Payroll
I'd like to create a new table containing the names of the employees from the first table who hold all the permissions listed for a risk in the second table. After the execution, the results should look like this:
name    risk_id
Ann     Risk1
Bob     Risk2
Since Carl only has two of the three permissions belonging to Risk1, he will not be included in the results.
My first brute force approach was to compare the list of permissions belonging to a risk to the permissions belonging to an employee. If the first list is included in the second one, then that combination of employee/risk will be added to the results table.
INSERT INTO results_table
SELECT a.employee, b.risk_id FROM permission_table a, risk_table b WHERE
((SELECT permission FROM risk_table c WHERE b.permission = c.permission ) EXCEPT
(SELECT permission FROM permission_table d WHERE a.employee=d.employee)
) IS NULL;
I can't tell whether this approach gives correct results, because on big tables it takes a very long time, even when I add a WHERE clause limiting the query to a single employee.
Could you please help?
One way to approach this is to:
- compute the number of permissions for each "risk_id" value,
- join the permission and risk tables on matching "permission" values, carrying the counts along,
- keep only the groups where the distinct count of matched permissions for each triplet "<name, risk_id, cnt>" equals the full number of permissions for that risk.
WITH risks_with_counts AS (
    SELECT *, COUNT(permission) OVER(PARTITION BY risk_id) AS cnt
    FROM risk_table
)
SELECT p.name, r.risk_id
FROM permission_table p
INNER JOIN risks_with_counts r
        ON p.permission = r.permission
GROUP BY p.name, r.risk_id, r.cnt
HAVING COUNT(DISTINCT r.permission) = r.cnt;
Carl won't be included in the output, as he doesn't have all the permissions of Risk1.
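The counting approach can be checked against the sample data from the question. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for PostgreSQL and the question's table names (`permission_table`, `risk_table`). It computes the per-risk counts with a GROUP BY in a CTE named `risk_counts` (a name chosen here) instead of a window function, which is a variation on the query above and not the answer's exact SQL.

```python
import sqlite3

# Sample rows copied from the question; SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE permission_table (name TEXT, permission TEXT);
    CREATE TABLE risk_table (risk_id TEXT, permission TEXT);
    INSERT INTO permission_table VALUES
        ('Ann', 'Read Invoice'), ('Ann', 'Write Invoice'), ('Ann', 'Execute Payments'),
        ('Bob', 'Read Staff data'), ('Bob', 'Modify Staff data'), ('Bob', 'Execute Payroll'),
        ('Carl', 'Read Invoice'), ('Carl', 'Write Invoice');
    INSERT INTO risk_table VALUES
        ('Risk1', 'Read Invoice'), ('Risk1', 'Write Invoice'), ('Risk1', 'Execute Payments'),
        ('Risk2', 'Read Staff data'), ('Risk2', 'Modify Staff data'), ('Risk2', 'Execute Payroll');
""")

matches = conn.execute("""
    WITH risk_counts AS (
        -- number of permissions that make up each risk
        SELECT risk_id, COUNT(DISTINCT permission) AS cnt
        FROM risk_table
        GROUP BY risk_id
    )
    SELECT p.name, r.risk_id
    FROM permission_table p
    JOIN risk_table r ON p.permission = r.permission
    JOIN risk_counts c ON c.risk_id = r.risk_id
    GROUP BY p.name, r.risk_id, c.cnt
    -- keep a pair only if the employee matched every permission of the risk
    HAVING COUNT(DISTINCT r.permission) = c.cnt
    ORDER BY p.name, r.risk_id
""").fetchall()

print(matches)
```

Carl matches two of the three Risk1 permissions, so his group fails the HAVING condition and is dropped.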

Rooms per user in matrix synapse database

How can I get the total number of Matrix rooms a user is currently joined to, using the Synapse Postgres database (excluding rooms the user has left, been kicked from, or been banned from)?
I spent several hours looking for this, so I think it may help others.
You can get the number of rooms a user is currently joined to by querying the table user_stats_current:
SELECT joined_rooms FROM user_stats_current WHERE user_id='@myuser:matrix.example.com';
If you specifically want the IDs of the rooms the user is currently joined to, you can use the table current_state_events, as in this query:
SELECT room_id FROM current_state_events
WHERE state_key = '@myuser:matrix.example.com'
AND type = 'm.room.member'
AND membership = 'join';
Going further, if you want the room name as well as the room ID, you can join the table room_stats_state, as in this query:
SELECT e.room_id, r.name
FROM current_state_events e
JOIN room_stats_state r USING (room_id)
WHERE e.state_key = '@myuser:matrix.example.com'
AND e.type = 'm.room.member'
AND e.membership = 'join';
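The membership filter can be tried out on mock data. This is a minimal sketch, assuming an in-memory SQLite table that mimics only the four `current_state_events` columns used in the queries above; the real Synapse table has more columns, and the room IDs and rows below are made up.

```python
import sqlite3

# Mock table with only the columns the queries above rely on.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE current_state_events (
        room_id TEXT, type TEXT, state_key TEXT, membership TEXT
    );
    INSERT INTO current_state_events VALUES
        ('!a:matrix.example.com', 'm.room.member', '@myuser:matrix.example.com', 'join'),
        ('!b:matrix.example.com', 'm.room.member', '@myuser:matrix.example.com', 'leave'),
        ('!c:matrix.example.com', 'm.room.member', '@myuser:matrix.example.com', 'ban'),
        ('!d:matrix.example.com', 'm.room.member', '@myuser:matrix.example.com', 'join'),
        ('!a:matrix.example.com', 'm.room.member', '@other:matrix.example.com', 'join'),
        ('!a:matrix.example.com', 'm.room.name', '', NULL);
""")

user_id = "@myuser:matrix.example.com"

# Bind the user ID as a parameter instead of pasting it into the SQL string.
joined_room_ids = [row[0] for row in conn.execute("""
    SELECT room_id FROM current_state_events
    WHERE state_key = ? AND type = 'm.room.member' AND membership = 'join'
    ORDER BY room_id
""", (user_id,))]
joined_count = len(joined_room_ids)

print(joined_count, joined_room_ids)
```

Rooms the user has left or been banned from carry a different membership value, so they are excluded from both the list and the count.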

Retrieving inactive employees

I have the following query using the Invantive Query Tool connecting to NMBRS.
select e.number
,      es.EmployeeId
,      e.displayname
,      es.ParttimePercentage
,      es.startdate
from   Nmbrs.Employees.EmployeeSchedules es
left outer join Nmbrs.Employees.Employees e
on     es.EmployeeId = e.id
order by e.displayname
,      es.startdate
(I want to retrieve all changes in part-time percentage/schedule.)
However, Nmbrs.Employees.Employees only shows active employees. I need that table because the employee ID in Nmbrs.Employees.EmployeeSchedules is not the employee number shown in the UI; it is an internal ID.
I did notice Nmbrs.Employees.Employees has an additional where clause (as per documentation):
Additional Where Clause:
- CompanyId
- active
The following query
select * from Nmbrs.Employees.Employees where active = 1
gives an error:
Unknown identifier 'active'.
Consider one of the following: Nmbrs.Employees.Employees.PartitionID, Nmbrs.Employees.Employees.Id, Nmbrs.Employees.Employees.Number, Nmbrs.Employees.Employees.DisplayName, Employees.Employees.PartitionID, Employees.PartitionID, PartitionID, Employees.Employees.Id.
Active isn't mentioned so I don't know if that is usable.
active is a server-side filter on Nmbrs.nl. It defaults to the value "active". Don't ask me why they chose to have an API reflect the user interface; it is weird, but it is the way it is.
To retrieve all employees from one or more companies (partitions), use:
use all
select * from employeesall
or
select * from employeesinactive
These tables are recent additions to the supported Nmbrs.nl API tables.
Note that the output does NOT indicate whether an employee is active. When you need that too, please use a view or:
select 'active' type
, t.*
from nmbrs..employeesactive t
union all
select 'inactive' type
, t.*
from nmbrs..employeesinactive t
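To connect this back to the original schedule query, the labelled union can serve as the employee side of the join. The following is a minimal sketch of that join logic only, assuming an in-memory SQLite database with mock tables; the column sets and rows are hypothetical, the CTE name `all_employees` is chosen here, and the SQL is SQLite syntax, not Invantive SQL.

```python
import sqlite3

# Mock tables; real Nmbrs tables have more columns than shown here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employeesactive (id INTEGER, number INTEGER, displayname TEXT);
    CREATE TABLE employeesinactive (id INTEGER, number INTEGER, displayname TEXT);
    CREATE TABLE employeeschedules (employeeid INTEGER, parttimepercentage REAL, startdate TEXT);
    INSERT INTO employeesactive VALUES (101, 1, 'Alice');
    INSERT INTO employeesinactive VALUES (102, 2, 'Bob');
    INSERT INTO employeeschedules VALUES
        (101, 100.0, '2020-01-01'), (101, 80.0, '2021-06-01'), (102, 50.0, '2019-03-01');
""")

rows = conn.execute("""
    WITH all_employees AS (
        -- label each row so the active/inactive status survives the union
        SELECT 'active' AS type, t.* FROM employeesactive t
        UNION ALL
        SELECT 'inactive' AS type, t.* FROM employeesinactive t
    )
    SELECT e.number, e.displayname, e.type, es.parttimepercentage, es.startdate
    FROM employeeschedules es
    LEFT OUTER JOIN all_employees e ON es.employeeid = e.id
    ORDER BY e.displayname, es.startdate
""").fetchall()

for row in rows:
    print(row)
```

With the union on the employee side, schedules of inactive employees resolve to an employee number as well, instead of joining to nothing.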

OrientDB query for nodes connected to origin by multiple ways

For example, I have an employee who manages particular countries and particular companies. I want to query only the accounts that are in countries AND companies managed by the given employee. Any ideas? Are there performance issues to be aware of?
A Gremlin query is also acceptable!
This seems to work:
select from Account where
@rId in
(select expand(out('managingCountry').in('inCountry')).@rId
from Employee where userId = 3)
AND
@rId in
(select expand(out('managingCompany').in('inCompany')).@rId
from Employee where userId = 3)
It remains open whether someone has a better solution.
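The logic of the query is an intersection of two traversals: accounts reachable through a managed country, and accounts reachable through a managed company. Below is a minimal sketch of that idea in plain Python, using hypothetical in-memory data in place of the graph edges; it makes no claim about how OrientDB executes the query.

```python
# Hypothetical data standing in for the managingCountry / managingCompany
# edges and the inCountry / inCompany edges of each account.
managing_country = {3: {"NL", "BE"}}          # employee userId -> countries
managing_company = {3: {"Acme", "Globex"}}    # employee userId -> companies
accounts = [
    {"id": "acc1", "country": "NL", "company": "Acme"},
    {"id": "acc2", "country": "NL", "company": "Initech"},
    {"id": "acc3", "country": "DE", "company": "Acme"},
    {"id": "acc4", "country": "BE", "company": "Globex"},
]


def accounts_for(user_id):
    """Return IDs of accounts in both a managed country and a managed company."""
    countries = managing_country.get(user_id, set())
    companies = managing_company.get(user_id, set())
    in_country = {a["id"] for a in accounts if a["country"] in countries}
    in_company = {a["id"] for a in accounts if a["company"] in companies}
    # The AND of the two "in (subquery)" conditions is a set intersection.
    return sorted(in_country & in_company)


result = accounts_for(3)
print(result)
```

Here acc2 is in a managed country but not a managed company, and acc3 is the reverse, so neither appears in the result.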

How to count push events on GitHub using BigQuery?

I'm trying to use the public GitHub dataset on BigQuery to count events - PushEvents, in this case - on a per repository basis over time.
SELECT COUNT(*)
FROM [githubarchive:github.timeline]
WHERE type = 'PushEvent'
AND repository_name = "account/repo"
GROUP BY pushed_at
ORDER BY pushed_at DESC
Basically, I just want to retrieve the count for a specified repo and event type, group the count by date, and return the list. BigQuery validates the query above, but then fails it with:
Field 'pushed_at' not found.
As far as I can tell from GitHub's PushEvent documentation, however, pushed_at is an available field. Anybody have examples of related queries that execute properly? Any suggestions as to what's being done incorrectly here?
The field is called repository_pushed_at, and you also probably meant to include it in the SELECT list, i.e.
SELECT repository_pushed_at, COUNT(*)
FROM [githubarchive:github.timeline]
WHERE type = 'PushEvent'
AND repository_name = "account/repo"
GROUP BY repository_pushed_at
ORDER BY repository_pushed_at DESC
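Grouping on the full timestamp yields one row per distinct push time. Since the question asks for counts per date, the timestamp can be truncated to a day before grouping. The following is a minimal sketch of that idea, assuming an in-memory SQLite table named `timeline` with made-up rows and ISO-formatted timestamps; the `date()` call is SQLite's function, and the equivalent in BigQuery is not shown here.

```python
import sqlite3

# Mock table with only the three columns used in the query; rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE timeline (type TEXT, repository_name TEXT, repository_pushed_at TEXT);
    INSERT INTO timeline VALUES
        ('PushEvent', 'account/repo', '2012-05-01 10:00:00'),
        ('PushEvent', 'account/repo', '2012-05-01 18:30:00'),
        ('PushEvent', 'account/repo', '2012-05-02 09:15:00'),
        ('WatchEvent', 'account/repo', '2012-05-02 09:20:00'),
        ('PushEvent', 'other/repo', '2012-05-02 11:00:00');
""")

# Truncate the timestamp to a calendar day, then count pushes per day.
pushes_per_day = conn.execute("""
    SELECT date(repository_pushed_at) AS pushed_on, COUNT(*) AS pushes
    FROM timeline
    WHERE type = 'PushEvent' AND repository_name = ?
    GROUP BY pushed_on
    ORDER BY pushed_on DESC
""", ("account/repo",)).fetchall()

print(pushes_per_day)
```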