Using array_agg with multiple DISTINCT Columns (PostgreSQL)

In this query, I'm listing all users in organization 123, but I also want a column showing which other teams they are on across all organizations.
My query currently gives me the team names, but I'd also like to get the team id. The DISTINCT is necessary because the user may have different roles on the same team.
Bonus points if I can sort the teams by when the user was given a role; the ORDER BY I tried for that (commented out below) raises an error.
SELECT
    users.*,
    (
        SELECT
            to_json(array_agg(DISTINCT teams.name ORDER BY teams.name))
        FROM roles r
        INNER JOIN user_roles ur ON ur.role_id = r.id AND ur.user_id = users.id
        INNER JOIN teams ON r.team_id = teams.id
        -- ORDER BY r.created_at
    ) teams
FROM users
INNER JOIN user_roles ON user_roles.user_id = users.id
INNER JOIN roles ON roles.id = user_roles.role_id
WHERE roles.type = 'admin' AND roles.organization_id = 123
GROUP BY users.id
This returns:
name       | teams
John Smith | ['Team 1', 'Team 2']
Jane Doe   | ['Team 2', 'Team 3']
What I'd like to return is the team name with its primary key id:
name       | teams
John Smith | {1: 'Team 1', 2: 'Team 2'}
Jane Doe   | {2: 'Team 2', 3: 'Team 3'}
EDIT
Or better yet:
name       | teams
John Smith | [{id: 1, name: 'Team 1'}, {id: 2, name: 'Team 2'}]
Jane Doe   | [{id: 2, name: 'Team 2'}, {id: 3, name: 'Team 3'}]

Assuming the rest of your query works as intended, replace this section of it:
SELECT
    to_json(array_agg(DISTINCT teams.name ORDER BY teams.name))
FROM roles r
INNER JOIN user_roles ur ON ur.role_id = r.id AND ur.user_id = users.id
INNER JOIN teams ON r.team_id = teams.id
-- ORDER BY r.created_at
with:
(
    SELECT
        json_object(
            array_agg(id::text ORDER BY created_at DESC),
            array_agg(name ORDER BY created_at DESC)
        )
    FROM (
        SELECT DISTINCT ON (teams.id)
            teams.id, teams.name, r.created_at
        FROM roles r
        INNER JOIN user_roles ur ON ur.role_id = r.id AND ur.user_id = users.id
        INNER JOIN teams ON r.team_id = teams.id
        -- DISTINCT ON expressions must come first in the ORDER BY
        ORDER BY teams.id, r.created_at
    ) tab
)

Here is how I ended up solving it, building off of @akhilesh's answer.
SELECT
    json_object(
        array_agg(id::text ORDER BY created_at DESC),
        array_agg(name ORDER BY created_at DESC)
    )
FROM (
    SELECT *
    FROM (
        SELECT
            teams.id,
            teams.name,
            MAX(ur.created_at) AS created_at
        FROM roles r
        INNER JOIN user_roles ur ON ur.role_id = r.id AND ur.user_id = users.id
        INNER JOIN teams ON r.team_id = teams.id
        GROUP BY teams.id
    ) T
    ORDER BY T.created_at DESC
) teams
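For the array-of-objects shape requested in the question's edit, here is a minimal sketch, assuming PostgreSQL 9.4 or later (where json_build_object is available) and the tables from the question. Deduplicating with GROUP BY in a derived table means the aggregate no longer needs DISTINCT, so it can be ordered by a column that is not part of its argument list, which PostgreSQL does not allow for a DISTINCT aggregate. The alias last_granted, the users.name column, and the EXISTS filter are illustrative assumptions, not part of the original query.

```sql
SELECT
    u.id,
    u.name,
    (
        -- one JSON object per team, most recently granted role first
        SELECT json_agg(
                   json_build_object('id', t.id, 'name', t.name)
                   ORDER BY t.last_granted DESC
               )
        FROM (
            -- collapse multiple roles on the same team into one row
            SELECT teams.id, teams.name, MAX(ur.created_at) AS last_granted
            FROM roles r
            INNER JOIN user_roles ur ON ur.role_id = r.id AND ur.user_id = u.id
            INNER JOIN teams ON r.team_id = teams.id
            GROUP BY teams.id, teams.name
        ) t
    ) AS teams
FROM users u
WHERE EXISTS (
    -- keep only users holding an admin role in organization 123
    SELECT 1
    FROM user_roles ur
    INNER JOIN roles r ON r.id = ur.role_id
    WHERE ur.user_id = u.id
      AND r.type = 'admin'
      AND r.organization_id = 123
);
```

Note that json_agg returns NULL rather than an empty array when a user has no teams; wrapping it in COALESCE(..., '[]'::json) is one way to handle that if it matters.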

Related

Query to return multiple MAX values with HAVING clause

I want to write a query that will return the name of students who did the most projects with the count of the project. I want the query to return a table like this:
student_name | max_project_count
John Doe     | 2
Anna Do      | 2
This is the code I have so far but it's only giving me the 2 column names student_name and count, but not the result.
SELECT s.student_name, COUNT(student_name)
FROM student s
GROUP BY student_name
HAVING COUNT(student_name) = (
    SELECT MAX(count)
    FROM (SELECT s.student_name, COUNT(*) AS count
          FROM student_project k, student s
          WHERE s.student_id = k.student_id
          GROUP BY student_name) AS foo)
Result I have right now:
student_name | max_project_count
(no rows)
These are the tables I have in my database:
student
student_id | student_name
jd123      | John Doe
ad456      | Anna Do
js678      | Jess Smith
dk789      | Daniel Kim
school_project
project_id | project_name
math_1023  | Math Comp.
sci_9872   | Science Comp.
student_project
student_id | project_id
jd123      | math_1023
ad456      | math_1023
jd123      | sci_9872
ad456      | sci_9872
js678      | sci_9872
dk789      | sci_9872

with projects as (
    select student_id, count(*) as pcount
    from student_project
    group by 1
),
max_proj as (
    select max(pcount) as max_project_count
    from projects
)
select student_name, max_project_count
from student s
join projects p on s.student_id = p.student_id
join max_proj m on p.pcount = m.max_project_count
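An alternative sketch that ranks students by project count with a window function, so ties for first place are all returned. It assumes PostgreSQL syntax (the question does not name a database) and inlines the question's sample rows as VALUES lists so it can be run on its own; the CTE name counted and the column rnk are illustrative.

```sql
WITH student (student_id, student_name) AS (
    VALUES ('jd123', 'John Doe'), ('ad456', 'Anna Do'),
           ('js678', 'Jess Smith'), ('dk789', 'Daniel Kim')
),
student_project (student_id, project_id) AS (
    VALUES ('jd123', 'math_1023'), ('ad456', 'math_1023'),
           ('jd123', 'sci_9872'),  ('ad456', 'sci_9872'),
           ('js678', 'sci_9872'),  ('dk789', 'sci_9872')
),
counted AS (
    -- count projects per student, then rank the counts; ties share rank 1
    SELECT s.student_name,
           COUNT(*) AS project_count,
           RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk
    FROM student s
    JOIN student_project sp ON sp.student_id = s.student_id
    GROUP BY s.student_id, s.student_name
)
SELECT student_name, project_count AS max_project_count
FROM counted
WHERE rnk = 1
ORDER BY student_name;
```

With the sample data this returns Anna Do and John Doe, each with a count of 2.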

Find Missing Values Between 3 Tables

I have 3 tables: Permissions, Roles, and RolePermissions. I would like to have a way to select Roles that are missing new rows in the Permissions table based on the RolePermissions table relationship to insert those values once new permissions are added.
I have had no luck finding how this can be done so that is why I'm asking here.
Table structure
Permissions | Roles | RolePermissions
------------------------------------------
Id | Id | Id
Name | Name | RoleId
| | PermissionId
Idea of sql but I know it's not correct:
-- Looking to be able to do something like
INSERT INTO RolePermissions (RoleId, PermissionId)
SELECT missingpermissions.PermissionId, missingpermissions.RoleId
FROM Permissions as p
INNER JOIN (
    Select r.Id as RoleId, p.Id as PermissionId
    FROM Role as r
    LEFT JOIN RolePermissions as rp
        ON r.Id = rp.RoleId
    WHERE rp.PermissionId = p.Id
) as missingpermissions
    ON p.id = missingpermissions.permissionid

You need to get your new permission and cross join all roles (to get all combinations of roles and new permissions).
INSERT INTO RolePermissions (RoleId, PermissionId)
SELECT r.ID AS RoleId, p.ID AS PermissionId
FROM Role r
CROSS JOIN (
    -- get all permissions currently not assigned to a role (presumably "new")
    select p.*
    from Permissions p
    left join RolePermissions rp on p.id = rp.PermissionId
    where rp.PermissionId is null
) p
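The query above only picks up permissions that no role has yet. If some roles already hold a permission and others do not, a variant that checks each (role, permission) pair individually may fit better. This is a sketch under the assumption that RolePermissions.Id is generated automatically and that the roles table is named Role, as in the answer above (the question's table listing calls it Roles).

```sql
INSERT INTO RolePermissions (RoleId, PermissionId)
SELECT r.Id, p.Id
FROM Role r
CROSS JOIN Permissions p          -- every possible (role, permission) pair
WHERE NOT EXISTS (
    -- skip pairs that are already present
    SELECT 1
    FROM RolePermissions rp
    WHERE rp.RoleId = r.Id
      AND rp.PermissionId = p.Id
);
```

Running the SELECT part on its own first shows which pairs would be inserted.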

Cascading sum hierarchy using recursive cte

I'm trying to write a recursive CTE in Postgres but I can't wrap my head around it. Performance shouldn't be an issue, since there are only 50 items in TABLE 1.
TABLE 1 (expense):
id | parent_id | name
------------------------------
1 | null | A
2 | null | B
3 | 1 | C
4 | 1 | D
TABLE 2 (expense_amount):
ref_id | amount
-------------------------------
3 | 500
4 | 200
Expected Result:
id | name | amount
-------------------------------
1 | A | 700
2 | B | 0
3 | C | 500
4 | D | 200
Query
WITH RECURSIVE cte AS (
    SELECT
        expense.id,
        expense.name,
        expense.parent_id,
        expense_amount.amount
    FROM expense
    LEFT JOIN expense_amount ON expense_amount.ref_id = expense.id
    WHERE expense.parent_id IS NULL
    UNION ALL
    SELECT
        expense.id,
        expense.name,
        expense.parent_id,
        expense_amount.amount
    FROM cte
    JOIN expense ON expense.parent_id = cte.id
    LEFT JOIN expense_amount ON expense_amount.ref_id = expense.id
)
SELECT
    id,
    SUM(amount)
FROM cte
GROUP BY 1
ORDER BY 1
Results
id | sum
--------------------
1 | null
2 | null
3 | 500
4 | 200

You can do a conditional sum() for only the root row:
with recursive tree as (
   select id, parent_id, name, id as root_id
   from expense
   where parent_id is null
   union all
   select c.id, c.parent_id, c.name, p.root_id
   from expense c
     join tree p on c.parent_id = p.id
)
select e.id,
       e.name,
       e.root_id,
       case
         when e.id = e.root_id then sum(ea.amount) over (partition by root_id)
         else amount
       end as amount
from tree e
  left join expense_amount ea on e.id = ea.ref_id
order by id;
I prefer doing the recursive part first, then join the related tables to the result of the recursive query, but you could do the join to the expense_amount also inside the CTE.
Online example: http://rextester.com/TGQUX53703
However, the above only aggregates on the top-level parent, not for any intermediate non-leaf rows.
If you want to see intermediate aggregates as well, this gets a bit more complicated (and is probably not very scalable for large results, but you said your tables aren't that big):
with recursive tree as (
   select id, parent_id, name, 1 as level, concat('/', id) as path, null::numeric as amount
   from expense
   where parent_id is null
   union all
   select c.id, c.parent_id, c.name, p.level + 1, concat(p.path, '/', c.id), ea.amount
   from expense c
     join tree p on c.parent_id = p.id
     left join expense_amount ea on ea.ref_id = c.id
)
select e.id,
       lpad(' ', (e.level - 1) * 2, ' ')||e.name as name,
       e.amount as element_amount,
       (select sum(amount)
        from tree t
        -- match the node itself or anything below it; a bare e.path||'%'
        -- would let '/1' also match '/10'
        where t.path = e.path
           or t.path like e.path||'/%') as sub_tree_amount,
       e.path
from tree e
order by path;
Online example: http://rextester.com/MCE96740
The query builds up a path of all IDs belonging to a (sub)tree and then uses a scalar sub-select to get all child rows belonging to a node. That sub-select is what will make this quite slow as soon as the result of the recursive query can't be kept in memory.
I used the level column to create a "visual" display of the tree structure. This helps me debug the statement and understand the result better. If you need the real name of an element in your program, you would obviously use only e.name instead of prepending blanks to it.
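Another way to get a total for every node, sketched here as an alternative to the path-matching approach, is to have the recursive CTE generate (ancestor, descendant) pairs and then sum the amounts of each node's descendants. The CTE name closure and its column names are illustrative; the sample rows from the question are inlined as VALUES lists so the statement can be run on its own in PostgreSQL.

```sql
WITH RECURSIVE
expense (id, parent_id, name) AS (
    VALUES (1, NULL::int, 'A'), (2, NULL, 'B'), (3, 1, 'C'), (4, 1, 'D')
),
expense_amount (ref_id, amount) AS (
    VALUES (3, 500), (4, 200)
),
closure (ancestor_id, descendant_id) AS (
    -- every node counts as its own descendant
    SELECT id, id
    FROM expense
    UNION ALL
    -- extend each known pair one level further down the tree
    SELECT c.ancestor_id, e.id
    FROM closure c
    JOIN expense e ON e.parent_id = c.descendant_id
)
SELECT e.id,
       e.name,
       COALESCE(SUM(ea.amount), 0) AS amount
FROM expense e
JOIN closure c ON c.ancestor_id = e.id
LEFT JOIN expense_amount ea ON ea.ref_id = c.descendant_id
GROUP BY e.id, e.name
ORDER BY e.id;
```

For the sample data this yields 700 for A, 0 for B, 500 for C and 200 for D, which matches the expected result in the question. The closure grows with the depth of the tree, so like the path approach it is best suited to small hierarchies.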

I could not get your query to work for some reason. Here's my attempt, which works without recursion for the particular table you provided (parent-child, no grandchild).
--- step 1: get parent-child data together
with parent_child as (
    select t.*, amount
    from
        (select e.id, f.name as name,
                coalesce(f.name, e.name) as pname
         from expense e
         left join expense f
             on e.parent_id = f.id) t
    left join expense_amount ea
        on ea.ref_id = t.id
)
--- final step is to group by id, name
select id, pname, sum(amount)
from
    (-- step 2: group by parent name and find corresponding amount
     -- returns A, B
     select e.id, t.pname, t.amount
     from expense e
     join (select pname, sum(amount) as amount
           from parent_child
           group by 1) t
         on t.pname = e.name
     -- step 3: to get C, D we union and get corresponding columns
     -- results in all rows and corresponding value
     union
     select id, name, amount
     from expense e
     left join expense_amount ea
         on e.id = ea.ref_id
    ) t
group by 1, 2
order by 1;

Getting array aggregate of all modes for a group by result

I have about 600k rows of, let's say, owner names (varchar) and pet types (also varchar). For each owner name I'd like an array with the most frequent pet type they have (or several pet types if they are tied for most frequent).
An example:
*owner, pet type*
alice, cat
alice, dog
bob, fish
bob, cat
bob, fish
eve, cat
eve, dog
eve, cat
eve, dog
Expected output:
alice, [cat, dog]
bob, [fish]
eve, [cat, dog]
My feeling is that this is some combination of 'distinct on' in an inner query with array_agg on an outer query to do the array aggregation - but I just can't get it right.

You can do this by combining window functions and grouping:
select owner, array_agg(pet order by pet)
from (
  select owner, pet, dense_rank() over (partition by owner order by count(*) desc) as rnk
  from pet
  group by owner, pet
) t
where rnk = 1
group by owner
order by owner;
Online example: http://rextester.com/MTFIQ24341
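To try the query above without creating a table, here is a self-contained sketch for PostgreSQL with the question's sample rows inlined as a VALUES list. It relies on window functions being evaluated after GROUP BY, so count(*) inside the OVER clause is the per-(owner, pet) count. The alias top_pets is illustrative.

```sql
WITH pet (owner, pet) AS (
    VALUES ('alice', 'cat'), ('alice', 'dog'),
           ('bob', 'fish'),  ('bob', 'cat'), ('bob', 'fish'),
           ('eve', 'cat'),   ('eve', 'dog'), ('eve', 'cat'), ('eve', 'dog')
)
SELECT owner, array_agg(pet ORDER BY pet) AS top_pets
FROM (
    -- one row per (owner, pet); rank 1 is that owner's highest count, ties included
    SELECT owner,
           pet,
           dense_rank() OVER (PARTITION BY owner ORDER BY count(*) DESC) AS rnk
    FROM pet
    GROUP BY owner, pet
) t
WHERE rnk = 1
GROUP BY owner
ORDER BY owner;
```

This returns {cat,dog} for alice, {fish} for bob and {cat,dog} for eve.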

with data as (
    select 'alice' as owner, 'cat' pet_type
    union all select 'alice' as owner, 'dog' pet_type
    union all select 'bob' as owner, 'fish' pet_type
    union all select 'bob' as owner, 'cat' pet_type
    union all select 'bob' as owner, 'fish' pet_type
    union all select 'eve' as owner, 'cat' pet_type
    union all select 'eve' as owner, 'dog' pet_type
    union all select 'eve' as owner, 'cat' pet_type
    union all select 'eve' as owner, 'dog' pet_type
),
getMaxPet as (
    select owner, pet_type
    from data d1
    group by owner, pet_type
    having count(pet_type) = (
        select max(pet_count)
        from (
            select count(pet_type) as pet_count
            from data d2
            where d1.owner = d2.owner
            group by owner, pet_type
        ) a
    )
)
select owner, array_agg(pet_type)
from getMaxPet
group by owner
Try this. The main logic is to count each pet type per owner and then keep the pet types whose count equals that owner's maximum.

Top Group BY Problem DB2 [duplicate]

I've been trying for hours but can't get the query to do what I want using DB2.
From the tables Company and Users I have the following ticket quantities per company/user:
user company quantity
------------ ------------ ------------
mark nissan 300
tom toyota 50
steve krysler 80
mark ford 20
tom toyota 120
jose toyota 230
tom nissan 145
steve toyota 10
jose krysler 35
steve ford 100
This is generated by the query:
SELECT T.USER, COUNT(T.USER) AS QUANTITY, T.COMPANY
FROM TICKET T
INNER JOIN COMPANY P ON P.COMPANY = T.COMPANY
GROUP BY (T.USER, T.COMPANY) -- ORDER BY QUANTITY DESC
What I want to see is the top user for each company, so given the data above, the query should show me:
user company quantity (Top user per company)
------------ ------------ --------------------------------
mark nissan 300
jose toyota 230
steve ford 100
steve krysler 80
How can I write the SQL to return this result?
Final answer (noted in a comment):
SELECT user, quantity, company
FROM (SELECT user, quantity, company,
             RANK () OVER (PARTITION BY company ORDER BY quantity DESC) AS r
      FROM (SELECT T.USER, COUNT(T.USER) AS QUANTITY, T.COMPANY
            FROM TICKET T JOIN COMPANY P ON P.COMPANY = T.COMPANY
            GROUP BY (T.USER, T.COMPANY) ) s ) t
WHERE r = 1;

Build it up step by step.
Find the maximum quantity for each company, assuming the first data table shown in the question is called 'Tickets':
SELECT Company, MAX(Quantity) AS MaxQuantity
FROM Tickets
GROUP BY Company;
Now, find the data for the user(s) with that maximum quantity for that company:
SELECT T.User, T.Company, M.MaxQuantity
FROM Tickets AS T
JOIN (SELECT Company, MAX(Quantity) AS MaxQuantity
      FROM Tickets
      GROUP BY Company) AS M
    ON T.Company = M.Company AND T.Quantity = M.MaxQuantity;
If the top quantity for a particular company was, say, 200 and two users both scored 200 for that company, then this query lists both users.
Now, if you mean that the query you show in the question generates the first result table, then what I called Tickets just above needs to be the derived table:
SELECT T.User, COUNT(T.User) AS Quantity, T.Company
FROM Ticket AS T
INNER JOIN Company AS P ON P.Company = T.Company
GROUP BY (T.User, T.Company)
ORDER BY QUANTITY DESC
In which case, we can use a WITH clause (syntax unchecked, but I think it is correct per the SQL standard):
WITH Tickets AS
    (SELECT T.User, COUNT(T.User) AS Quantity, T.Company
     FROM Ticket AS T
     JOIN Company AS P ON P.Company = T.Company
     GROUP BY (T.User, T.Company)
    )
SELECT T.User, T.Company, M.MaxQuantity
FROM Tickets AS T
JOIN (SELECT Company, MAX(Quantity) AS MaxQuantity
      FROM Tickets
      GROUP BY Company) AS M
    ON T.Company = M.Company AND T.Quantity = M.MaxQuantity;
Clearly, you can also write the WITH sub-query out twice if you prefer.

This should work. Create a derived table that calculates the quantity per user and per company. Then take the maximum of that quantity per company, and join the maximum back to the per-user quantities.
SELECT p.company,
       t.user,
       t.quantity
FROM (SELECT MAX(c.quantity) max_quantity,
             c.company
      FROM (SELECT COUNT(t.user) quantity,
                   t.company
            FROM ticket t
            GROUP BY t.company,
                     t.user) c
      GROUP BY c.company) maxq
INNER JOIN (SELECT t.user,
                   t.company,
                   COUNT(t.user) quantity
            FROM ticket t
            GROUP BY t.company,
                     t.user) t
    ON maxq.max_quantity = t.quantity
   AND maxq.company = t.company
INNER JOIN company p
    ON p.company = t.company
ORDER BY t.quantity DESC
