Postgres json_agg Limit - postgresql

I'm using json_agg in Postgres like this
json_agg((e."name",e."someOtherColum",e."createdAt") order by e."createdAt" DESC )
But I want to limit how many rows are aggregated into the JSON. I would like to write something like this:
json_agg((e."name",e."someOtherColum",e."createdAt") order by e."createdAt" DESC LIMIT 3)
Is it possible in some way?
This is the full query:
SELECT e."departmentId",
json_agg((e."name",e."someOtherColum",e."createdAt") order by e."createdAt" DESC ) as "employeeJSON"
FROM "Employee" e
GROUP BY e."departmentId"
So what I want is each department together with its first three employees.

You need a sub-select that returns only three rows per department, and then aggregate the result of that:
select "departmentId",
json_agg(("name","someOtherColum","createdAt") order by "createdAt" DESC) as "employeeJSON"
FROM (
SELECT "departmentId",
"name",
"someOtherColum",
"createdAt",
row_number() over (partition by "departmentId" order by "createdAt" DESC) as rn
FROM "Employee"
) t
WHERE rn <= 3
GROUP BY "departmentId"
Note that using quoted identifiers is generally not a good idea. In the long run they are more trouble than they are worth.
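The number-then-filter idea can be tried without a Postgres server. Below is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in; the unquoted table and column names (employee, department_id, created_at), the sample rows and the helper newest_per_department are all made up for illustration. The final json_agg step is left out and the rows are grouped in Python instead, so only the row_number() part carries over to Postgres as written.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (department_id INTEGER, name TEXT, created_at TEXT);
    INSERT INTO employee VALUES
        (1, 'ann', '2020-01-01'),
        (1, 'bob', '2020-02-01'),
        (1, 'cid', '2020-03-01'),
        (1, 'dee', '2020-04-01'),
        (2, 'eve', '2020-01-15');
""")

# Number the rows inside each department, newest first, then keep rn <= n.
TOP_N_SQL = """
    SELECT department_id, name, created_at
    FROM (
        SELECT department_id, name, created_at,
               row_number() OVER (PARTITION BY department_id
                                  ORDER BY created_at DESC) AS rn
        FROM employee
    ) t
    WHERE rn <= ?
    ORDER BY department_id, created_at DESC
"""


def newest_per_department(n):
    """Return {department_id: [names, newest first]}, at most n names each."""
    result = {}
    for department_id, name, _created_at in conn.execute(TOP_N_SQL, (n,)):
        result.setdefault(department_id, []).append(name)
    return result


print(newest_per_department(3))
```

With these sample rows, department 1 keeps dee, cid and bob, and department 2 keeps its single employee. The ORDER BY inside the OVER clause is what decides which rows survive the filter, so it has to match the order you want in the aggregate.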

Related

Reset increment in PostgreSQL

I just started learning Postgres, and I'm trying to make an aggregation table that has the columns:
user_id
booking_sequence
booking_created_time
booking_paid_time
booking_price_amount
total_spent
All columns are provided, except for the booking_sequence column. I need to make a query that shows the first five flights of each user that has at least x purchases and has spent more than a certain amount of money, then sort it by the amount of money spent by the user, and then sort it by the booking sequence column.
I've tried:
select user_id,
row_number() over(partition by user_id order by user_id) as booking_sequence,
booking_created_time as booking_created_date,
booking_price_amount,
sum(booking_price_amount) as total_booking_price_amount
from fact_flight_sales
group by user_id, booking_created_time, booking_price_amount
having count(user_id) > 5
and total_booking_price_amount > 1000
order by total_booking_price_amount;
I get 0 rows when I add count(user_id) > 5, and total_booking_price_amount is reported as not found when I add the second condition to the HAVING clause.
Edit:
I managed to make the code function correctly, for those who are curious:
select x.user_id, row_number() over(partition by x.user_id)
as booking_sequence, x.booking_created_time::date as booking_created_date, x.booking_price_amount,
sum(y.booking_price_amount) as total_booking_price_amount from
(
select user_id, booking_created_time, booking_price_amount from fact_flight_sales
group by user_id, booking_created_time, booking_price_amount
) as x
join
(
select user_id, booking_price_amount
from fact_flight_sales group by user_id, booking_price_amount
) as y
on x.user_id = y.user_id
group by x.user_id, x.booking_created_time, x.booking_price_amount
having count(x.user_id) >= 1 and sum(y.booking_price_amount) >250000
order by total_booking_price_amount desc, booking_sequence asc;
Big thanks to Laurenz for the help!
About count(user_id) > 5:
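HAVING is evaluated before window functions, so rows excluded by the HAVING clause are not used to calculate the window function. Also note that count(user_id) here counts the rows in each (user_id, booking_created_time, booking_price_amount) group, not the bookings per user, so it will rarely exceed 5.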
About total_booking_price_amount in HAVING:
You cannot use aliases from the SELECT list in the HAVING clause. You will have to repeat the expression (or use a subquery).
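One way to take the "use a subquery" route is to compute the per-user count and total as window functions in an inner query and filter on them in the outer WHERE, which avoids HAVING altogether. The following is only a sketch under assumptions: it runs on Python's built-in sqlite3 module rather than Postgres, the sample rows are invented, and the helper first_bookings with its parameters min_bookings and min_total is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_flight_sales (
        user_id INTEGER,
        booking_created_time TEXT,
        booking_price_amount INTEGER
    );
    INSERT INTO fact_flight_sales VALUES
        (1, '2021-01-01', 400), (1, '2021-01-05', 700),
        (2, '2021-01-02', 300),
        (3, '2021-01-03', 900), (3, '2021-01-04', 500), (3, '2021-01-09', 100);
""")

# The inner query attaches per-user figures to every booking row;
# the outer query can then filter on those aliases in a plain WHERE.
SQL = """
    SELECT user_id, booking_sequence, booking_price_amount, total_spent
    FROM (
        SELECT user_id,
               booking_price_amount,
               row_number() OVER (PARTITION BY user_id
                                  ORDER BY booking_created_time) AS booking_sequence,
               count(*) OVER (PARTITION BY user_id) AS booking_count,
               sum(booking_price_amount) OVER (PARTITION BY user_id) AS total_spent
        FROM fact_flight_sales
    ) s
    WHERE booking_count >= :min_bookings
      AND total_spent > :min_total
      AND booking_sequence <= 5
    ORDER BY total_spent DESC, user_id, booking_sequence
"""


def first_bookings(min_bookings, min_total):
    """First five bookings of every user that passes both thresholds."""
    params = {"min_bookings": min_bookings, "min_total": min_total}
    return conn.execute(SQL, params).fetchall()


for row in first_bookings(2, 1000):
    print(row)
```

Because the thresholds are applied after the window functions have run, booking_sequence is numbered over all of a user's bookings, not only over the rows that survive the filter.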

PostgreSQL command : using the result obtained from first Query and using it In second Query : write as single query

SELECT partner_id
FROM trip_delivery_sales ts
WHERE ts.route_id='152'
GROUP BY ts.partner_id
This query gives us the partner ids. Using those partner ids, we want to look in the trip_delivery_sales_lines table and find, for each customer, the sum of the product quantities of their last two sales. If the last two sales have product quantities 2 and 5, we want the result as partner_id | count, for example Mn2333 - 7.
Here, for example, I take partner id 34806, but I want to check every partner_id obtained from the previous query:
SELECT product_qty
FROM trip_delivery_sales_lines td
WHERE td.partner_id='34806'
AND td.route_id='152'
AND td.product_id='432'
ORDER BY td.order_date DESC
LIMIT 2
You can run this query
SELECT td.partner_id,sum(product_qty)
FROM trip_delivery_sales_lines td,
(SELECT partner_id FROM trip_delivery_sales ts WHERE ts.route_id='152') as ts
WHERE td.partner_id=ts.partner_id
AND td.product_id='432'
GROUP BY td.partner_id
ORDER BY td.order_date DESC
LIMIT 2
Or this one
with ts as (SELECT distinct partner_id FROM trip_delivery_sales WHERE route_id='152')
SELECT td.partner_id,sum(product_qty)
FROM trip_delivery_sales_lines td,ts
WHERE td.partner_id=ts.partner_id
AND td.product_id='432'
GROUP BY td.partner_id
ORDER BY td.order_date DESC
LIMIT 2
You might be looking for
SELECT DISTINCT ts.partner_id, ARRAY(
SELECT product_qty
FROM trip_delivery_sales_lines td
WHERE td.partner_id=ts.partner_id
AND td.product_id='432'
ORDER BY td.order_date DESC
LIMIT 2
) AS product_qty_arr
FROM trip_delivery_sales ts
WHERE ts.route_id='152'
or just
SELECT
partner_id,
array_agg(product_qty ORDER BY order_date DESC) as product_qty_arr
FROM (
SELECT
td.partner_id,
td.product_qty,
td.order_date,
row_number() OVER (PARTITION BY td.partner_id ORDER BY td.order_date DESC)
FROM trip_delivery_sales_lines td
JOIN trip_delivery_sales ts USING (partner_id)
WHERE ts.route_id='152'
AND td.product_id='432'
) AS enumerated
WHERE row_number <= 2
GROUP BY partner_id
See also PostgreSQL: top n entries per item in same table or Optimize GROUP BY query to retrieve latest row per user
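Since the question asks for the sum of the last two quantities rather than an array, the row_number() variant can be combined with sum(). Here is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Postgres, invented sample rows, integer ids, and a hypothetical helper last_two_qty. It uses IN instead of a join so that a partner with several rows in trip_delivery_sales is not counted more than once.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trip_delivery_sales (partner_id INTEGER, route_id INTEGER);
    CREATE TABLE trip_delivery_sales_lines (
        partner_id INTEGER, product_id INTEGER,
        product_qty INTEGER, order_date TEXT
    );
    INSERT INTO trip_delivery_sales VALUES
        (34806, 152), (34806, 152), (40000, 152), (50000, 999);
    INSERT INTO trip_delivery_sales_lines VALUES
        (34806, 432, 9, '2019-01-01'),
        (34806, 432, 2, '2019-01-02'),
        (34806, 432, 5, '2019-01-03'),
        (40000, 432, 4, '2019-01-01'),
        (50000, 432, 8, '2019-01-01');
""")

# Number each partner's lines from newest to oldest, keep the newest two,
# then add up their quantities per partner.
SQL = """
    SELECT partner_id, sum(product_qty) AS last_two_qty
    FROM (
        SELECT td.partner_id, td.product_qty,
               row_number() OVER (PARTITION BY td.partner_id
                                  ORDER BY td.order_date DESC) AS rn
        FROM trip_delivery_sales_lines td
        WHERE td.product_id = :product_id
          AND td.partner_id IN (SELECT partner_id
                                FROM trip_delivery_sales
                                WHERE route_id = :route_id)
    ) ranked
    WHERE rn <= 2
    GROUP BY partner_id
    ORDER BY partner_id
"""


def last_two_qty(route_id, product_id):
    """Return {partner_id: sum of the two most recent quantities}."""
    params = {"route_id": route_id, "product_id": product_id}
    return dict(conn.execute(SQL, params).fetchall())


print(last_two_qty(152, 432))
```

For partner 34806 the two most recent quantities are 5 and 2, so the sum is 7, which matches the example in the question. A partner with only one matching line simply gets that single quantity.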

Implement ROW_NUMBER() in beamSQL

I have the below query :
SELECT DISTINCT Summed, ROW_NUMBER () OVER (order by Summed desc) as Rank from table1
I have to write it in Apache Beam(beamSql). Below is my code :
PCollection<BeamRecord> rec_2_part2 = rec_2.apply(BeamSql.query("SELECT DISTINCT Summed, ROW_NUMBER(Summed) OVER (ORDER BY Summed) Rank1 from PCOLLECTION "));
But I'm getting the below error :
Caused by: java.lang.UnsupportedOperationException: Operator: ROW_NUMBER is not supported yet!
Any idea how to implement ROW_NUMBER() in beamSql ?
Here is one way you can approximate your current query without using ROW_NUMBER:
SELECT
t1.Summed,
(SELECT COUNT(*) FROM (SELECT DISTINCT Summed FROM table1) t2
WHERE t2.Summed >= t1.Summed) AS Rank
FROM
(
SELECT DISTINCT Summed
FROM table1
) t1
The basic idea is to first use a subquery to get a table containing only the distinct Summed values. Then, use a correlated subquery to simulate the row number. This isn't a very efficient method, but if ROW_NUMBER is not available, then you're stuck with some alternative.
The solution which worked for the above query:
PCollection<BeamRecord> rec_2 = rec_1.apply(BeamSql.query("SELECT max(Summed) as maxed, max(Summed)-10 as least, 'a' as Dummy from PCOLLECTION"));
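The correlated-subquery ranking suggested in the answer can be sanity-checked against a real ROW_NUMBER() in an engine that has both. The sketch below does that with Python's built-in sqlite3 module and a made-up table1; it says nothing about whether a given Beam SQL version accepts correlated subqueries, which would need to be checked separately. The alias rank_no is used instead of Rank only to stay clear of reserved words.

```python
import sqlite3

# In-memory SQLite, used only to compare the two formulations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Summed INTEGER);
    INSERT INTO table1 VALUES (10), (30), (20), (30), (10);
""")

# Rank = number of distinct values that are >= the current value.
COUNT_SQL = """
    SELECT t1.Summed,
           (SELECT COUNT(*)
            FROM (SELECT DISTINCT Summed FROM table1) t2
            WHERE t2.Summed >= t1.Summed) AS rank_no
    FROM (SELECT DISTINCT Summed FROM table1) t1
    ORDER BY rank_no
"""

# Reference result using the window function the counting trick replaces.
WINDOW_SQL = """
    SELECT Summed, ROW_NUMBER() OVER (ORDER BY Summed DESC) AS rank_no
    FROM (SELECT DISTINCT Summed FROM table1) d
    ORDER BY rank_no
"""

by_count = conn.execute(COUNT_SQL).fetchall()
by_window = conn.execute(WINDOW_SQL).fetchall()
print(by_count)
print(by_count == by_window)
```

The two results agree because the values are made distinct first; with duplicate values the counting approach would give tied rows the same number, whereas ROW_NUMBER() would not.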

t-sql how to select records without a duplicated one column

I want to select rows for all employees without repeating the data in one column.
For example, I have two rows for the same employee, one of which shows the salary before a raise. How can I display only the largest figure, without the duplicate?
You can use the Row_Number function.
Here is some sample code:
select * from (
select *,
row_number() over (partition by empid, name, department order by salary desc) as rn
from employee
) employee where rn = 1
You can find Row_Number() with Partition By clause sample at http://www.kodyaz.com
If I'm understanding the question correctly, then a simple MAX function and GROUP BY would work.
SELECT EmployeeId, OtherColumns, MAX(Salary)
FROM tblEmployees
GROUP BY EmployeeId, OtherColumns
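To see how the two answers relate, here is a small sketch that runs both on the same data. It uses Python's built-in sqlite3 module instead of SQL Server, and the employee table with its columns and rows is invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (empid INTEGER, name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        (1, 'Ann', 'IT', 3000),
        (1, 'Ann', 'IT', 3500),
        (2, 'Bob', 'HR', 2800);
""")

# Variant 1: number the rows per employee by salary, keep the top one.
ROW_NUMBER_SQL = """
    SELECT empid, name, department, salary
    FROM (
        SELECT *,
               row_number() OVER (PARTITION BY empid, name, department
                                  ORDER BY salary DESC) AS rn
        FROM employee
    ) e
    WHERE rn = 1
    ORDER BY empid
"""

# Variant 2: group on the repeated columns and take the largest salary.
GROUP_BY_SQL = """
    SELECT empid, name, department, MAX(salary) AS salary
    FROM employee
    GROUP BY empid, name, department
    ORDER BY empid
"""

via_row_number = conn.execute(ROW_NUMBER_SQL).fetchall()
via_group_by = conn.execute(GROUP_BY_SQL).fetchall()
print(via_row_number)
print(via_row_number == via_group_by)
```

Both variants return one row per employee with the higher salary here. The row_number() form becomes the more useful one when the row carries further columns that differ between the duplicates and should come from the same row as the maximum salary.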

Sorting in CTE expression

I am retrieving all users from the DB, ordered by each user's number of followers, descending:
with TH_Users as
(
SELECT [ID]
,[FullName]
,[UserName]
,[ImageName]
,dbo.GetUserFollowers(ID) AS Followers
, ROW_NUMBER() OVER (order by ID ) AS 'RowNumber'
from dbo.TH_Users
Where CultureID = @cultureID
)
Select ID,[FullName]
,[UserName]
,[ImageName], Followers from TH_Users
Where RowNumber BETWEEN @startIdx AND @endIdx
Order BY Followers DESC
I am using a function to get the number of followers for each user. Now, if I use the Followers column as the ordering column, as in ROW_NUMBER() OVER (order by Followers ) AS 'RowNumber', I get a compilation error.
Putting Order BY Followers DESC at the end of the query will not give the intended result.
Any suggestions ?
Thanks
When you use AS to give a column an alias, that alias is not available to other expressions in the same SELECT list. Logically, applying aliases to columns is (almost) the very last part of evaluating a query.
So if you want your ROW_NUMBER within the CTE to be OVER what you alias as Followers, you must write it in the same terms as the column itself:
;with TH_Users as
(
SELECT [ID]
,[FullName]
,[UserName]
,[ImageName]
,dbo.GetUserFollowers(ID) AS Followers
, ROW_NUMBER() OVER (order by dbo.GetUserFollowers(ID) ) AS 'RowNumber'
from dbo.TH_Users
Where CultureID = @cultureID
)
Note that this will not cause the function to be evaluated any more times than it is currently.
(I have not tested this)
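For anyone who wants to try the repeat-the-expression approach without a SQL Server instance, here is a rough sketch using Python's built-in sqlite3 module. Everything in it is an assumption made for the demo: the Python function get_user_followers registered as GetUserFollowers stands in for dbo.GetUserFollowers, the follower counts and users are invented, the CTE is renamed to numbered, and the DESC plus the ID tie-breaker in the OVER clause are additions so that the page order is deterministic.

```python
import sqlite3

# Hypothetical follower counts; stands in for whatever the real function reads.
FOLLOWERS = {1: 5, 2: 40, 3: 12, 4: 40, 5: 0}


def get_user_followers(user_id):
    """Stand-in for dbo.GetUserFollowers."""
    return FOLLOWERS.get(user_id, 0)


conn = sqlite3.connect(":memory:")
# Register the Python function so the SQL below can call it by name.
conn.create_function("GetUserFollowers", 1, get_user_followers)
conn.executescript("""
    CREATE TABLE TH_Users (ID INTEGER, UserName TEXT);
    INSERT INTO TH_Users VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
""")

# The OVER clause repeats the function call instead of the alias Followers.
PAGE_SQL = """
    WITH numbered AS (
        SELECT ID, UserName,
               GetUserFollowers(ID) AS Followers,
               ROW_NUMBER() OVER (ORDER BY GetUserFollowers(ID) DESC, ID) AS RowNumber
        FROM TH_Users
    )
    SELECT ID, UserName, Followers
    FROM numbered
    WHERE RowNumber BETWEEN :start_idx AND :end_idx
    ORDER BY RowNumber
"""


def page(start_idx, end_idx):
    """Return one page of users, most followed first."""
    params = {"start_idx": start_idx, "end_idx": end_idx}
    return conn.execute(PAGE_SQL, params).fetchall()


print(page(1, 2))
print(page(3, 5))
```

Because the row numbers are assigned in follower order, each page is a contiguous slice of the ranking, which is what ordering by ID in the window could not give.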