Query row and join many rows as JSON array - postgresql

I am looking to join three tables via ids, the outcome being three json columns with the content from each.
The issue I am facing is that for each cat_request there are many cat_request_fields, I am currently getting cat_request_fields as one object and not an array of objects.
This query gets me a result set with cat_requests and cat_request_fields.
SELECT
row_to_json("cat_requests") AS cat_request,
array_agg(row_to_json("cat_request_fields")) AS cat_request_fields
FROM
"cat_requests"
LEFT OUTER JOIN "cat_request_fields" ON "cat_requests"."id" = "cat_request_fields"."cat_request_id"
GROUP BY
"cat_requests"."id"
LIMIT 10;
This query gets me a result set with cats and cat_requests.
SELECT
row_to_json("cat_requests") as cat_request,
row_to_json("cats") as cat
FROM
"cat_requests",
"cats"
WHERE
"cat_requests"."cat_id" = "cats"."id"
LIMIT 1;
I'm looking for a query that will give me a combination of the two...
How can I modify this query to map the cat_request_fields to be an array of rows and not just one.

SELECT
row_to_json("cat_requests") AS cat_request,
(select row_to_json("cats".*) as cats from "cats" where "cats"."id" = "cat_requests"."cat_id"),
array_agg(row_to_json("cat_request_fields")) AS cat_request_fields
FROM
"cat_requests"
INNER JOIN "cat_request_fields" ON "cat_requests"."id" = "cat_request_fields"."cat_request_id"
GROUP BY
"cat_requests"."id"
LIMIT 6;

Related

Return closest timestamp from Table B based on timestamp from Table A with matching Product IDs

Goal: Create a query to pull the closest cycle count event (Table C) for a product ID based on the inventory adjustments results sourced from another table (Table A).
All records from Table A will be used, but is not guaranteed to have a match in Table C.
The ID column will be present in both tables, but is not unique in either, so that pair of IDs and Timestamps together are needed for each table.
Current simplified SQL
SELECT
A.WHENOCCURRED,
A.LPID,
A.ITEM,
A.ADJQTY,
C.WHENOCCURRED,
C.LPID,
C.LOCATION,
C.ITEM,
C.QUANTITY,
C.ENTQUANTITY
FROM
A
LEFT JOIN
C
ON A.LPID = C.LPID
WHERE
A.facility = 'FACID'
AND A.WHENOCCURRED > '23-DEC-22'
AND A.ADJREASONABBREV = 'CYCLE COUNTS'
ORDER BY A.WHENOCCURRED DESC
;
This is currently pulling the first hit on C.WHENOCCURRED on the LPID matches. Want to see if there is a simpler JOIN solution before going in a direction that creates 2 temp tables based on WHENOCCURRED.
I have a functioning INDEX(MATCH(MIN()) solution in Excel but that requires exporting a couple system reports first and is extremely slow with X,XXX row tables.
If you are using Oracle 12 or later, you can use a LATERAL join and FETCH FIRST ROW ONLY:
SELECT A.WHENOCCURRED,
A.LPID,
A.ITEM,
A.ADJQTY,
C.WHENOCCURRED,
C.LPID,
C.LOCATION,
C.ITEM,
C.QUANTITY,
C.ENTQUANTITY
FROM A
LEFT OUTER JOIN LATERAL (
SELECT *
FROM C
WHERE A.LPID = C.LPID
AND A.whenoccurred <= c.whenoccurred
ORDER BY c.whenoccurred
FETCH FIRST ROW ONLY
) C
ON (1 = 1) -- The join condition is inside the lateral join
WHERE A.facility = 'FACID'
AND A.WHENOCCURRED > DATE '2022-12-23'
AND A.ADJREASONABBREV = 'CYCLE COUNTS'
ORDER BY A.WHENOCCURRED DESC;

How to return an array from a postgres select query?

I have a table that holds scenario information and whether or not it is active and I query the scenario like so
SELECT
scenario.*,
scenario_detail.name,
scenario_detail.is_activated,
scenario_detail.is_current,
scenario_detail.input_values,
scenario_detail.scenario_id,
scenario_detail.user_id,
scenario_detail.plan_id,
scenario_detail.order_index
FROM scenario
INNER join scenario_detail
ON scenario_detail.scenario_id = scenario.id
WHERE scenario_detail.state = 'ACTIVE'
And it returns the scenarios fine.
Now I would like to return an array that will have two items, a list of all activated scenario and another of inactive scenarios.
I know I could slice and sort these scenarios in the front end but I am rather interested in doing it in the backend.
How could I do it?
If all the fields are same then you can use union.
SELECT
scenario.*,
scenario_detail.name,
scenario_detail.is_activated,
scenario_detail.is_current,
scenario_detail.input_values,
scenario_detail.scenario_id,
scenario_detail.user_id,
scenario_detail.plan_id,
scenario_detail.order_index
FROM scenario
INNER join scenario_detail
ON scenario_detail.scenario_id = scenario.id
WHERE scenario_detail.state = 'ACTIVE'
UNION
SELECT
scenario.*,
scenario_detail.name,
scenario_detail.is_activated,
scenario_detail.is_current,
scenario_detail.input_values,
scenario_detail.scenario_id,
scenario_detail.user_id,
scenario_detail.plan_id,
scenario_detail.order_index
FROM scenario
INNER join scenario_detail
ON scenario_detail.scenario_id = scenario.id
WHERE scenario_detail.state = 'INACTIVE'

Sequential scan rather than index scan

I have a bunch of tables in postgresql and I run a query as follows
SELECT DISTINCT ON ...some stuff...
FROM "rent_flats" INNER JOIN "rent_flats_linked_users"
ON "rent_flats_linked_users"."rent_flat_id" = "rent_flats"."id"
INNER JOIN "users"
ON "users"."id" = rent_flats_linked_users"."user_id"
INNER JOIN "owners"
ON "owners"."id" = "users"."profile_id" AND "users"."profile_type" = 'Owner'
INNER JOIN "phone_numbers"
ON "phone_numbers"."person_id" = "owners"."id" AND "phone_numbers"."person_type" = 'Owner'
INNER JOIN "phone_number_categories"
ON "phone_number_categories"."id" = "phone_numbers"."phone_number_category_id"
INNER JOIN "localities"
ON "localities"."id" = "rent_flats"."locality_id"
INNER JOIN "regions"
ON "regions"."id" = "localities"."region_id"
INNER JOIN "cities"
ON "cities"."id" = "regions"."city_id"
INNER JOIN "property_types"
ON "property_types"."id" = "rent_flats"."property_type_id"
INNER JOIN "apartment_types"
ON "apartment_types"."id" = "rent_flats"."apartment_type_id"
WHERE "rent_flats"."status" = 3
AND (((extract(epoch from age(current_date,rent_flats.date_added))/86400)::int) IN (cities.short_period,cities.long_period))
AND (phone_number_categories.name IN ('SMS','SMS & Mobile'))
ORDER BY rf_id, phone_numbers.priority ASC
Note: The rent_flats table contains around 5 million rows, and rent_flats_linked_users contains around 600k rows and users contains 350k rows.Other tables are small in size.
The query takes about 6.8 secs to execute and the explain analyses shows that around 50% of the total time goes in sequential scans of the rent_flats, users and rent_flats_linked_users tables and the other 30% in Hash joins.
On setting seq_scan to off...the query takes even longer to ~11 secs (in this case Hash and Hash join take upto 97.5% of the time)
Here's the explain query plan analyses.
I have put indices on the fields involved in the inner joins as well as on fields involved in the filters like phone_numbers.priority and cities.short_period and cities.long_period. But I still get a sequential scan. What can be the reasons and possible solutions to fasten the query?
I suspect that if there is a part of that query worth optimising then it is this:
(((extract(epoch from age(current_date,rent_flats.date_added))/86400)::int) IN (cities.short_period,cities.long_period))
You really need to turn that into something like:
rent_flats.date_added in (...)
Then you can index date_added, and maybe index (date_added, status).
the next step would be to make sure that the join columns are indexed.

comprare aggregate sum function to number in postgres

I have the next query which does not work:
UPDATE item
SET popularity= (CASE
WHEN (select SUM(io.quantity) from item i NATURAL JOIN itemorder io GROUP BY io.item_id) > 3 THEN TRUE
ELSE FALSE
END);
Here I want to compare each line of inner SELECT SUM value with 3 and update popularity. But SQL gives error:
ERROR: more than one row returned by a subquery used as an expression
I understand that inner SELECT returns many values, but can smb help me in how to compare each line. In other words make loop.
When using a subquery you need to get a single row back, so you're effectively doing a query for each record in the item table.
UPDATE item i
SET popularity = (SELECT SUM(io.quantity) FROM itemorder io
WHERE io.item_id = i.item_id) > 3;
An alternative (which is a postgresql extension) is to use a derived table in a FROM clause.
UPDATE item i2
SET popularity = x.orders > 3
FROM (select i.item_id, SUM(io.quantity) as orders
from item i NATURAL JOIN itemorder io GROUP BY io.item_id)
as x(item_id,orders)
WHERE i2.item_id = x.item_id
Here you're doing a single group clause as you had, and we're joining the table to be updated with the results of the group.

Postgres - Get data from each alias

In my application i have a query that do multiple joins with a table position. Just like this:
SELECT *
FROM (...) as trips
join trip as t on trips.trip_id = t.trip_id
left outer join vehicle as v on v.vehicle_id = t.trip_vehicle_id
left outer join position as start on trips.start_position_id = start.position_id and start.position_vehicle_id = v.vehicle_id
left outer join position as "end" on trips.end_position_id = "end".position_id and "end".position_vehicle_id = v.vehicle_id
left outer join position as last on trips.last_position_id = last.position_id and last.position_vehicle_id = v.vehicle_id;
My table position has 35 columns(for example position_id).
When I run the query, in result should appear the table position 3 times, start, end and last. But postgres can not distinguish between, for exemplar, start.position_id, end.position_id and last.position_id. So this 3 columns are group and appear as one, position_id.
As the data from start.position_id and end.position_id are different, the column, position_id, that appear in result, it's empty.
Without having to rename all the columns, like this: start.position_id as start_position_id.
How can i get each group of data separately, for exemple, get all columns from the table 'start'. In MYSQL i can do this operation by calling fetch_fields, and give the function an alias, like 'start'.
But i can i do this in Postgres?
Best Regards,
Nuno Oliveira
My understanding is that you can't (or find it difficult to) discern between which table each column with a shared name (such as "position_id") belongs to, but only need to see one of the sets of shared columns at any one time. If that is the case, use tablename.* in your SELECT, so SELECT trips.*, start.*... would show the columns from trips and start, but no columns from other tables involved in the join.
SELECT [...,] start.* [,...] FROM [...] atable AS start [...]