UNION and BETWEEN dates - postgresql

I have two tables, people and shifts, and for every person I want to
get all shifts for a week.
The problem is that there doesn't have to be a shift for every date.
If there is no shift, I want to get a dynamic template row containing the date on which no shift is available.
SELECT p.id, p.name, s.date_of_shift
FROM people AS p
LEFT JOIN LATERAL (
SELECT sh.id, sh.date_of_shift, sh.person_id
FROM shifts as sh
) AS s ON p.id = s.person_id
WHERE p.id = 2 AND s.date_of_shift BETWEEN '2016-03-21' AND '2016-03-25'
UNION ALL
SELECT null, null, '2016-03-21'
WHERE NOT EXISTS (
SELECT 1
FROM people AS p
LEFT JOIN LATERAL (
SELECT sh.id, sh.date_of_shift, sh.person_id
FROM shifts AS sh
) AS s ON p.id = s.person_id
WHERE p.id = 88000 AND s.date_of_shift BETWEEN '2016-03-21' AND '2016-03-25');
This is the query I managed to create. The problem is that I always get the same date, but I want each date in the BETWEEN range that has no shift.

In a case like this where you want all dates in a range, even when there is possibly no data for a specific date, you should use the generate_series() function and LEFT JOIN your data to it:
SELECT DISTINCT p.id, p.name, date_of_shift
FROM generate_series('2016-03-21'::date, '2016-03-25', interval '1 day') AS d(date_of_shift)
LEFT JOIN shifts sh USING (date_of_shift)
LEFT JOIN (SELECT id, name FROM people WHERE id = 2) p ON p.id = sh.person_id;
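If other people also have shifts in the same week, it may be safer to put the person filter into the join against shifts, so that only that person's shifts are matched to each generated day. A minimal self-contained sketch, assuming made-up sample rows (the names, ids and the shift_id output column are only for illustration):

```sql
-- Hypothetical sample data standing in for the real people and shifts tables.
WITH people(id, name) AS (
    VALUES (2, 'Alice'), (3, 'Bob')
), shifts(id, person_id, date_of_shift) AS (
    VALUES (1, 2, date '2016-03-21'),
           (2, 2, date '2016-03-23'),
           (3, 3, date '2016-03-22')
)
SELECT p.id, p.name, d.day_ts::date AS date_of_shift, sh.id AS shift_id
FROM people AS p
-- One row per calendar day in the requested range.
CROSS JOIN generate_series(timestamp '2016-03-21',
                           timestamp '2016-03-25',
                           interval '1 day') AS d(day_ts)
-- Match only this person's shift on that day; days without one keep NULLs.
LEFT JOIN shifts AS sh
       ON sh.person_id = p.id
      AND sh.date_of_shift = d.day_ts::date
WHERE p.id = 2
ORDER BY d.day_ts;
```

With this sample data, person 2 gets five rows, and shift_id is NULL for 22, 24 and 25 March. Unlike the template row in the question, the person's id and name are kept on the empty days here.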

Related

Correlated subquery in Postgres

I have the query below to find the stock details of certain products. The query works, but I think it is not efficient or fast enough (DB: PostgreSQL 11).
There is a CTE "result_set" in this code where I need to find the quantity of a product ordered (qty_last_7d_from_oos_date) in the period from 7 days before the out-of-stock date up to the out-of-stock date. In the same way I also have to find the revenue.
So I wrote the same subquery twice, one returning the revenue and the other the quantity, which is not efficient. Does anyone have suggestions on how to rewrite this more efficiently?
WITH final as
(
SELECT product_id,product_name,item_sku,out_of_stock_at
,out_of_stock_at - INTERVAL '7 days' as previous_7_days
,back_in_stock_at
FROM oos_base
)
SELECT product_id,product_name,item_sku,out_of_stock_at,previous_7_days
,back_in_stock_at
,(SELECT coalesce(sum(i.qty_ordered), 0) AS qty_last_7d_from_oos_date
FROM ol.orders o
LEFT JOIN ol.items i ON i.order_id = o.order_id
LEFT JOIN ol.products p ON p.product_id = i.product_id AND i.store_id = p.store_id
WHERE o.order_state_2 IN('complete','processing')
AND f.product_id=p.product_id
AND o.created_at_order :: DATE BETWEEN f.previous_7_days::DATE AND COALESCE(f.out_of_stock_at::DATE,current_date)
)
,( SELECT coalesce(sum(i.row_amount_minus_discount_order), 0) AS rev_last_7d_from_oos_date
FROM ol.orders o
LEFT JOIN ol.items i ON i.order_id = o.order_id
LEFT JOIN ol.products p ON p.product_id = i.product_id AND i.store_id = p.store_id
WHERE o.order_state_2 IN('complete','processing')
AND f.product_id=p.product_id
AND o.created_at_order :: DATE BETWEEN f.previous_7_days::DATE AND COALESCE(f.out_of_stock_at::DATE,current_date)
)
FROM final f
In the above code the CTE "final" gives two dates, "out_of_stock_at" and "previous_7_days". I want to find the quantity and revenue of a product between these two dates, i.e. between "previous_7_days" and "out_of_stock_at".
The query below gives both the quantity and the revenue of a product for the period between "previous_7_days" and "out_of_stock_at" from the above CTE.
For now I have used this code twice, once for the revenue and once for the quantity.
SELECT coalesce(sum(i.qty_ordered), 0) AS qty ,
coalesce(sum(i.row_amount_minus_discount_order), 0)
FROM ol.orders o
LEFT JOIN ol.items i ON i.order_id = o.order_id
LEFT JOIN ol.products p ON p.product_id = i.product_id AND i.store_id = p.store_id
WHERE o.order_state_2 IN('complete','processing')
AND f.product_id=p.product_id
AND o.created_at_order :: DATE BETWEEN f.previous_7_days::DATE AND COALESCE(f.out_of_stock_at::DATE,current_date)
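One possible way to avoid running the subquery twice is a LATERAL join that returns both sums from a single pass. The following is only a sketch under assumptions: the tables are simplified stand-ins built from VALUES lists, the join through ol.products and store_id is left out, and all sample rows are invented.

```sql
-- Hypothetical sample data standing in for oos_base, ol.orders and ol.items.
WITH oos_base(product_id, out_of_stock_at) AS (
    VALUES (1, timestamp '2021-03-10 12:00'),
           (2, timestamp '2021-03-15 09:00')
), orders(order_id, order_state_2, created_at_order) AS (
    VALUES (100, 'complete',   timestamp '2021-03-05 10:00'),
           (101, 'processing', timestamp '2021-03-09 18:00'),
           (102, 'canceled',   timestamp '2021-03-09 19:00'),
           (103, 'complete',   timestamp '2021-03-01 08:00')
), items(order_id, product_id, qty_ordered, row_amount_minus_discount_order) AS (
    VALUES (100, 1, 2, 20.00),
           (101, 1, 1, 10.00),
           (102, 1, 5, 50.00),
           (103, 1, 4, 40.00)
)
SELECT f.product_id,
       f.out_of_stock_at,
       agg.qty_last_7d_from_oos_date,
       agg.rev_last_7d_from_oos_date
FROM oos_base AS f
-- The aggregate subquery always returns exactly one row per product,
-- so CROSS JOIN LATERAL keeps products without any matching orders.
CROSS JOIN LATERAL (
    SELECT coalesce(sum(i.qty_ordered), 0) AS qty_last_7d_from_oos_date,
           coalesce(sum(i.row_amount_minus_discount_order), 0) AS rev_last_7d_from_oos_date
    FROM orders AS o
    JOIN items AS i ON i.order_id = o.order_id
    WHERE o.order_state_2 IN ('complete', 'processing')
      AND i.product_id = f.product_id
      AND o.created_at_order::date
          BETWEEN (f.out_of_stock_at - interval '7 days')::date
              AND coalesce(f.out_of_stock_at::date, current_date)
) AS agg
ORDER BY f.product_id;
```

With this sample data, product 1 gets a quantity of 3 and a revenue of 30.00, and product 2 gets 0 for both. Whether this is actually faster than the two scalar subqueries depends on the real data and indexes, so comparing the plans with EXPLAIN ANALYZE is advisable.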

How to use DISTINCT ON in ARRAY_AGG()?

I have the following query:
SELECT array_agg(DISTINCT p.id) AS price_ids,
array_agg(p.name) AS price_names
FROM items
LEFT JOIN prices p ON p.item_id = items.id
LEFT JOIN third_table t3 ON t3.item_id = items.id
WHERE items.id = 1;
When I LEFT JOIN the third_table, all my prices are duplicated.
I'm using DISTINCT inside ARRAY_AGG() to get the ids without duplicates, but I want the names without duplicates as well.
If I use array_agg(DISTINCT p.name) AS price_names, it returns distinct values based on the name, not the id.
I want to do something similar to array_agg(DISTINCT ON (p.id) p.name) AS price_names, but that is invalid.
How can I use DISTINCT ON inside ARRAY_AGG()?
Aggregate first, then join:
SELECT p.price_ids,
p.price_names,
t3.*
FROM items
LEFT JOIN (
SELECT pr.item_id,
array_agg(pr.id) AS price_ids,
array_agg(pr.name) AS price_names
FROM prices pr
GROUP BY pr.item_id
) p on p.item_id = items.id
LEFT JOIN third_table t3 ON t3.item_id = items.id
WHERE items.id = 1;
Using a lateral join might be faster if you only pick a single item:
SELECT p.price_ids,
p.price_names,
t3.*
FROM items
LEFT JOIN LATERAL (
SELECT array_agg(pr.id) AS price_ids,
array_agg(pr.name) AS price_names
FROM prices pr
WHERE pr.item_id = items.id
) p on true
LEFT JOIN third_table t3 ON t3.item_id = items.id
WHERE items.id = 1;
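To see why aggregating before the join avoids the duplicates, here is a small self-contained sketch. The sample rows are invented, and the ORDER BY inside array_agg() is an addition of mine to keep the two arrays aligned position by position.

```sql
-- Hypothetical sample data: one item, two prices with the same name,
-- and two rows in third_table that would otherwise duplicate the prices.
WITH items(id) AS (
    VALUES (1)
), prices(id, item_id, name) AS (
    VALUES (10, 1, 'retail'),
           (11, 1, 'retail')
), third_table(id, item_id) AS (
    VALUES (100, 1),
           (101, 1)
)
SELECT items.id,
       p.price_ids,
       p.price_names,
       t3.id AS t3_id
FROM items
LEFT JOIN (
    -- Prices are collapsed to one row per item before third_table is joined.
    SELECT pr.item_id,
           array_agg(pr.id   ORDER BY pr.id) AS price_ids,
           array_agg(pr.name ORDER BY pr.id) AS price_names
    FROM prices AS pr
    GROUP BY pr.item_id
) AS p ON p.item_id = items.id
LEFT JOIN third_table AS t3 ON t3.item_id = items.id
WHERE items.id = 1;
```

This returns one row per third_table row, each carrying the arrays {10,11} and {retail,retail}. Both names survive because nothing is deduplicated by name; each price simply appears once.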

Strange Behaviour on Postgresql query

We created a view in Postgres and I am getting strange results.
View name: event_puchase_product_overview
When I select records with *, I get the correct result, but when I select specific fields, I get wrong values.
I hope the attached screenshots explain the problem.
select *
from event_puchase_product_overview
where id = 15065;
select id, departure_id
from event_puchase_product_overview
where id = 15065;
VIEW definition:
CREATE OR REPLACE VIEW public.event_puchase_product_overview AS
SELECT row_number() OVER () AS id,
e.id AS departure_id,
e.type AS event_type,
e.name,
p.id AS product_id,
pc.name AS product_type,
product_date.attribute AS option,
p.upcomming_date AS supply_date,
pr.date_end AS bid_deadline,
CASE
WHEN (pt.categ_id IN ( SELECT unnest(tt.category_ids) AS unnest
FROM ( SELECT string_to_array(btrim(ir_config_parameter.value, '[]'::text), ', '::text)::integer[] AS category_ids
FROM ir_config_parameter
WHERE ir_config_parameter.key::text = 'trip_product_flight.product_category_hotel'::text) tt)) THEN e.maximum_rooms
WHEN (pt.categ_id IN ( SELECT unnest(tt.category_ids) AS unnest
FROM ( SELECT string_to_array(btrim(ir_config_parameter.value, '[]'::text), ', '::text)::integer[] AS category_ids
FROM ir_config_parameter
WHERE ir_config_parameter.key::text = 'trip_product_flight.product_category_flight'::text) tt)) THEN e.maximum_seats
WHEN (pt.categ_id IN ( SELECT unnest(tt.category_ids) AS unnest
FROM ( SELECT string_to_array(btrim(ir_config_parameter.value, '[]'::text), ', '::text)::integer[] AS category_ids
FROM ir_config_parameter
WHERE ir_config_parameter.key::text = 'trip_product_flight.product_category_bike'::text) tt)) THEN e.maximum_bikes
ELSE e.maximum_seats
END AS departure_qty,
CASE
WHEN now()::date > pr.date_end AND po.state::text = 'draft'::text THEN true
ELSE false
END AS is_deadline,
pl.product_qty::integer AS purchased_qty,
pl.comments,
pl.price_unit AS unit_price,
rp.id AS supplier,
po.id AS po_ref,
po.state AS po_state,
po.date_order AS po_date,
po.user_id AS operator,
pl.po_state_line AS line_status
FROM event_event e
LEFT JOIN product_product p ON p.related_departure = e.id
LEFT JOIN product_template pt ON pt.id = p.product_tmpl_id
LEFT JOIN product_category pc ON pc.id = pt.categ_id
LEFT JOIN purchase_order_line pl ON pl.product_id = p.id
LEFT JOIN purchase_order po ON po.id = pl.order_id
LEFT JOIN purchase_order_purchase_requisition_rel prr ON prr.purchase_order_id = po.id
LEFT JOIN purchase_requisition pr ON pr.id = prr.purchase_requisition_id
LEFT JOIN res_partner rp ON rp.id = po.partner_id
LEFT JOIN ( SELECT p_1.id AS product_id,
pav.name AS attribute
FROM product_product p_1
LEFT JOIN product_attribute_value_product_product_rel pa ON pa.prod_id = p_1.id
LEFT JOIN product_attribute_value pav ON pav.id = pa.att_id
LEFT JOIN product_attribute pat ON pat.id = pav.attribute_id
WHERE pat.name::text <> ALL (ARRAY['Date'::character varying, 'Departure'::character varying]::text[])) product_date ON product_date.product_id = p.id
WHERE (p.id IN ( SELECT DISTINCT mrp_bom_line.product_id
FROM mrp_bom_line)) AND p.active
ORDER BY e.id, pt.categ_id, p.id;
If I add a new event_event or a new product_product row, row_number is recomputed for the view, so the ID column of the view is not stable.
At the very least, you can't use row_number as the ID of the view.
If you insist on using row_number, you can order by a creation date column; that way all new records appear as the last rows of the view and the correspondence between ID (row_number) and the other columns does not change.
Hope that helps!
Very likely the execution plan of your query depends on the columns you select. Compare the execution plans!
Your id is generated using the row_number window function. Now window functions are executed before the ORDER BY clause, so the order will depend on the execution plan and hence on the columns you select.
Using row_number without an explicit ordering doesn't make any sense.
To fix that, don't use
row_number() OVER ()
but
row_number() OVER (ORDER BY e.id, pt.categ_id, p.id)
so that you have a reliable ordering.
In addition, you should omit the ORDER BY clause at the end.
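As a small illustration of the difference, here is a self-contained sketch with invented rows (the column names are placeholders, not the view's real columns):

```sql
-- Hypothetical sample rows, deliberately listed out of order.
WITH t(event_id, product_id) AS (
    VALUES (2, 20),
           (1, 11),
           (1, 10)
)
SELECT row_number() OVER (ORDER BY event_id, product_id) AS id,
       event_id,
       product_id
FROM t;
```

Here the numbering follows the window's ORDER BY, so (1, 10) gets id 1, (1, 11) gets id 2 and (2, 20) gets id 3, whatever plan is chosen. Note that this is only fully deterministic if the ordering columns identify a row uniquely; rows that tie on all of them can still swap numbers.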

IN list in subquery and mainquery leads to duplication

I started out with a query that gives me all shifts for a person.
SELECT
p.id as person_id, json_agg(sh) as shifts
FROM
people as p,
(SELECT s.id, s.date_of_shift, s.shift_type_id
FROM people as p
LEFT JOIN shifts as s
ON s.person_id = p.id AND s.date_of_shift BETWEEN '2016-02-11' AND '2016-02-17'
WHERE p.id = 2001
ORDER BY p.id
) as sh
WHERE
p.id = 2001
GROUP BY
p.id
;
The result would be something like this:
person_id | shifts
-----------+------------------------------------------------------------------
2001 | [{"id":580069,"date_of_shift":"2016-02-11","shift_type_id":44},+
{"id":580070,"date_of_shift":"2016-02-12","shift_type_id":42}, +
{"id":580071,"date_of_shift":"2016-02-15","shift_type_id":49}, +
{"id":580072,"date_of_shift":"2016-02-16","shift_type_id":41}, +
{"id":580073,"date_of_shift":"2016-02-17","shift_type_id":48}]
So I get one row: the first column is the person id and the second is the JSON array of shifts.
The next step is to give the query a list of person_ids and get something like this:
person_id | shifts
----------|--------------
2001 | [{..},]
2002 | [{..},]
2003 | [{..},]
So I ran this:
SELECT
p.id as person_id, json_agg(sh) as shifts
FROM
people as p,
(SELECT s.id, s.date_of_shift, s.shift_type_id
FROM people as p
LEFT JOIN shifts as s
ON s.person_id = p.id AND s.date_of_shift BETWEEN '2016-02-11' AND '2016-02-17'
WHERE p.id IN (2201,2202,2203)
ORDER BY p.id
) as sh
WHERE
p.id IN (2201,2202,2203)
GROUP BY
p.id
;
The problem is that I now get all shifts for every person inside the subquery.
So in this case I get 15 shifts for every person_id instead of 5.
I understand why I get this result, but I'm stuck on how to get the result I'm looking for.
You can use a LATERAL join:
SELECT p.id as person_id, json_agg(sh) as shifts
FROM people as p,
LATERAL (
SELECT s.person_id, s.date_of_shift, s.shift_type_id
FROM shifts as s
WHERE s.person_id = p.id AND
s.date_of_shift BETWEEN '2016-02-11' AND '2016-02-17') as sh
WHERE p.id IN (2201,2202,2203)
GROUP BY p.id;
This way the subquery is simplified: you don't need to perform a LEFT JOIN inside it, because it can reference the people table, which lies outside the scope of the subquery.
If you always want all rows of table people, irrespective of whether there are matching rows in the shifts table, you can use a LEFT JOIN LATERAL:
SELECT p.id as person_id, json_agg(sh) as shifts
FROM people as p
LEFT JOIN LATERAL (
SELECT person_id, date_of_shift, shift_type_id
FROM shifts) as sh
ON sh.person_id = p.id AND
sh.date_of_shift BETWEEN '2016-02-11' AND '2016-02-17'
WHERE p.id IN (2201,2202,2203, 2204)
GROUP BY p.id;
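One thing to watch with the LEFT JOIN variant: for a person without shifts, json_agg() aggregates the single all-NULL row and yields [null]. If an empty array is preferred, a FILTER clause plus coalesce() may help. A minimal sketch, assuming invented sample rows and that shifts.id is never NULL for a real shift:

```sql
-- Hypothetical sample data: person 2204 has no shifts at all,
-- and person 2201 has one shift outside the requested week.
WITH people(id) AS (
    VALUES (2201), (2204)
), shifts(id, person_id, date_of_shift, shift_type_id) AS (
    VALUES (1, 2201, date '2016-02-11', 44),
           (2, 2201, date '2016-02-12', 42),
           (3, 2201, date '2016-03-01', 41)
)
SELECT p.id AS person_id,
       -- Skip the NULL-extended row produced by the LEFT JOIN,
       -- then turn "no rows aggregated" into an empty JSON array.
       coalesce(
           json_agg(sh ORDER BY sh.date_of_shift) FILTER (WHERE sh.id IS NOT NULL),
           '[]'::json
       ) AS shifts
FROM people AS p
LEFT JOIN LATERAL (
    SELECT s.id, s.date_of_shift, s.shift_type_id
    FROM shifts AS s
    WHERE s.person_id = p.id
      AND s.date_of_shift BETWEEN '2016-02-11' AND '2016-02-17'
) AS sh ON true
GROUP BY p.id
ORDER BY p.id;
```

With this data, person 2201 gets an array with the two shifts from the requested week and person 2204 gets [].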

How should I add fields without adding them to a GROUP BY?

I have a SQL statement that works as-is: I get an area name and the minimum value within that area. Next, I need to add a key so I can actually do something with the results. The key is necessary since names and values are unlikely to be unique.
select g.name, min(g.rndval) from
(
select p.rndval, a.name, p.id
from points p, areas a
where ST_WITHIN(p.geom, a.geom)
) AS g
group by g.name
When I add the Id field to the group by, the query returns multiple rows for each area, as expected since it's grouping by the name and id combination, and the results are no longer what I need. How should I add in the id field (p.id in the inner select)?
You can try:
WITH cte AS
( select p.rndval, a.name, p.id
from points p, areas a
where ST_WITHIN(p.geom, a.geom)
), cte_aggregated AS
(
SELECT name, min(rndval) AS min_value
FROM cte
GROUP BY name
)
SELECT DISTINCT c.rndval, c.name, c.id
FROM cte c
JOIN cte_aggregated ca
ON c.rndval = ca.min_value
AND c.name = ca.name;
You can solve this quite elegantly with a window function:
select name, rndval as min, id
from (
select a.name, p.rndval, p.id, rank() over (partition by a.name order by p.rndval) as rnk
from points p
join areas a on ST_Within(p.geom, a.geom)) as g
where rnk = 1;
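Note that rank() returns every row that ties for the minimum. If exactly one row per area is wanted, PostgreSQL's DISTINCT ON is another option. A minimal sketch, assuming the spatial join has already produced (name, rndval, id) rows; the VALUES list below is invented and stands in for that join so the example runs without PostGIS:

```sql
-- Hypothetical rows standing in for the points/areas join result.
WITH g(name, rndval, id) AS (
    VALUES ('north', 5, 1),
           ('north', 3, 2),
           ('south', 7, 3),
           ('south', 7, 4)
)
-- DISTINCT ON keeps the first row of each name according to the ORDER BY,
-- i.e. the lowest rndval, with the lowest id as tie-breaker.
SELECT DISTINCT ON (name)
       name,
       rndval AS min_rndval,
       id
FROM g
ORDER BY name, rndval, id;
```

This returns (north, 3, 2) and (south, 7, 3), whereas the rank() version would return both south rows because they tie on rndval.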