Query tuning postgres - postgresql

I'm trying to reduce the response time of the PostgreSQL query below from the current 5 seconds to 1 second. The explain plan is attached as well. Please help.
(
SELECT
1 AS RowNumber
,'Total Countries' AS RowLabel
,COUNT(DISTINCT ITS.abc CountryTrading) AS Aggregation
FROM ObjectViews.abc InstnTradeSummary AS ITS
WHERE ITS.KeyInstn = 7402194
AND ITS.TradeDataMonthYearPublish >= date_trunc('month', current_date) + interval '-5 years'
AND ITS.TradeDataMonthYearPublish <= date_trunc('month', current_date)
AND ITS.abc CountryTrading IS NOT NULL
GROUP BY ITS.KeyInstn
UNION
SELECT
2 AS RowNumber
,'Total Shipments' AS RowLabel
,SUM(ITS.ShipmentCount) AS TotalShipments
FROM ObjectViews.abc InstnTradeSummary AS ITS
WHERE ITS.KeyInstn = 7402194
AND ITS.TradeDataMonthYearPublish >= date_trunc('month', current_date) + interval '-5 years'
AND ITS.TradeDataMonthYearPublish <= date_trunc('month', current_date)
GROUP BY ITS.KeyInstn
UNION
SELECT
3 AS RowNumber
,'Total Weight in kg' AS RowLabel
,SUM(COALESCE(ITS.ShipmentWeightAR, ITS.ShipmentWeightEst)) AS TotalShipmentWeight
FROM ObjectViews.abc InstnTradeSummary AS ITS
WHERE ITS.KeyInstn = 7402194
AND ITS.TradeDataMonthYearPublish >= date_trunc('month', current_date) + interval '-5 years'
AND ITS.TradeDataMonthYearPublish <= date_trunc('month', current_date)
GROUP BY ITS.KeyInstn
UNION
SELECT
4 AS RowNumber
,'Total Volume in TEU' AS RowLabel
,SUM(COALESCE(ITS.ShipmentVolumeAR, ITS.ShipmentVolumeEst)) AS TotalShipmentVolume
FROM ObjectViews.abc InstnTradeSummary AS ITS
WHERE ITS.KeyInstn = 7402194
AND ITS.TradeDataMonthYearPublish >= date_trunc('month', current_date) + interval '-5 years'
AND ITS.TradeDataMonthYearPublish <= date_trunc('month', current_date)
GROUP BY ITS.KeyInstn
) ORDER BY RowNumber
Below is the explain plan for the query:
https://explain.depesz.com/s/xgC2

Read the table once, do your formatting after:
SELECT
v.row_number,
v.row_label,
CASE v.row_number
WHEN 1 THEN s.total_countries
WHEN 2 THEN s.total_shipments
WHEN 3 THEN s.total_shipment_weight
ELSE s.total_shipment_volume
END AS total
FROM (
VALUES
(1, 'Total Countries'),
(2, 'Total Shipments'),
(3, 'Total Weight in kg'),
(4, 'Total Volume in TEU')
) AS v(row_number, row_label)
LEFT JOIN (
SELECT
COUNT(DISTINCT ITS.abc CountryTrading) FILTER (WHERE ITS.abc CountryTrading IS NOT NULL) AS total_countries,
SUM(ITS.ShipmentCount) AS total_shipments,
SUM(COALESCE(ITS.ShipmentWeightAR, ITS.ShipmentWeightEst)) AS total_shipment_weight,
SUM(COALESCE(ITS.ShipmentVolumeAR, ITS.ShipmentVolumeEst)) AS total_shipment_volume
FROM ObjectViews.abc InstnTradeSummary AS ITS
WHERE ITS.KeyInstn = 7402194
AND ITS.TradeDataMonthYearPublish >= date_trunc('month', current_date) + interval '-5 years'
AND ITS.TradeDataMonthYearPublish <= date_trunc('month', current_date)
GROUP BY ITS.KeyInstn
) AS s ON TRUE
ORDER BY v.row_number

Related

Query max and min in a fixed range of time of each day interval POSTGRES

I have the query below, which computes the gap between the max and min time_stamp of each day in a time range (current_date - 2 to current_date - 1). Now I need to query the day shift and the extra shift separately (the day shift runs from 5am to 3pm; the extra shift is the remainder).
select sum(gap) from (
select to_char(time_stamp, 'yyyy/mm/dd') as day,
EXTRACT(EPOCH FROM (max(time_stamp) - min(time_stamp))) /3600 as gap
from group_table_debarker
where time_stamp >= (current_date - 2)
and time_stamp <= (current_date - 1)
and to_char(time_stamp, 'hh:mi') > '03:00' and to_char(time_stamp, 'hh:mi') < '15:00'
group by to_char(time_stamp, 'yyyy/mm/dd')
) as xxx
I've tried this, but the result wasn't what I expected.
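One possible cause, offered as a guess since the expected output is not shown: in to_char the hh pattern is the 12-hour clock, so 4pm formats as '04:00' and passes the filter, while hh24 is the 24-hour pattern. The lower bound '03:00' also does not match the stated 5am start. A minimal sketch that compares the time of day directly and labels each row with its shift, assuming time_stamp is a timestamp column and the day shift is 05:00 to 15:00:

```sql
-- Sketch only: the shift boundaries and the half-open date range are assumptions.
SELECT
    time_stamp::date AS day,
    CASE
        WHEN time_stamp::time >= TIME '05:00'
         AND time_stamp::time <  TIME '15:00' THEN 'day shift'
        ELSE 'extra shift'
    END AS shift,
    -- hours between the first and last reading of that day and shift
    EXTRACT(EPOCH FROM (max(time_stamp) - min(time_stamp))) / 3600 AS gap
FROM group_table_debarker
WHERE time_stamp >= current_date - 2
  AND time_stamp <  current_date - 1   -- half-open, so the next midnight is excluded
GROUP BY 1, 2
ORDER BY 1, 2;
```

Note that the extra shift covers two stretches of the day (before 05:00 and from 15:00 onward), so its max minus min spans the day shift as well. If that is not what is wanted, the two stretches would need separate labels.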

Postgresql, Get the top 5 products that have increased in value from yesterday to today, returning the delta

I have a pricing table that contains the pricing data for products. There are around 600 unique product_id, each currently having 4 days worth of pricing data, which will eventually go up to 30 days. The table below is a small subset of the data to represent that table structure:
date        product_id  price_trend
2022-08-21  1           0.08
2022-08-22  1           0.18
2022-08-23  1           0.30
2022-08-21  2           0.15
2022-08-22  2           0.20
2022-08-23  2           0.22
So in my script, for each product_id I am trying to get yesterday's price_trend and today's price_trend and then calculate the price_delta between the two. I then order by price_delta and limit the results to 5.
I am having some issues because in some cases yesterday's price_trend is 0 and today's price_trend is, for example, 0.50. This does not mean that the price trend has increased; most likely price_trend was not gathered yesterday for whatever reason.
Now I would like to remove any records where price_trend for today or yesterday equals 0. However, when I add AND pricing.trend_price > 0, the value is returned as null instead of the row being dropped.
Script:
SELECT
magic_sets_cards.name,
(SELECT pricing.trend_price
FROM pricing
WHERE pricing.product_id = magic_sets_cards_identifiers.mcm_id
AND pricing.date = (SELECT MAX(date) - INTERVAL '2 DAY' FROM pricing)
AND pricing.trend_price > 0) AS price_yesterday,
(SELECT pricing.trend_price
FROM pricing
WHERE pricing.product_id = magic_sets_cards_identifiers.mcm_id
AND pricing.date = (SELECT MAX(date) FROM pricing)
AND pricing.trend_price > 0) AS price_today,
((SELECT pricing.trend_price
FROM pricing
WHERE pricing.product_id = magic_sets_cards_identifiers.mcm_id
AND pricing.date = (SELECT MAX(date) FROM pricing)) -
(SELECT pricing.trend_price
FROM pricing
WHERE pricing.product_id = magic_sets_cards_identifiers.mcm_id
AND pricing.date = (SELECT MAX(date) - INTERVAL '2 DAY' FROM pricing))) AS price_delta
FROM magic_sets
JOIN magic_sets_cards ON magic_sets_cards.set_id = magic_sets.id
JOIN magic_sets_cards_identifiers ON magic_sets_cards_identifiers.card_id = magic_sets_cards.id
JOIN pricing ON pricing.product_id = magic_sets_cards_identifiers.mcm_id
WHERE magic_sets.code = '2X2'
AND pricing.date = (SELECT MAX(date) FROM pricing)
ORDER BY price_delta DESC
LIMIT 5
Results:
name              price_yesterday  price_today  price_delta
"Fiery Justice"   null             0.50         0.50
"Hostage Taker"   3.50             4.00         0.50
"Damnation"       17.02            17.33        0.31
"Bring to Light"  0.42             0.72         0.30
"City of Brass"   17.41            17.68        0.27
I would like to get it so that the "Fiery Justice" in this example is just ignored.
With rank() you can get the output. Look into...
Query without null rows:
WITH cte AS (
    SELECT
        product_id,
        SUM(CASE WHEN rank = 1 THEN price_trend ELSE NULL END) AS today,
        SUM(CASE WHEN rank = 2 THEN price_trend ELSE NULL END) AS yesterday,
        SUM(CASE WHEN rank = 1 THEN price_trend ELSE 0 END) -
        SUM(CASE WHEN rank = 2 THEN price_trend ELSE 0 END) AS diff
    FROM (
        SELECT
            product_id,
            price_trend,
            date,
            rank() OVER (PARTITION BY product_id ORDER BY date DESC) AS rank
        FROM tableName
        WHERE price_trend > 0
          AND date BETWEEN current_date - 5 AND current_date - 4
    ) p
    WHERE rank IN (1, 2)
    GROUP BY product_id
)
-- keep only products that have a positive price on both days
SELECT *
FROM cte
WHERE today IS NOT NULL
  AND yesterday IS NOT NULL
Query with null values :
Select
product_id,
SUM(CASE WHEN rank = 1 THEN price_trend ELSE 0 END) today,
SUM(CASE WHEN rank = 2 THEN price_trend ELSE 0 END) yesterday,
SUM(CASE WHEN rank = 1 THEN price_trend ELSE 0 END) -
SUM(CASE WHEN rank = 2 THEN price_trend ELSE 0 END) as diff
FROM (
SELECT
product_id,
price_trend,
date,
rank() OVER (PARTITION BY product_id ORDER BY date DESC) as rank
FROM tableName where date between current_date-5 and current_date-4) p
WHERE rank in (1,2)
GROUP BY product_id
Change the condition :
where date between current_date-3 and current_date-2
OUTPUT :
product_id today yesterday diff
1 0.06 0.02 0.04
2 0.64 0.62 0.02
CREATE TABLE tableName
(
date date,
product_id int,
price_trend numeric(9,2)
);
INSERT INTO tableName (date, product_id, price_trend) VALUES
('2022-08-21', 1, 0.02),
('2022-08-22', 1, 0.06),
('2022-08-23', 1, 0.10),
('2022-08-24', 1, 0.13),
('2022-08-25', 1, 0.18),
('2022-08-26', 1, 0.30),
('2022-08-21', 2, 0.62),
('2022-08-22', 2, 0.64),
('2022-08-23', 2, 0.69),
('2022-08-24', 2, 0.78),
('2022-08-25', 2, 0.88),
('2022-08-26', 2, 0.90);
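As an alternative to rank() with conditional sums, lag() can pair each row with the previous day's value directly. A minimal sketch against the test table tableName above, assuming "today" means the latest date present in the table and "yesterday" the calendar day before it:

```sql
-- Sketch only: "today" is taken to be max(date) in the table, which is an assumption.
SELECT product_id, today, yesterday, today - yesterday AS diff
FROM (
    SELECT
        product_id,
        date,
        price_trend AS today,
        -- previous remaining row for the same product, if any
        lag(price_trend) OVER (PARTITION BY product_id ORDER BY date) AS yesterday
    FROM tableName
    WHERE price_trend > 0                                   -- drop days where no price was gathered
      AND date >= (SELECT max(date) - 1 FROM tableName)     -- only the last two days
) p
WHERE date = (SELECT max(date) FROM tableName)
  AND yesterday IS NOT NULL                                 -- skip products with no usable price yesterday
ORDER BY diff DESC
LIMIT 5;
```

Because rows with a zero price are filtered out before lag() runs, a product such as "Fiery Justice" in the question has no previous row and is dropped by the yesterday IS NOT NULL condition.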

Postgresql compare two select result on the same table

I compare the results of two selects and get 1 or 0 as the final result.
The query below is syntactically valid, but it times out.
SELECT (CASE WHEN (
select count(*) from order where ordered_date > (NOW() - INTERVAL '120 minutes')
and order_ordered = current_date) >
(select count(*)/3
from order
where ordered_date > (NOW() - INTERVAL '2 days' - INTERVAL '120 minutes')
and ordered_date < (NOW() - INTERVAL '2 days'))
THEN 1 ELSE 0 end);
Therefore, I tried to optimize the query by using an alias for each select, as below:
select (case when a > b then 1 else 0 end) from (select count(*) from order where ordered_date > (NOW() - INTERVAL '120 minutes')
and order_ordered = current_date) as a,
from (select count(*)/3
from order
where ordered_date > (NOW() - INTERVAL '2 days' - INTERVAL '120 minutes')
and ordered_date < (NOW() - INTERVAL '2 days'))as b;
I get a syntax error near "from"; as far as I remember, this kind of syntax works in MySQL.
Could you please advise whether it is possible to use "from" twice with aliases in PostgreSQL? If you know another approach, I would be glad to hear it.
Sample:
First query gives: select count(*) from order where ordered_date > (NOW() - INTERVAL '120 minutes') and order_ordered = current_date -> 60
Second query gives: select count(*)/3 from order where ordered_date > (NOW() - INTERVAL '2 days' - INTERVAL '120 minutes') and ordered_date < (NOW() - INTERVAL '2 days') -> 20
Final condition: case when 60 > 20 then 1 else 0 end
Expected result: 1
Thanks
I suggest putting each SELECT in a WITH query (see the PostgreSQL documentation on WITH queries):
-- "order" is a reserved word, so it must be quoted if that is the real table name
WITH orders_current_date AS (
    SELECT count(*) AS cnt
    FROM "order"
    WHERE ordered_date > (NOW() - INTERVAL '120 minutes')
      AND order_ordered = current_date
), orders_interval AS (
    SELECT count(*) / 3 AS cnt
    FROM "order"
    WHERE ordered_date > (NOW() - INTERVAL '2 days' - INTERVAL '120 minutes')
      AND ordered_date < (NOW() - INTERVAL '2 days')
)
SELECT
    CASE
        -- each scalar subquery needs its own parentheses
        WHEN (SELECT cnt FROM orders_current_date) > (SELECT cnt FROM orders_interval)
        THEN 1
        ELSE 0
    END;
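The derived-table form attempted in the question can also be made to work. A minimal sketch, assuming the table really is named order (a reserved word, hence the quotes): FROM appears only once, each subquery names its count column, and the two one-row results are combined with CROSS JOIN.

```sql
-- Sketch only: the table name "order" and the alias cnt are assumptions.
SELECT CASE WHEN a.cnt > b.cnt THEN 1 ELSE 0 END AS result
FROM (
    SELECT count(*) AS cnt
    FROM "order"
    WHERE ordered_date > (NOW() - INTERVAL '120 minutes')
      AND order_ordered = current_date
) AS a
CROSS JOIN (
    SELECT count(*) / 3 AS cnt
    FROM "order"
    WHERE ordered_date > (NOW() - INTERVAL '2 days' - INTERVAL '120 minutes')
      AND ordered_date < (NOW() - INTERVAL '2 days')
) AS b;
```

Neither rewrite is likely to cure the timeout by itself, since both still run the same two counts; an index on ordered_date is probably what matters, though that depends on the table.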

Postgres distinct union only for specific columns

I have two sets of data, one of which is dynamically generated.
If I leave off the state column it works perfectly, since that column doesn't really exist in the second dataset. My question is how to ignore a column for the UNION so that it still combines the two datasets (as written, it behaves the same as UNION ALL). In other words, I prefer the first table and want rows from the second dataset ignored if they already exist in the first one.
SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'
UNION
SELECT id event_id,
GENERATE_SERIES(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time, current_date + interval '3 weeks', '1 week'::INTERVAL) AS start_at,
'draft' AS state
FROM events
Update, also tried:
WITH future_logs AS (
SELECT id event_id,
GENERATE_SERIES(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time, current_date + interval '3 weeks', '1 week'::INTERVAL) AS start_at,
'draft' AS state
FROM events)
SELECT future_logs.event_id, future_logs.start_at, future_logs.state
FROM future_logs
LEFT JOIN event_logs ON future_logs.event_id = event_logs.event_id AND future_logs.start_at = event_logs.start_at
WHERE event_logs.start_at BETWEEN current_date AND current_date + interval '3 weeks'
But I got too few results: 77 vs. ~1000 expected.
Just add NOT EXISTS() to the second leg, and you can use UNION ALL to avoid sort/merging.
SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'
UNION ALL
SELECT id AS event_id
, generate_series(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time
, current_date + interval '3 weeks'
, '1 week'::INTERVAL) AS start_at
, 'draft' AS state
FROM events ev
WHERE NOT EXISTS ( SELECT *
FROM event_logs nx
WHERE nx.event_id = ev.id
AND nx.start_at BETWEEN current_date AND current_date + interval '3 weeks' )
;
select DISTINCT ON (date_day) date_day, state from(
SELECT day::date as date_day, null as state
FROM generate_series(now()- interval '2 week'
, now()
, interval '1 day') day
UNION ALL
select distinct
date_trunc('day',des.updated_at) as date_day,
max(des.state) over (partition by date_trunc('day',des.updated_at)) as state
from device_event as des where des.id=49 and des.updated_at >= now() - interval '2 week'
) dba order by 1
I would add one other column taborder into your UNION query to ensure simple ordering of the rows and use window function row_number() over(...) in following way:
SELECT
event_id,
start_at,
state
FROM (
SELECT
event_id,
start_at,
state,
row_number() OVER (PARTITION BY event_id, start_at ORDER BY taborder) AS rownum
FROM (
SELECT
event_id,
start_at,
state,
1 AS taborder
FROM original_table
UNION
SELECT
event_id,
start_at,
state,
2 AS taborder
FROM draft_table
) src0
) src1
WHERE rownum = 1
ORDER BY 1, 2, 3
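The same "prefer the first table" rule can also be written with PostgreSQL's DISTINCT ON, which keeps the first row of each group according to the ORDER BY. A minimal sketch, reusing the placeholder names original_table and draft_table from the query above:

```sql
-- Sketch only: original_table and draft_table stand in for the two datasets.
SELECT DISTINCT ON (event_id, start_at)
    event_id,
    start_at,
    state
FROM (
    SELECT event_id, start_at, state, 1 AS taborder FROM original_table
    UNION ALL
    SELECT event_id, start_at, state, 2 AS taborder FROM draft_table
) src
-- the leading ORDER BY columns must match the DISTINCT ON list;
-- taborder then decides which row wins within each (event_id, start_at) pair
ORDER BY event_id, start_at, taborder;
```

UNION ALL is enough here because DISTINCT ON already removes the duplicates, so the extra sort and de-duplication step of UNION is avoided.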

Unify select sql. Postgres

Can I unify the two selects below into a single query, where the first column returns the result of the first select and the second column the result of the second?
select count(*) from rrhh.empleado where fecha_contratado > current_date - interval '100 days'; -- select 1
select count(*) from rrhh.empleado where fecha_fin_contrato > current_date - interval '100 days'; -- select 2
Thank you
try:
with a as (
select
case when fecha_contratado > current_date - interval '100 days' then 1
else 0 end q1
, case when fecha_fin_contrato > current_date - interval '100 days' then 1
else 0 end q2
from rrhh.empleado
)
select sum(q1), sum(q2)
from a
;
This is a typical case for conditional aggregation:
select count(*) filter (where fecha_contratado > current_date - interval '100 days'),
count(*) filter (where fecha_fin_contrato > current_date - interval '100 days')
from rrhh.empleado
For versions earlier than 9.4, you can use a CASE expression instead (relying on the fact that most aggregates ignore NULL values):
select count(case when fecha_contratado > current_date - interval '100 days' then 1 end),
count(case when fecha_fin_contrato > current_date - interval '100 days' then 1 end)
from rrhh.empleado
Note: these queries will scan the whole table, while your original queries could make use of indexes on fecha_contratado and fecha_fin_contrato. If performance matters to you, you could append a filter to these queries too (greatest rather than least, because a row must be kept when either date falls inside the window):
where greatest(fecha_contratado, fecha_fin_contrato) > current_date - interval '100 days'
and you could index the expression: greatest(fecha_contratado, fecha_fin_contrato).
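A sketch of how the filter and the expression index could fit together with the conditional aggregation. It uses greatest(...) so that a row is kept when either date falls inside the window. The index name and the column aliases are hypothetical, and whether the planner actually uses the index depends on the data:

```sql
-- Sketch only: index name and output aliases are made up for illustration.
CREATE INDEX empleado_fecha_reciente_idx
    ON rrhh.empleado ((greatest(fecha_contratado, fecha_fin_contrato)));

SELECT
    count(*) FILTER (WHERE fecha_contratado   > current_date - interval '100 days') AS contratados,
    count(*) FILTER (WHERE fecha_fin_contrato > current_date - interval '100 days') AS finalizados
FROM rrhh.empleado
-- pre-filter to rows where at least one of the two dates is recent
WHERE greatest(fecha_contratado, fecha_fin_contrato) > current_date - interval '100 days';
```

In PostgreSQL, GREATEST ignores NULL arguments, so a row with no fecha_fin_contrato is still judged on fecha_contratado alone.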