I'm not sure where to begin with this problem. I need to update a record only on the 3rd Monday of each month. In Postgres, can I query for every 2nd or 3rd Monday or, to be a little more abstract, every nth day of the nth week?
I'm looking for an elegant answer in PostgreSQL. Right now I have something crude like this:
select d from generate_series(date_trunc('week',timestamp '2015-02-01' + interval '13 days'), timestamp '2015-02-01' + interval '1 month -1 day', interval '14 days') d;
I like to use a calendar table for queries like this one.
To select the third Monday of every month in 2015, I can query a calendar table like this.
select cal_date
from calendar
where year_of_date = 2015
and day_of_week = 'Mon'
and day_of_week_ordinal = 3
order by cal_date;
cal_date
--
2015-01-19
2015-02-16
2015-03-16
2015-04-20
2015-05-18
2015-06-15
2015-07-20
2015-08-17
2015-09-21
2015-10-19
2015-11-16
2015-12-21
Code to create a calendar table. (This is how pgAdmin III presents it through its CREATE SCRIPT menu selection.)
CREATE TABLE calendar
(
cal_date date NOT NULL,
year_of_date integer NOT NULL,
month_of_year integer NOT NULL,
day_of_month integer NOT NULL,
day_of_week character(3) NOT NULL,
day_of_week_ordinal integer NOT NULL,
iso_year integer NOT NULL,
iso_week integer NOT NULL,
cal_quarter integer,
CONSTRAINT calendar_pkey PRIMARY KEY (cal_date),
CONSTRAINT cal_quarter_check CHECK (cal_quarter =
CASE
WHEN date_part('month'::text, cal_date) >= 1::double precision AND date_part('month'::text, cal_date) <= 3::double precision THEN 1
WHEN date_part('month'::text, cal_date) >= 4::double precision AND date_part('month'::text, cal_date) <= 6::double precision THEN 2
WHEN date_part('month'::text, cal_date) >= 7::double precision AND date_part('month'::text, cal_date) <= 9::double precision THEN 3
WHEN date_part('month'::text, cal_date) >= 10::double precision AND date_part('month'::text, cal_date) <= 12::double precision THEN 4
ELSE NULL::integer
END),
CONSTRAINT cal_quarter_range CHECK (cal_quarter >= 1 AND cal_quarter <= 4),
CONSTRAINT calendar_check CHECK (year_of_date::double precision = date_part('year'::text, cal_date)),
CONSTRAINT calendar_check1 CHECK (month_of_year::double precision = date_part('month'::text, cal_date)),
CONSTRAINT calendar_check2 CHECK (day_of_month::double precision = date_part('day'::text, cal_date)),
CONSTRAINT calendar_check3 CHECK (day_of_week::text =
CASE
WHEN date_part('dow'::text, cal_date) = 0::double precision THEN 'Sun'::text
WHEN date_part('dow'::text, cal_date) = 1::double precision THEN 'Mon'::text
WHEN date_part('dow'::text, cal_date) = 2::double precision THEN 'Tue'::text
WHEN date_part('dow'::text, cal_date) = 3::double precision THEN 'Wed'::text
WHEN date_part('dow'::text, cal_date) = 4::double precision THEN 'Thu'::text
WHEN date_part('dow'::text, cal_date) = 5::double precision THEN 'Fri'::text
WHEN date_part('dow'::text, cal_date) = 6::double precision THEN 'Sat'::text
ELSE NULL::text
END),
CONSTRAINT calendar_check4 CHECK (day_of_week_ordinal =
CASE
WHEN day_of_month >= 1 AND day_of_month <= 7 THEN 1
WHEN day_of_month >= 8 AND day_of_month <= 14 THEN 2
WHEN day_of_month >= 15 AND day_of_month <= 21 THEN 3
WHEN day_of_month >= 22 AND day_of_month <= 28 THEN 4
ELSE 5
END),
CONSTRAINT calendar_check5 CHECK (iso_year::double precision = date_part('isoyear'::text, cal_date)),
CONSTRAINT calendar_check6 CHECK (iso_week::double precision = date_part('week'::text, cal_date))
)
WITH (
OIDS=FALSE
);
You also need
GRANT and REVOKE statements (very few people should be allowed to change the content of this kind of table), and
suitable CREATE INDEX statements.
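One way to fill the table and lock it down, as a sketch only. It assumes the calendar table defined above, loads a single year (2015) with generate_series(), and uses PUBLIC as a placeholder for whatever roles exist on your system. The index name and its columns are guesses based on the third-Monday query, not something the answer prescribes.

```sql
-- Load one year of dates; widen the range as needed.
INSERT INTO calendar (cal_date, year_of_date, month_of_year, day_of_month,
                      day_of_week, day_of_week_ordinal,
                      iso_year, iso_week, cal_quarter)
SELECT d::date,
       extract(year FROM d)::int,
       extract(month FROM d)::int,
       extract(day FROM d)::int,
       to_char(d, 'Dy'),                        -- 'Mon', 'Tue', ... (English abbreviations)
       (extract(day FROM d)::int - 1) / 7 + 1,  -- days 1-7 => 1, 8-14 => 2, ...
       extract(isoyear FROM d)::int,
       extract(week FROM d)::int,
       extract(quarter FROM d)::int
FROM generate_series(timestamp '2015-01-01',
                     timestamp '2015-12-31',
                     interval '1 day') AS d;

-- Read-only for everyone but the owner (role choice is an assumption).
REVOKE ALL ON calendar FROM PUBLIC;
GRANT SELECT ON calendar TO PUBLIC;

-- Supports lookups such as "third Monday of each month in a given year".
CREATE INDEX calendar_dow_idx
    ON calendar (year_of_date, day_of_week, day_of_week_ordinal);
```

The CHECK constraints on the table act as a safety net here: if any derived column were computed incorrectly, the INSERT would fail rather than store inconsistent rows.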
Maybe try this:
SELECT *
, EXTRACT(DAY FROM gen)::int as dom -- DayOfMonth
, CEIL(EXTRACT(DAY FROM gen) / 7)::int as mow -- WeekOfMonth (ordinal of this weekday within the month)
from (
select generate_series(date_trunc('year', now()), date_trunc('year', now() + interval '1 year'), interval '1 day' )::date as gen
) as src
WHERE extract ('dow' from gen) = 1
AND CEIL(EXTRACT(DAY FROM gen) / 7)::int in (2,3)
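If only a single date is needed, the nth weekday can also be computed arithmetically instead of filtering a generated series. A minimal sketch, assuming a hypothetical helper named nth_weekday_of_month (not a built-in) that takes any date in the target month, an ISO weekday number (1 = Monday through 7 = Sunday), and the ordinal n:

```sql
CREATE OR REPLACE FUNCTION nth_weekday_of_month(p_any_day date, p_isodow int, p_n int)
RETURNS date
LANGUAGE sql
IMMUTABLE
AS $$
  SELECT CASE
           -- Return NULL when the result would fall outside the month,
           -- e.g. a 5th Monday that does not exist.
           WHEN p_n >= 1 AND candidate < first_day + interval '1 month'
             THEN candidate
         END
  FROM (
    SELECT first_day,
           first_day
             -- days from the 1st to the first matching weekday (0..6)
             + (p_isodow - extract(isodow FROM first_day)::int + 7) % 7
             -- then jump ahead whole weeks
             + 7 * (p_n - 1) AS candidate
    FROM (SELECT date_trunc('month', p_any_day::timestamp)::date AS first_day) AS f
  ) AS c;
$$;

-- Third Monday of February 2015
SELECT nth_weekday_of_month(date '2015-02-01', 1, 3);
```

For February 2015 this gives 2015-02-16, which agrees with the calendar-table output above.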
I need to get the difference in minutes excluding weekends (Saturday, Sunday), between 2 timestamps in postgres, but I'm not getting the expected result.
Examples:
Get the difference in minutes; however, weekends are included
SELECT EXTRACT(EPOCH FROM (NOW() - '2021-08-01 08:00:00') / 60)::BIGINT as diff_in_minutes;
$ diff_in_minutes = 17566
Get the difference in weekdays, excluding Saturday and Sunday
SELECT COUNT(*) as diff_in_days
FROM generate_series('2021-08-01 08:00:00', NOW(), interval '1d') d
WHERE extract(isodow FROM d) < 6;
$ diff_in_days = 10
Expected:
From '2021-08-12 08:00:00' to '2021-08-13 08:00:00' = 1440
From '2021-08-13 08:00:00' to '2021-08-16 08:00:00' = 1440
From '2021-08-13 08:00:00' to '2021-08-17 08:00:00' = 2880
and so on ...
The solution is:
SELECT GREATEST(COUNT(*) - 1, 0)
FROM generate_series(from_ts, to_ts, interval'1 minute') AS x
WHERE extract(isodow FROM x) <= 5
so
SELECT GREATEST(COUNT(*) - 1, 0)
FROM generate_series('2021-08-13 08:00:00'::timestamp, '2021-08-17 08:00:00', '1 minute') AS x
WHERE extract(isodow FROM x) <= 5
returns 2880
This is not an optimal solution, because it generates one row per minute, but I will leave finding the optimal solution as homework for you.
First, create an SQL function
CREATE OR REPLACE FUNCTION public.time_overlap (
b_1 timestamptz,
e_1 timestamptz,
b_2 timestamptz,
e_2 timestamptz
)
RETURNS interval AS
$body$
SELECT GREATEST(interval '0 second',e_1 - b_1 - GREATEST(interval '0 second',e_1 - e_2) - GREATEST(interval '0 second',b_2 - b_1));
$body$
LANGUAGE 'sql'
IMMUTABLE
RETURNS NULL ON NULL INPUT
SECURITY INVOKER
PARALLEL SAFE
COST 100;
Then, call it like this:
WITH frame AS (SELECT generate_series('2021-08-13 00:00:00', '2021-08-17 23:59:59', interval '1d') AS d)
SELECT SUM(EXTRACT(epoch FROM time_overlap('2021-08-13 08:00:00', '2021-08-17 08:00:00',d,d + interval '1 day'))/60) AS total
FROM frame
WHERE extract(isodow FROM d) < 6
In the CTE, round the earlier of the two timestamps down to the start of its day and round the later one up to the end of its day, so that the series is generated over whole days rather than starting in the middle of a day.
When calling the time_overlap function, pass the exact values of your two timestamps so that it correctly calculates the overlap between each day of the generated series and the timeframe between your two timestamps.
Finally, summing over all the overlaps gives the total number of minutes excluding weekends.
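The rounding and the per-day overlap can be folded into one function so the caller only supplies the two timestamps. This is a sketch, not the answer's own code: weekday_minutes is a hypothetical name, it inlines the overlap with LEAST/GREATEST instead of calling time_overlap, and it uses timestamp without time zone to sidestep daylight-saving effects.

```sql
CREATE OR REPLACE FUNCTION weekday_minutes(p_from timestamp, p_to timestamp)
RETURNS numeric
LANGUAGE sql
IMMUTABLE
AS $$
  SELECT COALESCE(SUM(
           -- overlap of [p_from, p_to) with the day [d, d + 1 day), in minutes
           EXTRACT(epoch FROM LEAST(p_to, d + interval '1 day')
                              - GREATEST(p_from, d)) / 60
         ), 0)::numeric
  -- one row per calendar day touched by the range, starting at midnight
  FROM generate_series(date_trunc('day', p_from),
                       date_trunc('day', p_to),
                       interval '1 day') AS d
  WHERE EXTRACT(isodow FROM d) < 6                            -- Monday..Friday only
    AND LEAST(p_to, d + interval '1 day') > GREATEST(p_from, d);
$$;

SELECT weekday_minutes('2021-08-13 08:00:00', '2021-08-17 08:00:00');
```

For the Friday-to-Tuesday example this adds 960 minutes for Friday, 1440 for Monday and 480 for Tuesday, giving 2880 while generating only five rows instead of one per minute.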
CREATE TABLE sales (
id SERIAL PRIMARY KEY,
country VARCHAR(255),
sales_date DATE,
sales_volume DECIMAL,
fix_costs DECIMAL
);
INSERT INTO sales
(country, sales_date, sales_volume, fix_costs
)
VALUES
('DE', '2020-01-03', '500', '2000'),
('FR', '2020-01-03', '350', '2000'),
('None', '2020-01-31', '0', '2000'),
('DE', '2020-02-15', '0', '5000'),
('FR', '2020-02-15', '0', '5000'),
('None', '2020-02-29', '0', '5000'),
('DE', '2020-03-27', '180', '4000'),
('FR', '2020-03-27', '970', '4000'),
('None', '2020-03-31', '0', '4000');
Expected Result:
sales_date | country | sales_volume | fix_costs
-------------|--------------|------------------|------------------------------------------
2020-01-03 | DE | 500 | 37.95 (= 2000/31 = 64.5 x 0.59)
2020-01-03 | FR | 350 | 26.57 (= 2000/31 = 64.5 x 0.41)
-------------|--------------|------------------|------------------------------------------
2020-02-15 | DE | 0 | 86.21 (= 5000/29 = 172.4 x 0.50)
2020-02-15 | FR | 0 | 86.21 (= 5000/29 = 172.4 x 0.50)
-------------|--------------|------------------|------------------------------------------
2020-03-27 | DE | 180 | 20.20 (= 4000/31 = 129.0 x 0.16)
2020-03-27 | FR | 970 | 108.84 (= 4000/31 = 129.0 x 0.84)
-------------|--------------|------------------|-------------------------------------------
The column fix_costs in the expected result is calculated as the following:
Step 1) Get the daily rate of the fix_costs per month. (2000/31 = 64.5; 5000/29 = 172.4; 4000/31 = 129.0)
Step 2) Split the daily value to the countries DE and FR based on their share in the sales_volume. (500/850 = 0.59; 350/850 = 0.41; 180/1150 = 0.16; 970/1150 = 0.84)
Step 3) In case the sales_volume is 0 the daily rate gets split 50/50 to DE and FR as you can see for 2020-02-15.
In MariaDB I was able to do this with the query below:
SELECT
s.sales_date,
s.country,
s.sales_volume,
(CASE WHEN SUM(sales_volume) OVER (PARTITION BY sales_date) > 0
THEN ((s.fix_costs/ DAY(LAST_DAY(sales_date))) *
sales_volume / NULLIF(SUM(sales_volume) OVER (PARTITION BY sales_date), 0)
)
ELSE (s.fix_costs / DAY(LAST_DAY(sales_date))) * 1 / SUM(country <> 'None') OVER (PARTITION by sales_date)
END) AS imputed_fix_costs
FROM sales s
WHERE country <> 'None'
GROUP BY 1,2,3
ORDER BY 1;
However, in PostgreSQL I get an error on DAY(LAST_DAY(sales_date)).
I tried to replace this part with (date_part('DAY', ((date_trunc('MONTH', s.sales_date) + INTERVAL '1 MONTH - 1 DAY')::date)))
However, this is causing another error.
How do I need to modify the query to get the expected result?
The PostgreSQL equivalent of DAY(LAST_DAY(sales_date)) would be:
extract(day from (date_trunc('month', sales_date + interval '1 month') - interval '1 day'))
The expression SUM(country <> 'None') also needs to be fixed as
SUM(case when country <> 'None' then 1 else 0 end)
It might be a good idea to define this compatibility function:
create function last_day(d date) returns date as
$$
-- the subtraction yields a timestamp; cast it so it matches the declared return type
select (date_trunc('month', d + interval '1 month') - interval '1 day')::date;
$$ language sql immutable;
Then the first expression becomes simply
extract(day from last_day(sales_date))
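Putting both fixes together, the whole query might look like the sketch below. The CTE sales_sample is a stand-in for the sales table from the question (replace it with the real table), the month length is computed inline rather than through a helper function, and the zero-volume branch divides by the number of remaining rows for that date, which is an equivalent way of counting the non-'None' countries once the WHERE clause has filtered them.

```sql
WITH sales_sample(country, sales_date, sales_volume, fix_costs) AS (
  VALUES ('DE',   date '2020-01-03', 500::numeric, 2000::numeric),
         ('FR',   date '2020-01-03', 350, 2000),
         ('None', date '2020-01-31',   0, 2000),
         ('DE',   date '2020-02-15',   0, 5000),
         ('FR',   date '2020-02-15',   0, 5000),
         ('DE',   date '2020-03-27', 180, 4000),
         ('FR',   date '2020-03-27', 970, 4000)
)
SELECT s.sales_date,
       s.country,
       s.sales_volume,
       round(
         s.fix_costs
         -- number of days in the month of sales_date
         / extract(day FROM date_trunc('month', s.sales_date::timestamp)
                            + interval '1 month - 1 day')::numeric
         * CASE
             WHEN sum(s.sales_volume) OVER w > 0
               THEN s.sales_volume / sum(s.sales_volume) OVER w
             ELSE 1.0 / count(*) OVER w   -- no sales that day: split evenly
           END,
         2) AS imputed_fix_costs
FROM sales_sample AS s
WHERE s.country <> 'None'
WINDOW w AS (PARTITION BY s.sales_date)
ORDER BY s.sales_date, s.country;
```

With the sample rows this reproduces the figures in the expected result (37.95 and 26.57 for January, 86.21 twice for February, 20.20 and 108.84 for March). The GROUP BY from the MariaDB version is dropped because nothing is aggregated outside the window functions.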
I would create a function to return the last day (number) for a given date - which is actually the "length" of the month.
create function month_length(p_input date)
returns integer
as
$$
select extract(day from (date_trunc('month', p_input) + interval '1 month - 1 day'))::integer;
$$
language sql
immutable;
Then the query can be written as:
select sales_date, country,
sum(sales_volume),
sum(fix_costs_per_day * cost_factor)
from (
select id, country, sales_date, sales_volume, fix_costs,
fix_costs / month_length(sales_date) as fix_costs_per_day,
case
when sum(sales_volume) over (partition by sales_date) > 0
then sales_volume::numeric / sum(sales_volume) over (partition by sales_date)
else 1.0 / count(*) over (partition by sales_date) -- no sales that day: split evenly
end as cost_factor
from sales
where country <> 'None'
) t
group by sales_date, country
order by sales_date, country
I'm using PostgreSQL 9.5
and I have a table like this:
CREATE TABLE tracks (
track bigserial NOT NULL,
time_track timestamp,
CONSTRAINT pk_aircraft_tracks PRIMARY KEY ( track )
);
I want to SELECT the track whose time_track is closest to a given datetime.
E.g., if I have:
track datatime
1 | 2016-12-01 21:02:47
2 | 2016-11-01 21:02:47
3 |2016-12-01 22:02:47
For the input datetime 2016-12-01 21:00, the track is 2.
I found a similar question for integers: Is there a postgres CLOSEST operator?
But it is not working with datetimes or with PostgreSQL 9.5:
SELECT * FROM
(
(SELECT time_track, track FROM tracks WHERE time_track >= now() ORDER BY time_track LIMIT 1) AS above
UNION ALL
(SELECT time_track, track FROM tracks WHERE time_track < now() ORDER BY time_track DESC LIMIT 1) AS below
)
ORDER BY abs(?-time_track) LIMIT 1;
The error:
ERROR: syntax error at or near "UNION"
LINE 4: UNION ALL
Track 1 is the closest to '2016-12-01 21:00':
with tracks(track, datatime) as (
values
(1, '2016-12-01 21:02:47'::timestamp),
(2, '2016-11-01 21:02:47'),
(3, '2016-12-01 22:02:47')
)
select *
from tracks
order by
case when datatime > '2016-12-01 21:00' then datatime - '2016-12-01 21:00'
else '2016-12-01 21:00' - datatime end
limit 1;
track | datatime
-------+---------------------
1 | 2016-12-01 21:02:47
(1 row)
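The UNION ALL approach from the question can also be made to work. The syntax error comes from giving each branch its own alias; only the combined result needs a name. In addition, abs() is not defined for intervals, so the difference is converted to seconds first. The sketch below uses the same inline sample data as above, with the column named time_track as in the question's table; against the real table, an index on time_track (an assumption, the question does not show one) would let each branch read a single row.

```sql
WITH tracks(track, time_track) AS (
  VALUES (1, timestamp '2016-12-01 21:02:47'),
         (2, timestamp '2016-11-01 21:02:47'),
         (3, timestamp '2016-12-01 22:02:47')
),
candidates AS (
  -- nearest row at or after the target
  (SELECT track, time_track
   FROM tracks
   WHERE time_track >= timestamp '2016-12-01 21:00'
   ORDER BY time_track
   LIMIT 1)
  UNION ALL
  -- nearest row before the target
  (SELECT track, time_track
   FROM tracks
   WHERE time_track < timestamp '2016-12-01 21:00'
   ORDER BY time_track DESC
   LIMIT 1)
)
SELECT track, time_track
FROM candidates
ORDER BY abs(extract(epoch FROM time_track - timestamp '2016-12-01 21:00'))
LIMIT 1;
```

This also returns track 1, since 2016-12-01 21:02:47 is under three minutes from the target while track 2 is a month away.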
I would like to insert a time period into a table column. For example, I have a table with 7 columns, one for each day of the week. Is it possible to create a data type that represents a time period for employees' working hours, say from 1 AM to 8 AM, or in the 24-hour system?
If not, how should I deal with it?
If you're building a table of something like business hours, you're probably better off with fewer columns.
create table business_hours (
day_of_week char(3) not null unique
check (day_of_week in ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')),
start_time time not null,
end_time time not null
check (end_time > start_time)
);
insert into business_hours values
('Mon', '09:00', '17:00'),
('Tue', '09:00', '17:00'),
('Wed', '09:00', '17:00'),
('Thu', '09:00', '17:00'),
('Fri', '09:00', '17:00'),
('Sat', '11:00', '15:00');
You can join that table with a calendar table (or create a calendar table on the fly with generate_series()) to produce the business hours for the current week.
select c.cal_date, bh.*
from calendar c
inner join business_hours bh on bh.day_of_week = c.day_of_week
where cal_date between '2013-01-20' and '2013-01-27'
order by cal_date
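The on-the-fly variant mentioned above could look like this. It is a sketch that assumes the business_hours table defined earlier and derives the weekday abbreviation with to_char(..., 'Dy'), which yields English abbreviations such as 'Mon' that match the values stored in that table.

```sql
SELECT d::date AS cal_date,
       bh.day_of_week,
       bh.start_time,
       bh.end_time
FROM generate_series(timestamp '2013-01-20',
                     timestamp '2013-01-27',
                     interval '1 day') AS d
INNER JOIN business_hours AS bh
        ON bh.day_of_week = to_char(d, 'Dy')
ORDER BY cal_date;
```

Because it is an inner join, days without a row in business_hours (Sunday in the sample data) simply do not appear in the result.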
Arranging that data into a matrix is a presentation-level issue. Use application code to do that.
The simplest calendar table you can use for this kind of query has just two columns. (Mine uses English. Adjust abbreviations as you like, but they must match the abbreviations in the table "business_hours".)
CREATE TABLE calendar
(
cal_date date NOT NULL,
day_of_week character(3) NOT NULL,
CONSTRAINT cal_pkey PRIMARY KEY (cal_date),
CONSTRAINT cal_dow_values CHECK (day_of_week =
CASE
WHEN date_part('dow', cal_date) = 0 THEN 'Sun'
WHEN date_part('dow', cal_date) = 1 THEN 'Mon'
WHEN date_part('dow', cal_date) = 2 THEN 'Tue'
WHEN date_part('dow', cal_date) = 3 THEN 'Wed'
WHEN date_part('dow', cal_date) = 4 THEN 'Thu'
WHEN date_part('dow', cal_date) = 5 THEN 'Fri'
WHEN date_part('dow', cal_date) = 6 THEN 'Sat'
ELSE NULL
END)
);
CREATE INDEX ON calendar (day_of_week);
There are a lot of different ways to populate a calendar table--spreadsheet, PostgreSQL function, scripting language to generate a CSV file, etc.
Does anyone know of a simple method for solving this?
I have a table which consists of start times for events and the associated durations. I need to be able to split the event durations into thirty-minute intervals. So, for example, if an event starts at 10:45:00 and the duration is 00:17:00, then the returned set should allocate 15 minutes to the 10:30:00 interval and 2 minutes to the 11:00:00 interval.
I'm sure I can figure out a clumsy approach but would like something a little simpler. This must come up quite often I'd imagine but Google is being unhelpful today.
Thanks,
Steve
You could create a lookup table with just the times (over 24 hours), and join to that table. You would need to rebase the date to that used in the lookup. Then perform a datediff on the upper and lower intervals to work out their durations. Each middle interval would be 30 minutes.
create table #interval_lookup (
from_date datetime,
to_date datetime
)
declare @time datetime
set @time = '00:00:00'
while @time < '2 Jan 1900'
begin
insert into #interval_lookup values (@time, dateadd(minute, 30, @time))
set @time = dateadd(minute, 30, @time)
end
declare @search_from datetime
declare @search_to datetime
set @search_from = '10:45:00'
set @search_to = dateadd(minute, 17, @search_from)
select
from_date as interval,
case
when from_date <= @search_from and
@search_from < to_date and
from_date <= @search_to and
@search_to < to_date
then datediff(minute, @search_from, @search_to)
when from_date <= @search_from and
@search_from < to_date
then datediff(minute, @search_from, to_date)
when from_date <= @search_to and
@search_to < to_date then
datediff(minute, from_date, @search_to)
else 30
end as duration
from
#interval_lookup
where
to_date > @search_from
and from_date <= @search_to
Create a TVF that splits a single event:
CREATE FUNCTION dbo.TVF_TimeRange_Split_To_Grid
(
@eventStartTime datetime
, @eventDurationMins float
, @intervalMins int
)
RETURNS @retTable table
(
intervalStartTime datetime
,intervalEndTime datetime
,eventDurationInIntervalMins float
)
AS
BEGIN
declare @eventMinuteOfDay int
set @eventMinuteOfDay = datepart(hour,@eventStartTime)*60+datepart(minute,@eventStartTime)
declare @intervalStartMinute int
set @intervalStartMinute = @eventMinuteOfDay - @eventMinuteOfDay % @intervalMins
declare @intervalStartTime datetime
set @intervalStartTime = dateadd(minute,@intervalStartMinute,cast(floor(cast(@eventStartTime as float)) as datetime))
declare @intervalEndTime datetime
set @intervalEndTime = dateadd(minute,@intervalMins,@intervalStartTime)
declare @eventDurationInIntervalMins float
while (@eventDurationMins>0)
begin
set @eventDurationInIntervalMins = cast(@intervalEndTime-@eventStartTime as float)*24*60
if @eventDurationMins<@eventDurationInIntervalMins
set @eventDurationInIntervalMins = @eventDurationMins
insert into @retTable
select @intervalStartTime,@intervalEndTime,@eventDurationInIntervalMins
set @eventDurationMins = @eventDurationMins - @eventDurationInIntervalMins
set @eventStartTime = @intervalEndTime
set @intervalStartTime = @intervalEndTime
set @intervalEndTime = dateadd(minute,@intervalMins,@intervalEndTime)
end
RETURN
END
GO
Test:
select getdate()
select * from dbo.TVF_TimeRange_Split_To_Grid(getdate(),23,30)
Test Result:
2008-10-31 11:28:12.377
intervalStartTime intervalEndTime eventDurationInIntervalMins
----------------------- ----------------------- ---------------------------
2008-10-31 11:00:00.000 2008-10-31 11:30:00.000 1,79372222222222
2008-10-31 11:30:00.000 2008-10-31 12:00:00.000 21,2062777777778
Sample usage:
select input.eventName, result.* from
(
select
'first' as eventName
,cast('2008-10-03 10:45' as datetime) as startTime
,17 as durationMins
union all
select
'second' as eventName
,cast('2008-10-05 11:00' as datetime) as startTime
,17 as durationMins
union all
select
'third' as eventName
,cast('2008-10-05 12:00' as datetime) as startTime
,100 as durationMins
) input
cross apply dbo.TVF_TimeRange_Split_To_Grid(input.startTime,input.durationMins,30) result
Sample usage result:
eventName intervalStartTime intervalEndTime eventDurationInIntervalMins
--------- ----------------------- ----------------------- ---------------------------
first 2008-10-03 10:30:00.000 2008-10-03 11:00:00.000 15
first 2008-10-03 11:00:00.000 2008-10-03 11:30:00.000 2
second 2008-10-05 11:00:00.000 2008-10-05 11:30:00.000 17
third 2008-10-05 12:00:00.000 2008-10-05 12:30:00.000 30
third 2008-10-05 12:30:00.000 2008-10-05 13:00:00.000 30
third 2008-10-05 13:00:00.000 2008-10-05 13:30:00.000 30
third 2008-10-05 13:30:00.000 2008-10-05 14:00:00.000 10
(7 row(s) affected)
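For comparison with the PostgreSQL threads above, the same split can be written there as a single set-based query. This is only a sketch in a different dialect from the T-SQL answers: it assumes PostgreSQL 9.3 or later (for LATERAL), a fixed 30-minute grid, and inline sample events named after two of the ones in the sample usage.

```sql
WITH events(event_name, start_time, duration) AS (
  VALUES ('first', timestamp '2008-10-03 10:45', interval '17 minutes'),
         ('third', timestamp '2008-10-05 12:00', interval '100 minutes')
)
SELECT e.event_name,
       g.slot_start,
       g.slot_start + interval '30 minutes' AS slot_end,
       -- part of the event that falls inside this slot
       LEAST(e.start_time + e.duration, g.slot_start + interval '30 minutes')
         - GREATEST(e.start_time, g.slot_start) AS time_in_slot
FROM events AS e
CROSS JOIN LATERAL generate_series(
       -- start of the 30-minute slot containing the event start
       date_trunc('hour', e.start_time)
         + interval '30 minutes' * (extract(minute FROM e.start_time)::int / 30),
       -- stop just before the event end so an exact slot boundary adds no empty slot
       e.start_time + e.duration - interval '1 microsecond',
       interval '30 minutes') AS g(slot_start)
ORDER BY e.event_name, g.slot_start;
```

For the 'first' event this produces the 10:30 slot with 15 minutes and the 11:00 slot with 2 minutes; for 'third' it produces four slots of 30, 30, 30 and 10 minutes, in line with the sample usage result.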