postgres sql bucket values into generated time sequence - postgresql

I am trying to transform data from a table of recorded events into a consistent 'daily half-hour view', i.e. 48 half-hour periods per day, padding out half hours with zero when there are no matching events. I have completed this with partial success.
SELECT t1.generate_series,
v1.begin_time,
v1.end_time,
v1.volume
FROM tbl_my_values v1
RIGHT JOIN ( SELECT generate_series.generate_series
FROM generate_series((to_char(now(), 'YYYY-MM-dd'::text) || ' 22:00'::text)::timestamp without time zone,
(to_char(now() + '1 day'::interval, 'YYYY-MM-dd'::text) || ' 22:00'::text)::timestamp without time zone, '00:30:00'::interval)
generate_series(generate_series)) t1 ON t1.generate_series = v1.begin_time
order by 1 ;
This provides the following results:
2015-12-19 22:00:00 | 2015-12-19 22:00:00+00 | 2015-12-19 23:00:00+00 | 172.10
2015-12-19 22:30:00 | | |
2015-12-19 23:00:00 | 2015-12-19 23:00:00+00 | 2015-12-20 00:00:00+00 | 243.60
2015-12-20 00:30:00 | | |
2015-12-20 01:00:00 | | |
However based on the 'start' and 'end' columns the view should be:
2015-12-19 22:00:00 | 2015-12-19 22:00:00+00 | 2015-12-19 23:00:00+00 | 172.10
2015-12-19 22:30:00 | | | 172.10
2015-12-19 23:00:00 | 2015-12-19 23:00:00+00 | 2015-12-20 00:00:00+00 | 243.60
2015-12-20 00:30:00 | | | 243.60
2015-12-20 01:00:00 | | |
because the values in this example each span two half hours, i.e. they are valid for one hour.
All help is very welcome. Thanks

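Your ON clause only compares against begin_time, so each event matches just the slot in which it starts. I think you want a range condition instead (note that end_time belongs to v1, not t1):
on t1.generate_series >= v1.begin_time and t1.generate_series < v1.end_time
A half-open range is used rather than BETWEEN because BETWEEN includes both endpoints, so the 23:00 slot would match both the event ending at 23:00 and the one starting at 23:00.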
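To see the effect of a range condition without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. The assumptions are mine: SQLite stands in for PostgreSQL, a recursive CTE replaces generate_series, timestamps are stored as ISO text, and only the two sample events from the question are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_my_values (begin_time TEXT, end_time TEXT, volume REAL);
    INSERT INTO tbl_my_values VALUES
        ('2015-12-19 22:00:00', '2015-12-19 23:00:00', 172.10),
        ('2015-12-19 23:00:00', '2015-12-20 00:00:00', 243.60);
""")

rows = conn.execute("""
    -- Recursive CTE standing in for generate_series(start, stop, '30 minutes')
    WITH RECURSIVE slots(slot) AS (
        SELECT '2015-12-19 22:00:00'
        UNION ALL
        SELECT datetime(slot, '+30 minutes') FROM slots
        WHERE slot < '2015-12-20 00:30:00'
    )
    SELECT s.slot, v.volume
    FROM slots s
    LEFT JOIN tbl_my_values v
           -- half-open range: a slot belongs to an event if begin <= slot < end
           ON s.slot >= v.begin_time AND s.slot < v.end_time
    ORDER BY s.slot
""").fetchall()

for slot, volume in rows:
    print(slot, volume)
```

With this condition the 22:30 and 23:30 slots pick up the volume of the event that covers them, while slots outside every event keep a NULL volume.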

Related

Postgres max value per hour with time it occurred

Given a Postgres table with columns highwater_datetime::timestamp and highwater::integer, I am trying to construct a select statement for a given highwater_datetime range, that generates rows with a column for the max highwater for each hour (first occurrence when dups) and another column showing the highwater_datetime when it occurred (truncated to the minute and order by highwater_datetime asc). e.g.
| highwater_datetime | max_highwater |
+--------------------+---------------+
| 2021-01-27 20:05 | 8 |
| 2021-01-27 21:00 | 7 |
| 2021-01-27 22:00 | 7 |
| 2021-01-27 23:00 | 7 |
| 2021-01-28 00:00 | 7 |
| 2021-01-28 01:32 | 7 |
| 2021-01-28 02:00 | 7 |
| 2021-01-28 03:00 | 7 |
| 2021-01-28 04:22 | 9 |
DISTINCT ON should do the trick:
SELECT DISTINCT ON (date_trunc('hour', highwater_datetime))
highwater_datetime,
highwater
FROM mytable
ORDER BY date_trunc('hour', highwater_datetime),
highwater DESC,
highwater_datetime;
DISTINCT ON will output the first row for each entry with the same hour according to the ORDER BY clause.
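The "first row per group after sorting" behaviour can be mimicked in a few lines of Python, which may help when checking the ORDER BY. This is only a sketch of the semantics, not of how PostgreSQL executes the query. The function names are hypothetical, and apart from the 20:05 and 21:00 rows the readings are made up to show a tie and a lower value within the same hour.

```python
from datetime import datetime

# Illustrative readings: (highwater_datetime, highwater)
readings = [
    (datetime(2021, 1, 27, 20, 50), 3),
    (datetime(2021, 1, 27, 20, 40), 8),  # ties with 20:05 for the hourly max
    (datetime(2021, 1, 27, 20, 5), 8),
    (datetime(2021, 1, 27, 21, 30), 5),
    (datetime(2021, 1, 27, 21, 0), 7),
]

def hour_of(ts):
    """Equivalent of date_trunc('hour', ts)."""
    return ts.replace(minute=0, second=0, microsecond=0)

def first_max_per_hour(rows):
    # Same ordering as the ORDER BY clause:
    # hour, highwater DESC, highwater_datetime ASC.
    ordered = sorted(rows, key=lambda r: (hour_of(r[0]), -r[1], r[0]))
    picked = {}
    for ts, level in ordered:
        # Keep only the first row seen for each hour, like DISTINCT ON.
        picked.setdefault(hour_of(ts), (ts, level))
    return list(picked.values())

result = first_max_per_hour(readings)
for ts, level in result:
    print(ts.strftime("%Y-%m-%d %H:%M"), level)
```

The third sort key is what makes the earliest occurrence win when the maximum appears more than once in an hour.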

how to retrieve information from three tables in below conditions in postgresql

I have three tables.
TABLE_1:
T2_ID ver date boolean
---------------------------------------------------------
1 | X-20-50 | 2019-01-01 16:20:51.722336+00 | TRUE
2 | X-50-30 | 2019-02-26 16:20:51.722336+00 | TRUE
3 | X-20-32 | 2019-03-20 16:20:51.722336+00 | FALSE
1 | X-20-50 | 2019-01-09 16:20:51.722336+00 | FALSE
2 | X-20-50 | 2019-12-02 16:20:51.722336+00 | TRUE
3 | X-20-50 | 2019-01-24 16:20:51.722336+00 | TRUE
TABLE_2:
id | type | scheduler
--------------------------------------------------
1 | ABC | w1,w2,w3,w4,w5,w6,w7,w8,w9,w10,w11,w12
2 | PQR | w5,w9
3 | TRC | w1,w4,w8
TABLE_3
start_date_of_ver | end_date_of_ver | ver_name
-----------------------------------------------------------
2019-01-01 00:00:00+00 | 2019-04-01 00:00:00+00 | X-20-50
2019-02-25 00:00:00+00 | 2019-05-26 00:00:00+00 | X-50-30
2019-03-15 00:00:00+00 | 2019-06-06 00:00:00+00 | X-20-32
Table 4 should fulfill the following conditions:
It takes a version name (ver_name) as input.
From this ver_name, it takes the start date and end date of the version (from table_3). If the version period is 3 months, it creates a 12-week table with the type from table_2 as the first column and fills in the twelve week columns according to the scheduler column of table_2.
The information in table 4 is updated as and when table 1 has entries for that particular week which are TRUE.
Note: entries in table 1 are generated on a daily basis.
Desired table: it takes only ver_name as input and calculates the table below.
When table_1 doesn't have any entries, table_4 should look like this:
Table_4: X-20-50
id_of_table_2 | week_1 | week_2 | week_3 | week_4 | week_5 | week_6 | week_7 | week_8 | week_9 | week_10 | week_11 | week_12 |
------------------------------------------------------------------------------------------------------------------------------
ABC | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9 | w10 | w11 | w12 |
PQR | | | | | w5 | | | | w9 | | | |
TRC | w1 | | | w4 | | | | w8 | | | | |
When table_1 has entries then table_4 should look like as below
X-20-50
id_of_table_2 | week_1 | week_2 | week_3 | week_4 | week_5 | week_6 | week_7 | week_8 | week_9 | week_10 | week_11 | week_12 |
------------------------------------------------------------------------------------------------------------------------------
ABC | Done | Done | w3 | w4 | w5 | w6 | w7 | w8 | w9 | w10 | w11 | w12 |
PQR | | | | | w5 | | | | w9 | | | |
TRC | Done | | | w4 | | | | w8 | | | | |
You can create a function which takes the starting date of a week as input.
Example:
create function a(start_date date)
RETURNS json
LANGUAGE plpgsql
COST 100
VOLATILE
AS $BODY$
DECLARE
outputjson json;
BEGIN
-- aggregate the rows dated within the 7 days starting at start_date
SELECT json_agg(t) INTO outputjson
FROM table_name t
WHERE t.date >= start_date
AND t.date < start_date + 7;
RETURN outputjson;
END;
$BODY$;
Hope this helps.
Your requirement needs a little refinement. You specify retrieving weekly data yet fail to define your week. On what day does it begin? Are all weeks 7 days long? When Dec 31 falls on a Tuesday, is Friday Jan 3 in the same week (see the current year's calendar)? Then there is the issue of user input and what it represents. Is it the desired start date, with the week being that date and the next 6 days, or is it any date within the weekly period?
The following assumes the ISO 8601 definition (google it, there is lots of material). Every week begins on Monday and all weeks are 7 days long. (Thus the week containing 31-Dec-2019 also includes 3-Jan-2020.) The routine extracts the ISO year and ISO week from the user-entered date and returns the rows whose date falls in that same ISO week.
--setup
create table weekly_something( c1 text, c2 text, date1 timestamptz, someem boolean);
insert into weekly_something( c1, c2, date1, someem )
values ('ABC','AB-20-50','2019-11-25 16:20:51.722336+00',TRUE)
, ('PQR','AB-50-30','2019-11-26 16:20:51.722336+00',TRUE)
, ('TRC','CD-20-32','2019-11-27 16:20:51.722336+00',FALSE)
, ('ABC','AB-20-50','2019-12-02 16:20:51.722336+00',FALSE)
, ('ABC','AB-20-50','2019-12-02 16:20:51.722336+00',TRUE)
, ('JFF','yy-45-89','2019-12-31 16:20:51.722336+00',TRUE)
, ('JFF','yy-89-30','2020-01-03 16:20:51.722336+00',TRUE) ;
-- JFF Just For Fun
-- SQL Function
create function week_of(week_date date)
returns setof weekly_something
language sql stable strict
as $$
select *
from weekly_something
where (extract('isoyear' from week_date), extract('week' from week_date)) =
(extract('isoyear' from date1), extract('week' from date1));
$$;
-- test
select * from week_of('2019-11-26');
select * from week_of('2019-12-30');
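The ISO week rule is easy to sanity-check outside the database. A small Python sketch (the helper name iso_week_key is hypothetical) that builds the same (isoyear, week) pair the function compares:

```python
from datetime import date

def iso_week_key(d):
    """Return (ISO year, ISO week), the pair compared in week_of()."""
    iso = d.isocalendar()
    return (iso[0], iso[1])

# Dates taken from the setup rows above
print(iso_week_key(date(2019, 11, 26)))
print(iso_week_key(date(2019, 12, 31)))
print(iso_week_key(date(2020, 1, 3)))

# 31-Dec-2019 and 3-Jan-2020 fall in the same ISO week even though the
# calendar year differs, which is why isoyear is used rather than year.
same_week = iso_week_key(date(2019, 12, 31)) == iso_week_key(date(2020, 1, 3))
```

Comparing the plain calendar year together with the ISO week number would split that week in two, so the pair has to use isoyear.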

Sumif in Postgresql between two tables

These are my two sample tables.
table "outage" (column formats are text, timestamp, timestamp)
+-------------------+----------------+----------------+
| outage_request_id | actual_start | actual_end |
+-------------------+----------------+----------------+
| 1-07697685 | 4/8/2015 4:48 | 4/8/2015 9:02 |
| 1-07223444 | 7/17/2015 4:24 | 8/01/2015 9:23 |
| 1-07223450 | 2/13/2015 4:24 | 4/29/2015 1:03 |
| 1-07223669 | 4/28/2017 9:20 | 4/30/2017 6:58 |
| 1-08985319 | 8/24/2015 3:18 | 8/24/2015 8:27 |
+-------------------+----------------+----------------+
and a second table "prices" (column format is numeric, timestamp)
+-------+---------------+
| price | stamp |
+-------+---------------+
| -2.31 | 2/1/2018 3:00 |
| -2.35 | 2/1/2018 4:00 |
| -1.77 | 2/1/2018 5:00 |
| -2.96 | 2/1/2018 6:00 |
| -5.14 | 2/1/2018 7:00 |
+-------+---------------+
My Goal: To sum the prices in between the start and stop times of each outage_request_id.
I have no idea how to go about properly joining the tables and getting a sum of prices in those outage timestamp ranges.
I can't promise this is efficient (in fact for very large tables I feel pretty confident it's not), but this should notionally get you what you want:
select
o.outage_request_id, o.actual_start, o.actual_end,
sum (p.price) as total_price
from
outage o
left join prices p on
p.stamp between o.actual_start and o.actual_end
group by
o.outage_request_id, o.actual_start, o.actual_end
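Here is a runnable sketch of the same join using Python's built-in sqlite3 module, so it can be tried without a server. The assumptions are mine: SQLite stands in for PostgreSQL, timestamps are stored as ISO text, and the two outages are invented so that one overlaps the sample prices and one does not. It also shows that an outage with no matching prices yields NULL unless the sum is wrapped in COALESCE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outage (outage_request_id TEXT, actual_start TEXT, actual_end TEXT);
    CREATE TABLE prices (price REAL, stamp TEXT);
    -- Hypothetical outages: 'A' overlaps the sample prices, 'B' does not
    INSERT INTO outage VALUES
        ('A', '2018-02-01 03:30:00', '2018-02-01 06:00:00'),
        ('B', '2018-03-01 00:00:00', '2018-03-01 01:00:00');
    INSERT INTO prices VALUES
        (-2.31, '2018-02-01 03:00:00'),
        (-2.35, '2018-02-01 04:00:00'),
        (-1.77, '2018-02-01 05:00:00'),
        (-2.96, '2018-02-01 06:00:00'),
        (-5.14, '2018-02-01 07:00:00');
""")

rows = conn.execute("""
    SELECT o.outage_request_id,
           SUM(p.price)              AS total_price,       -- NULL when nothing matches
           COALESCE(SUM(p.price), 0) AS total_price_or_zero
    FROM outage o
    LEFT JOIN prices p
           ON p.stamp BETWEEN o.actual_start AND o.actual_end
    GROUP BY o.outage_request_id
    ORDER BY o.outage_request_id
""").fetchall()

for row in rows:
    print(row)
```

BETWEEN is inclusive on both ends, so the 06:00 price is counted for outage 'A'. Whether the end timestamp should be included is a decision for the data owner.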

PostgreSQL Crosstab generate_series of weeks for columns

From a table of "time entries" I'm trying to create a report of weekly totals for each user.
Sample of the table:
+-----+---------+-------------------------+--------------+
| id | user_id | start_time | hours_worked |
+-----+---------+-------------------------+--------------+
| 997 | 6 | 2018-01-01 03:05:00 UTC | 1.0 |
| 996 | 6 | 2017-12-01 05:05:00 UTC | 1.0 |
| 998 | 6 | 2017-12-01 05:05:00 UTC | 1.5 |
| 999 | 20 | 2017-11-15 19:00:00 UTC | 1.0 |
| 995 | 6 | 2017-11-11 20:47:42 UTC | 0.04 |
+-----+---------+-------------------------+--------------+
Right now I can run the following and basically get what I need
SELECT COALESCE(SUM(time_entries.hours_worked),0) AS total,
time_entries.user_id,
week::date
--Using generate_series here to account for weeks with no time entries when
--doing the join
FROM generate_series( (DATE_TRUNC('week', '2017-11-01 00:00:00'::date)),
(DATE_TRUNC('week', '2017-12-31 23:59:59.999999'::date)),
interval '7 day') as week LEFT JOIN time_entries
ON DATE_TRUNC('week', time_entries.start_time) = week
GROUP BY week, time_entries.user_id
ORDER BY week
This will return
+-------+---------+------------+
| total | user_id | week |
+-------+---------+------------+
| 14.08 | 5 | 2017-10-30 |
| 21.92 | 6 | 2017-10-30 |
| 10.92 | 7 | 2017-10-30 |
| 14.26 | 8 | 2017-10-30 |
| 14.78 | 10 | 2017-10-30 |
| 14.08 | 13 | 2017-10-30 |
| 15.83 | 15 | 2017-10-30 |
| 8.75 | 5 | 2017-11-06 |
| 10.53 | 6 | 2017-11-06 |
| 13.73 | 7 | 2017-11-06 |
| 14.26 | 8 | 2017-11-06 |
| 19.45 | 10 | 2017-11-06 |
| 15.95 | 13 | 2017-11-06 |
| 14.16 | 15 | 2017-11-06 |
| 1.00 | 20 | 2017-11-13 |
| 0 | | 2017-11-20 |
| 2.50 | 6 | 2017-11-27 |
| 0 | | 2017-12-04 |
| 0 | | 2017-12-11 |
| 0 | | 2017-12-18 |
| 0 | | 2017-12-25 |
+-------+---------+------------+
However, this is difficult to parse particularly when there's no data for a week. What I would like is a pivot or crosstab table where the weeks are the columns and the rows are the users. And to include nulls from each (for instance if a user had no entries in that week or week without entries from any user).
Something like this
+---------+---------------+--------------+--------------+
| user_id | 2017-10-30 | 2017-11-06 | 2017-11-13 |
+---------+---------------+--------------+--------------+
| 6 | 4.0 | 1.0 | 0 |
| 7 | 4.0 | 1.0 | 0 |
| 8 | 4.0 | 0 | 0 |
| 9 | 0 | 1.0 | 0 |
| 10 | 4.0 | 0.04 | 0 |
+---------+---------------+--------------+--------------+
I've been looking around online and it seems that "dynamically" generating a list of columns for crosstab is difficult. I'd rather not hard code them, which seems weird to do anyway for dates. Or use something like this case with week number.
Should I look for another solution besides crosstab? If I could get the series of weeks for each user including all nulls I think that would be good enough. It just seems that right now my join strategy isn't returning that.
Personally I would use a Date Dimension table and use that table as the basis for the query. I find it far easier to use tabular data for these types of calculations as it leads to SQL that's easier to read and maintain. There's a great article on creating a Date Dimension table in PostgreSQL at https://medium.com/#duffn/creating-a-date-dimension-table-in-postgresql-af3f8e2941ac, though you could get away with a much simpler version of this table.
Ultimately what you would do is use the Date table as the base for the SELECT cols FROM table section and then join against that, or probably use Common Table Expressions, to create the calculations.
I'll write up a solution to that if you would like demonstrating how you could create such a query.
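The idea of starting from the dates (and users) rather than from the time entries can be sketched in plain Python. This is not the Date Dimension query itself, only an illustration of why building the full users x weeks grid first gives every cell a value. The function names are hypothetical and the rows are the five sample entries from the question.

```python
from datetime import date, datetime, timedelta

# (user_id, start_time, hours_worked) from the sample table
entries = [
    (6, datetime(2018, 1, 1, 3, 5), 1.0),
    (6, datetime(2017, 12, 1, 5, 5), 1.0),
    (6, datetime(2017, 12, 1, 5, 5), 1.5),
    (20, datetime(2017, 11, 15, 19, 0), 1.0),
    (6, datetime(2017, 11, 11, 20, 47, 42), 0.04),
]

def week_start(ts):
    """Monday of the week containing ts, like date_trunc('week', ts)."""
    return ts.date() - timedelta(days=ts.weekday())

def weekly_grid(rows, first_week, last_week):
    # Build the list of week columns first (the "date table").
    weeks = []
    w = first_week
    while w <= last_week:
        weeks.append(w)
        w += timedelta(days=7)
    users = sorted({user_id for user_id, _, _ in rows})
    # Full users x weeks cross product, so empty cells exist with 0.
    grid = {u: {w: 0.0 for w in weeks} for u in users}
    for user_id, ts, hours in rows:
        w = week_start(ts)
        if w in grid[user_id]:  # ignore entries outside the reported range
            grid[user_id][w] += hours
    return weeks, grid

weeks, grid = weekly_grid(entries, date(2017, 11, 6), date(2017, 11, 27))
print("user_id", *weeks)
for user_id, cells in grid.items():
    print(user_id, *(cells[w] for w in weeks))
```

In SQL the same shape would come from cross joining the week series with the distinct users and then left joining time_entries on both user and week. Turning the weeks into columns is then a presentation step.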

Symfony2 Query to find last working date from Holiday Calender

I have a calendar entity in my project which manages the open and closed business days for the whole year.
Below is the record of a specific month
id | today_date | year | month_of_year | day_of_month | is_business_day
-------+---------------------+------+---------------+-------------+---------------+
10103 | 2016-02-01 00:00:00 | 2016 | 2 | 1 | t
10104 | 2016-02-02 00:00:00 | 2016 | 2 | 2 | t
10105 | 2016-02-03 00:00:00 | 2016 | 2 | 3 | t
10106 | 2016-02-04 00:00:00 | 2016 | 2 | 4 | t
10107 | 2016-02-05 00:00:00 | 2016 | 2 | 5 | t
10108 | 2016-02-06 00:00:00 | 2016 | 2 | 6 | f
10109 | 2016-02-07 00:00:00 | 2016 | 2 | 7 | f
10110 | 2016-02-08 00:00:00 | 2016 | 2 | 8 | t
10111 | 2016-02-09 00:00:00 | 2016 | 2 | 9 | t
10112 | 2016-02-10 00:00:00 | 2016 | 2 | 10 | t
10113 | 2016-02-11 00:00:00 | 2016 | 2 | 11 | t
10114 | 2016-02-12 00:00:00 | 2016 | 2 | 12 | t
10115 | 2016-02-13 00:00:00 | 2016 | 2 | 13 | f
10116 | 2016-02-14 00:00:00 | 2016 | 2 | 14 | f
10117 | 2016-02-15 00:00:00 | 2016 | 2 | 15 | t
10118 | 2016-02-16 00:00:00 | 2016 | 2 | 16 | t
10119 | 2016-02-17 00:00:00 | 2016 | 2 | 17 | t
10120 | 2016-02-18 00:00:00 | 2016 | 2 | 18 | t
I want to get the today_date of the 7th last working date. Suppose today_date is 2016-02-18; the date 7 working days back should then be 2016-02-09.
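You can use row_number() for this:
SELECT * FROM
(SELECT t.*, row_number() OVER (ORDER BY today_date DESC) AS rnk
FROM Calender t
WHERE today_date <= current_date
AND is_business_day = 't') AS s
WHERE rnk = 7
This gives you the row of the 7th business day counting back from today's date, with today itself counted as the first if it is a business day (2016-02-10 for the sample data). To get 2016-02-09 as in the question, leave today out by using today_date < current_date.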
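Whether today counts as the first working day changes the result by one row, which is easy to check outside the database. A minimal Python sketch, assuming the calendar rows from the question (1 to 18 February 2016, weekends marked as non-business days). The function name and the count_today flag are hypothetical.

```python
from datetime import date, timedelta

# Rebuild the sample rows: (today_date, is_business_day).
# For this month the 'f' rows in the question are exactly the weekends.
calendar = []
for i in range(18):
    d = date(2016, 2, 1) + timedelta(days=i)
    calendar.append((d, d.weekday() < 5))

def nth_business_day_back(rows, today, n, count_today=True):
    """Return the n-th business day counting backwards from today."""
    days = sorted(
        (d for d, is_business in rows
         if is_business and (d <= today if count_today else d < today)),
        reverse=True,
    )
    return days[n - 1] if len(days) >= n else None

today = date(2016, 2, 18)
with_today = nth_business_day_back(calendar, today, 7, count_today=True)
without_today = nth_business_day_back(calendar, today, 7, count_today=False)
print(with_today, without_today)
```

Counting today gives 2016-02-10, while leaving it out gives the 2016-02-09 that the question expects.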
I see that you tagged your question with Doctrine, ORM and Datetime. Were you after a QueryBuilder solution? Maybe this is closer to what you want:
$qb->select('c.today_date')
->from(Calendar::class, 'c')
->where("c.today_date <= :today")
->andWhere("c.is_business_day = 't'")
->orderBy("c.today_date", "DESC")
->setMaxResults(7) // the 7 most recent business days; the last row is the 7th
->setParameter('today', new \DateTime('now'), \Doctrine\DBAL\Types\Type::DATETIME);