Postgres generate_series wrong result - postgresql

I don't know if it's a bug or it's intended, but the simple query
select generate_series('2021-03-28'::date, '2021-03-29'::date - interval '1 minute', interval '1 hour') as date_hour;
> 2021-03-28 00:00:00.000
2021-03-28 01:00:00.000
2021-03-28 03:00:00.000 <---
2021-03-28 03:00:00.000 <---
2021-03-28 04:00:00.000
2021-03-28 05:00:00.000
2021-03-28 06:00:00.000
2021-03-28 07:00:00.000
2021-03-28 08:00:00.000
2021-03-28 09:00:00.000
2021-03-28 10:00:00.000
2021-03-28 11:00:00.000
2021-03-28 12:00:00.000
2021-03-28 13:00:00.000
2021-03-28 14:00:00.000
2021-03-28 15:00:00.000
2021-03-28 16:00:00.000
2021-03-28 17:00:00.000
2021-03-28 18:00:00.000
2021-03-28 19:00:00.000
2021-03-28 20:00:00.000
2021-03-28 21:00:00.000
2021-03-28 22:00:00.000
2021-03-28 23:00:00.000
generates a duplicate row 2021-03-28 03:00:00.000, as you can also see in the picture below.
What am I missing here?
select version();
> PostgreSQL 14.4 (Ubuntu 14.4-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
UPDATE
Despite the displayed result, under the hood the values appear to be correctly distinct. I wonder if it's just a representation issue.
select date_trunc('hour', date_hour), extract(epoch from date_hour), count(*)
from (
    select generate_series('2021-03-28'::date, '2021-03-29'::date - interval '1 minute', interval '1 hour') as date_hour
) s
group by date_trunc('hour', date_hour), extract(epoch from date_hour)
order by date_trunc('hour', date_hour), extract(epoch from date_hour)
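One possible explanation, offered as an assumption rather than something the post confirms: if the client that renders the rows works in a time zone where daylight saving time began on 2021-03-28, the wall-clock time 02:00 does not exist on that day, and a client that converts each value to a local instant may show it as 03:00. The Python sketch below only illustrates that effect (it does not query PostgreSQL); Europe/Berlin is a hypothetical stand-in, since the post does not name the time zone in use.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Hypothetical client time zone; the post does not say which one is in use.
tz = ZoneInfo("Europe/Berlin")


def as_displayed(naive: datetime) -> datetime:
    """Treat a zone-less timestamp as local time, then round-trip it through UTC."""
    return naive.replace(tzinfo=tz).astimezone(timezone.utc).astimezone(tz)


start = datetime(2021, 3, 28)
naive_hours = [start + timedelta(hours=h) for h in range(5)]  # 00:00 .. 04:00
displayed = [as_displayed(ts).strftime("%H:%M") for ts in naive_hours]

print(displayed)  # -> ['00:00', '01:00', '03:00', '03:00', '04:00']
```

The five underlying values are all different, which would be consistent with the distinct epoch values seen in the UPDATE; only the rendering collapses 02:00 and 03:00.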

Related

How can I get weekly sales for every salesman

I have a table like below (tablename: sales)
sales_datetime       sales  salesman
2022-08-01 09:00:00  100    John
2022-08-01 11:00:00  200    John
2022-08-02 10:00:00  100    Peter
2022-08-02 13:00:00  300    John
2022-08-04 14:00:00  300    Peter
2022-08-05 12:00:00  100    John
2022-08-05 16:00:00  200    John
From that table I want to make a summary sales for 5 days period for each salesman. The summary table I want looks like this:
periode     total_sales  salesman
2022-08-01  300          John
2022-08-01  0            Peter
2022-08-02  300          John
2022-08-02  100          Peter
2022-08-03  0            John
2022-08-03  0            Peter
2022-08-04  0            John
2022-08-04  300          Peter
2022-08-05  300          John
2022-08-05  0            Peter
I have created the following query (PSQL), but the results were not same as I want. Assume today is 2022-08-05
with dateseries as (
    select generate_series(current_date - '4 days'::interval,
                           current_date::date,
                           '1 day'::interval)::date as periode
)
select d.periode, coalesce(sum(s.sales), 0) as total_sales, s.salesman
from dateseries d
left outer join sales s
    on d.periode = s.sales_datetime::date
group by d.periode, s.salesman
order by d.periode
results:
periode     total_sales  salesman
2022-08-01  300          John
2022-08-02  300          John
2022-08-02  100          Peter
2022-08-03  0            (NULL)
2022-08-04  300          Peter
2022-08-05  300          John
Any advice would be great. Thank you.
Step by step: first aggregate the daily sales per salesperson (aggregated_sales CTE), create a list of days to report (days CTE), create a list of salesmen (salesmen CTE), and then query the sales for each day/salesman pair.
with aggregated_sales as
(
    select sales_datetime::date sales_date, sum(sales) sales, salesman
    from sales
    group by sales_datetime::date, salesman
),
days(sales_date) as
(
    select d::date
    from generate_series('2022-08-01', '2022-08-08', interval '1 day') d
),
salesmen (salesman) as
(
    select distinct salesman from sales
)
select sales_date, coalesce(sales, 0) sales, salesman
from (select * from days cross join salesmen) fl
left outer join aggregated_sales ags using (sales_date, salesman);
The query could be shorter if the CTEs were inlined, yet I think that clarity and readability are more important than mere size.
In order to "make a summary sales for 5 days period for each salesman" replace generate_series('2022-08-01', '2022-08-08', interval '1 day') with generate_series(current_date - 4, current_date, interval '1 day').
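The same three-step idea (aggregate, build every day/salesman pair, then look up the totals) can be sketched outside SQL. This is only an illustration in Python using the sample rows from the question; the variable names are made up for the example and "today" is the 2022-08-05 assumed in the question.

```python
from collections import defaultdict
from datetime import date, timedelta
from itertools import product

# Sample rows from the question, reduced to (sales date, sales, salesman).
sales = [
    (date(2022, 8, 1), 100, "John"),
    (date(2022, 8, 1), 200, "John"),
    (date(2022, 8, 2), 100, "Peter"),
    (date(2022, 8, 2), 300, "John"),
    (date(2022, 8, 4), 300, "Peter"),
    (date(2022, 8, 5), 100, "John"),
    (date(2022, 8, 5), 200, "John"),
]

# aggregated_sales: daily total per salesman.
aggregated = defaultdict(int)
for day, amount, salesman in sales:
    aggregated[(day, salesman)] += amount

# days: the five-day window ending on the assumed "today".
today = date(2022, 8, 5)
days = [today - timedelta(days=n) for n in range(4, -1, -1)]

# salesmen: distinct names.
salesmen = sorted({salesman for _, _, salesman in sales})

# Cross join days x salesmen, then look up the total (0 when there is none).
summary = [(d, aggregated.get((d, s), 0), s) for d, s in product(days, salesmen)]

for row in summary:
    print(*row)
```

Because the day/salesman pairs are built first, days without sales still produce a row with 0 for each salesman, which is what the plain left join from the date series alone could not do.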
"the results were not same as I want. Assume today is 2022-08-05"
Please note that '2022-08-05'::date - '5 days'::interval will give you 2022-07-31, and not 2022-08-01 as you assume. Because of that, I think you meant it to be current_date - '4 days'::interval.
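The off-by-one is easy to check with plain date arithmetic. A small Python sketch, assuming today is 2022-08-05 as in the question:

```python
from datetime import date, timedelta

today = date(2022, 8, 5)  # the "today" assumed in the question

five_back = today - timedelta(days=5)
four_back = today - timedelta(days=4)

# An inclusive window from four days back through today holds five dates.
window = [four_back + timedelta(days=n) for n in range((today - four_back).days + 1)]

print(five_back)    # 2022-07-31
print(four_back)    # 2022-08-01
print(len(window))  # 5
```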
With that out of the way, here is one possible query:
with sales_by_date as (
    select
        salesman,
        sales_datetime::date,
        sum(sales) total_sales
    from sales
    where
        -- assuming you need to have totals for salesmen that had sales in specified period only
        sales_datetime::date between current_date-'4 days'::interval and current_date
    group by
        salesman,
        sales_datetime::date
),
dateseries as (
    select
        distinct salesman,
        generate_series(current_date-'4 days'::interval, current_date, '1 day'::interval)::date as periode
    from sales_by_date
)
select
    d.periode,
    coalesce(s.total_sales, 0) total_sales,
    d.salesman
from dateseries d
left join sales_by_date s
    on d.periode = s.sales_datetime
    and d.salesman = s.salesman
order by d.periode, d.salesman;
But you still have to figure out some requirements for this problem. E.g. what if for the specified period there are no sales at all in the sales table?

ORACLE "connect by" to POSTGRESQL conversion

How can I convert/rewrite the SQL below for PostgreSQL? I'm not sure how to convert the connect by level and listagg to PostgreSQL.
select listagg(dt,' ') within group (order by lvl), NBR
from
(
select level lvl,
case when level=1 then TO_CHAR(a.dt2,'MM/DD/YYYY HH24:MI:SS')
else
TO_CHAR(a.dt2+(1/1440*30*(level-1)),'MM/DD/YYYY HH24:MI:SS')
end
dt,10 NBR from
(select to_date('08/11/2021 18:30:00','MM/DD/YYYY HH24:MI:SS') dt1,to_date('08/11/2021 01:30:00','MM/DD/YYYY HH24:MI:SS') dt2 from dual) a
start with level=0
connect by level<=1+(to_date('08/11/2021 18:30:00','MM/DD/YYYY HH24:MI:SS')-to_date('08/11/2021 13:30:00','MM/DD/YYYY HH24:MI:SS'))*24*2)
GROUP BY NBR;
Output:
08/11/2021 01:30:00
08/11/2021 02:00:00
08/11/2021 02:30:00
08/11/2021 03:00:00
08/11/2021 03:30:00
08/11/2021 04:00:00
08/11/2021 04:30:00
08/11/2021 05:00:00
08/11/2021 05:30:00
08/11/2021 06:00:00
08/11/2021 06:30:00
Not sure what the group by nbr is supposed to achieve - as far as I can tell this serves no purpose.
The convoluted connect by level in Oracle can be replaced with a simple generate_series() in Postgres.
So the following will generate 11 timestamp values from 2021-08-11 01:30:00 to 2021-08-11 06:30:00:
select g.dt
from generate_series(timestamp '2021-08-11 01:30:00',
                     timestamp '2021-08-11 06:30:00',
                     interval '30 minute') as g(dt)
This can then be aggregated back into a string using string_agg()
-- mm/dd/yyyy matches the MM/DD/YYYY mask of the Oracle query;
-- order by inside the aggregate keeps the values in series order
select string_agg(to_char(dt, 'mm/dd/yyyy hh24:mi:ss'), ' ' order by dt), 10 as nbr
from generate_series(timestamp '2021-08-11 01:30:00',
                     timestamp '2021-08-11 06:30:00',
                     interval '30 minute') as g(dt)
If you need the number of rows generated, you can use the with ordinality clause to get that:
select string_agg(to_char(dt, 'mm/dd/yyyy hh24:mi:ss'), ' ' order by idx), max(idx) as nbr
from generate_series(timestamp '2021-08-11 01:30:00',
                     timestamp '2021-08-11 06:30:00',
                     interval '30 minute') with ordinality as g(dt,idx)
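For readers who want to check the expected values independently of either database, the series and the aggregation can be mimicked in a few lines of Python. This is a sketch only; half_hour_series is a made-up helper, and the output uses the MM/DD/YYYY layout from the question's expected output.

```python
from datetime import datetime, timedelta


def half_hour_series(start, stop, step=timedelta(minutes=30)):
    """Yield start, start + step, ... up to and including stop."""
    current = start
    while current <= stop:
        yield current
        current += step


series = list(half_hour_series(datetime(2021, 8, 11, 1, 30),
                               datetime(2021, 8, 11, 6, 30)))

# Counterpart of string_agg(...): join the formatted values with a space.
joined = " ".join(ts.strftime("%m/%d/%Y %H:%M:%S") for ts in series)

# Counterpart of max(idx) from "with ordinality": the number of generated rows.
nbr = len(series)

print(nbr)     # 11
print(joined)
```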

Addition of timestamp is not working in redshift

select CAST(current_date as timestamp)
+ to_char(timestamp '2019-07-08 09:00:00','YYYY-MM-DD HH24:MI:SS')::TIMESTAMP;
Current output: 2019-08-02 00:00:002019-07-08 09:00:00
You can use the DATEADD() function or the INTERVAL syntax to add specific units of time to another date/timestamp.
SELECT DATEADD(hour, 9, current_date);
-- date_add
-- ---------------------
-- 2019-08-05 09:00:00
SELECT current_date + INTERVAL '9 hours';
-- ?column?
-- ---------------------
-- 2019-08-05 09:00:00
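The underlying point is that a point in time plus a duration is well defined, while a point in time plus another point in time is not. A Python analogy (not Redshift code; the date 2019-08-05 is simply taken from the sample output above as a stand-in for current_date):

```python
from datetime import date, datetime, time, timedelta

day = date(2019, 8, 5)  # stand-in for current_date
midnight = datetime.combine(day, time.min)

# Timestamp + duration: the counterpart of current_date + INTERVAL '9 hours'.
result = midnight + timedelta(hours=9)

# Timestamp + timestamp has no meaning; Python raises TypeError for it.
try:
    midnight + datetime(2019, 7, 8, 9, 0)
    added_two_timestamps = True
except TypeError:
    added_two_timestamps = False

print(result)                # 2019-08-05 09:00:00
print(added_two_timestamps)  # False
```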

how to get day difference in postgres

I want to find out how many days are left until "End_date" is reached in Postgres. What would be the equivalent of the following in Postgres?
Days_Left = Column in table - Today's date
GREATEST(INT4(CEIL(("NUMERIC"(DATE_PART('EPOCH'::"VARCHAR", (T1.End_date - "TIMESTAMP"(DATE('now'::"VARCHAR"))))) / '86400'::"NUMERIC"))), 0) AS DAYS_LEFT
--Thanks, I tried your suggestion but did not get the expected result.
Expected Result -- if use GREATEST(INT4(CEIL(("NUMERIC"(DATE_PART('EPOCH'::"VARCHAR", (CA.END_DATE - "TIMESTAMP"(DATE('now'::"VARCHAR"))))) / '86400'::"NUMERIC"))), 0)
End_date Days_left
2014-11-01 03:59:00 47
2016-01-01 04:59:59 473
2017-01-01 06:59:59 839
2014-12-31 22:59:00 107
Result - date(end_date) - date(current_date)
End_date Days_Left
2014-11-01 03:59:00 46
2016-01-01 04:59:59 472
2017-01-01 06:59:59 838
2014-12-31 22:59:00 106
Result - if use (end_date - current_date)
End_date Days_Left
2014-11-01 03:59:00 46 days 03:59
2016-01-01 04:59:59 472 days 04:59:59
2017-01-01 06:59:59 838 days 06:59:59
2014-12-31 22:59:00 106 days 22:59
Thanks
Sandy
If column_in_table is defined as a DATE you can use this:
select column_in_table - current_date as days_left
from the_table
Edit
As end_date is a timestamp, the above expression will return an interval, not an integer.
If you don't care about the hours and minutes left, casting the timestamp to a date should work:
select end_date::date - current_date as days_left
from the_table
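Note that the posted results differ by one because the original expression rounds any partial day up (CEIL of seconds / 86400, floored at 0 by GREATEST), whereas subtracting dates drops the time of day. A Python sketch of both calculations; the value of "today" (2014-09-16) is an assumption inferred from the posted results, and the function names are made up for the example.

```python
import math
from datetime import datetime

# Assumption: "today" inferred from the posted results (46 days 03:59 before 2014-11-01 03:59).
today = datetime(2014, 9, 16)


def days_left_ceil(end_date, now=today):
    """Round a partial day up and never go below zero, like CEIL(...) inside GREATEST(..., 0)."""
    seconds = (end_date - now).total_seconds()
    return max(math.ceil(seconds / 86400), 0)


def days_left_dates(end_date, now=today):
    """Ignore the time of day, like end_date::date - current_date."""
    return (end_date.date() - now.date()).days


end = datetime(2014, 11, 1, 3, 59)
print(days_left_ceil(end))   # 47
print(days_left_dates(end))  # 46
```

If the rounded-up figure is the one wanted, the date-based expression would need the same ceiling logic rather than a plain cast.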

Copy shifts from leap year to non-leap year

I need to copy all the shifts from 2012 to 2013 using T-SQL (SQL Server 2008 R2). There are 3 shifts per day. The start date and shift date are always the same. The end date (for shift C) is the next day.
As you can see, if I just use dateadd(year, 1, Col), I get 2 sets of records for 2013-02-28. Rows 4, 6 and 8 shouldn't be there (and will cause PK violations). Row 8 is wrong as the end time for shift C should be previous calendar day.
I have 67,000-ish rows in total to copy.
The only thing I can think of off the top of my head is to insert into a temp table, then somehow identify the dupes/incorrect records, delete them, and insert back into the shifts table. I'm sure there must be a better way.
Anyone got a cunning plan?
I'd like to create a general-purpose stored procedure that can copy a leap year to a non-leap year and vice versa.
Regards
Mark
Maybe try a DISTINCT list combined with a WHERE End > Start, as in this simplified example:
CREATE TABLE Shifts(ShiftCode CHAR, ShiftStart DATETIME, ShiftEnd DATETIME);
GO
INSERT Shifts
VALUES('A','2/26/2012 07:00:00','2/26/2012 15:00:00')
, ('B','2/26/2012 15:00:00','2/26/2012 23:00:00')
, ('C','2/26/2012 23:00:00','2/27/2012 07:00:00')
, ('A','2/27/2012 07:00:00','2/27/2012 15:00:00')
, ('B','2/27/2012 15:00:00','2/27/2012 23:00:00')
, ('C','2/27/2012 23:00:00','2/28/2012 07:00:00')
, ('A','2/28/2012 07:00:00','2/28/2012 15:00:00')
, ('B','2/28/2012 15:00:00','2/28/2012 23:00:00')
, ('C','2/28/2012 23:00:00','2/29/2012 07:00:00')
, ('A','2/29/2012 07:00:00','2/29/2012 15:00:00')
, ('B','2/29/2012 15:00:00','2/29/2012 23:00:00')
, ('C','2/29/2012 23:00:00','3/1/2012 07:00:00')
, ('A','3/1/2012 07:00:00','3/1/2012 15:00:00')
, ('B','3/1/2012 15:00:00','3/1/2012 23:00:00')
, ('C','3/1/2012 23:00:00','3/2/2012 07:00:00');
GO
SELECT DISTINCT ShiftCode
, ShiftStart = DATEADD(YYYY,1,ShiftStart)
, ShiftEnd = DATEADD(YYYY,1,ShiftEnd)
FROM Shifts
WHERE DATEADD(YYYY,1,ShiftEnd) > DATEADD(YYYY,1,ShiftStart)
ORDER BY DATEADD(YYYY,1,ShiftStart), ShiftCode
GO
Result:
A 2013-02-26 07:00:00.000 2013-02-26 15:00:00.000
B 2013-02-26 15:00:00.000 2013-02-26 23:00:00.000
C 2013-02-26 23:00:00.000 2013-02-27 07:00:00.000
A 2013-02-27 07:00:00.000 2013-02-27 15:00:00.000
B 2013-02-27 15:00:00.000 2013-02-27 23:00:00.000
C 2013-02-27 23:00:00.000 2013-02-28 07:00:00.000
A 2013-02-28 07:00:00.000 2013-02-28 15:00:00.000
B 2013-02-28 15:00:00.000 2013-02-28 23:00:00.000
C 2013-02-28 23:00:00.000 2013-03-01 07:00:00.000
A 2013-03-01 07:00:00.000 2013-03-01 15:00:00.000
B 2013-03-01 15:00:00.000 2013-03-01 23:00:00.000
C 2013-03-01 23:00:00.000 2013-03-02 07:00:00.000
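To see why DISTINCT plus the End > Start filter is enough, here is a small Python sketch of the same logic. The helper add_years is hypothetical and imitates how DATEADD(YYYY, 1, ...) maps Feb 29 onto Feb 28 in a non-leap year; the shift times come from the sample data above.

```python
import calendar
from datetime import datetime, timedelta


def add_years(ts, years):
    """Shift by whole years, clamping Feb 29 to Feb 28 when the target year is not a leap year."""
    year = ts.year + years
    last_day = calendar.monthrange(year, ts.month)[1]
    return ts.replace(year=year, day=min(ts.day, last_day))


# Shifts A/B/C (8 hours each) for the three days around the leap day.
shifts = []
for day in (datetime(2012, 2, 28), datetime(2012, 2, 29), datetime(2012, 3, 1)):
    for code, start_hour in (("A", 7), ("B", 15), ("C", 23)):
        start = day + timedelta(hours=start_hour)
        shifts.append((code, start, start + timedelta(hours=8)))

# A set removes the duplicated Feb 28 rows (DISTINCT); the condition drops
# the shift whose shifted end lands before its shifted start.
copied = sorted(
    {
        (code, add_years(start, 1), add_years(end, 1))
        for code, start, end in shifts
        if add_years(end, 1) > add_years(start, 1)
    },
    key=lambda row: (row[1], row[0]),
)

for code, start, end in copied:
    print(code, start, end)
```

Nine source rows become six: the two Feb 29 day shifts collapse into their Feb 28 twins, and the Feb 28 night shift (whose end would land at 07:00 on the same day) is filtered out in favour of the Feb 29 one.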
I figured it out, BUT then found that some resources were missing shifts for 2012.
I ended up creating them with a tally table and just doing fresh inserts for every shift for the year.
SELECT
    rh.PlanPressID
    ,DATEADD(hh,(24 / @NoOfShifts) * (t.N - 1),@StartDateTime) AS ShiftStart
    ,DATEADD(hh,(24 / @NoOfShifts) * (t.N),@StartDateTime) AS ShiftEnd
    ,CHAR((t.N - 1) % @NoOfShifts + 65) AS ShiftCode
    ,DATEADD(dd,0,DATEDIFF(dd,0,DATEADD(hh,(24 / @NoOfShifts) * (t.N - 1),@StartDateTime))) AS ShiftDate
    ,0 AS Personnel
FROM
    dbo.Tally t
    CROSS JOIN dbo.ResourceHeader AS rh
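The arithmetic in that query can be sketched in Python to show what each tally number produces. This is an illustration under assumptions: generate_shifts is a made-up name, the 07:00 start on 2013-01-01 is chosen to match shift A in the earlier sample data, and the cross join to the resource table is left out.

```python
from datetime import datetime, timedelta


def generate_shifts(start, days, shifts_per_day=3):
    """Build (code, start, end, shift_date) rows from a running number n, like Tally.N."""
    hours = 24 // shifts_per_day
    rows = []
    for n in range(1, days * shifts_per_day + 1):
        shift_start = start + timedelta(hours=hours * (n - 1))
        shift_end = start + timedelta(hours=hours * n)
        code = chr((n - 1) % shifts_per_day + 65)  # 65 is 'A'
        shift_date = shift_start.date()            # date part of the shift start
        rows.append((code, shift_start, shift_end, shift_date))
    return rows


# Assumed inputs: 365 days of three shifts, first shift starting at 07:00.
rows = generate_shifts(datetime(2013, 1, 1, 7, 0), days=365)

print(len(rows))  # 1095
print(rows[0])
print(rows[2])
```

Generating the rows from scratch sidesteps the leap-day problem entirely, since nothing is copied from 2012.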