Find if between dates having any date from a list of dates? - postgresql

I am trying to find if (select dates from public_holidays) exist in dates between start_date and end_date.
It will be look like this:
select
id, name, start_date, end_date,
case when (public_holidays = true) then number - 1 else number end as find_real_number
from my_table
Sample Data:
ID Name Start_date End_Date Numbers
1 Mike 3/9/2020 4/9/2020 67
2 Rick 3/1/2020 3/6/2020 34
3 Simm 3/24/2020 3/28/2020 98
4 Lisa 3/27/2020 4/5/2020 103
5 Rosy 3/9/2020 4/9/2020 23
And some sample expected results:
ID Name Start_date End_Date Numbers
1 Mike 3/9/2020 4/9/2020 66
2 Rick 3/1/2020 3/6/2020 34
3 Simm 3/24/2020 3/28/2020 98
4 Lisa 3/27/2020 4/5/2020 102
5 Rosy 3/9/2020 4/9/2020 23
Because we assume the 1st of April is a public holiday, so number of row 1st and 4th got minus by 1.
And sample public holidays view I created:
Public_holidays Dates
April fools 04/01/2020
Labour Day 05/01/2020
Random Day 07/24/2020
However, because I am building query on the Metabase, it does not allow me to create a table. All I did was create a view where has 2 columns that are 'Public Holidays' and 'Dates'
Anyone possibly could give me a suggestion of how to do this? Thanks.

Try something like this:
SELECT id, name, start_date, end_date,
numbers - ( SELECT COUNT(*) FROM holidays
WHERE dates BETWEEN t.start_date AND t.end_date ) AS numbers
FROM my_table AS t
This assumes that your holidays are in a table/view named holidays. Also it counts the holidays between start and end dates and subtract it from numbers of my_table.

I think you want to check public holiday falls or not between start and end Date.
so you should compare dates like below:
select
id, name, start_date, end_date,
case when ((CAST(start_date as date) < CAST(public_holiday_date as date)
and CAST(public_holiday_date as date) < CAST(end_date as date))
then number - 1 else number end as find_real_number
from my_table
or
select
id, name, start_date, end_date,
case when CAST(public_holiday_date as date)
between (CAST(start_date as date) and CAST(end_date as date)
then number - 1 else number end as find_real_number
from my_table

Related

Mixing DISTINCT with GROUP_BY Postgres

I am trying to get a list of:
all months in a specified year that,
have at least 2 unique rows based on their date
and ignore specific column values
where I got to is:
SELECT DATE_PART('month', "orderDate") AS month, count(*)
FROM public."Orders"
WHERE "companyId" = 00001 AND "orderNumber" != 1 and DATE_PART('year', ("orderDate")) = '2020' AND "orderNumber" != NULL
GROUP BY month
HAVING COUNT ("orderDate") > 2
The HAVING_COUNT sort of works in place of DISTINCT insofar as I can be reasonably sure that condition filters the condition of data required.
However, being able to use DISTINCT based on a given date within a month would return a more reliable result. Is this possible with Postgres?
A sample line of data from the table:
Sample Input
"2018-12-17 20:32:00+00"
"2019-02-26 14:38:00+00"
"2020-07-26 10:19:00+00"
"2020-10-13 19:15:00+00"
"2020-10-26 16:42:00+00"
"2020-10-26 19:41:00+00"
"2020-11-19 20:21:00+00"
"2020-11-19 21:22:00+00"
"2020-11-23 21:10:00+00"
"2021-01-02 12:51:00+00"
without the HAVING_COUNT this produces
month
count
7
1
10
2
11
3
Month 7 can be discarded easily as only 1 record.
Month 10 is the issue: we have two records. But from the data above, those records are from the same day. Similarly, month 11 only has 2 distinct records by day.
The output should therefore be ideally:
month
count
11
2
We have only two distinct dates from the 2020 data, and they are from month 11 (November)
I think you just want to take the distinct count of dates for each month:
SELECT
DATE_PART('month', orderDate) AS month,
COUNT(DISTINCT orderDate::date) AS count
FROM Orders
WHERE
companyId = 1 AND
orderNumber != 1 AND
DATE_PART('year', orderDate) = '2020'
GROUP BY
DATE_PART('month', orderDate)
HAVING
COUNT(DISTINCT orderDate::date) > 2;

I need to find the number of users that were invoiced for an amount greater than 0 in the previous month and were not invoiced in the current month

I need to find the number of users that were invoiced for an amount greater than 0 in the previous month and were not invoiced in the current month. This calcualtion is to be done for 12 months in a single query. Output should be as below.
Month Count
01/07/2019 50
01/08/2019 34
01/09/2019 23
01/10/2019 98
01/11/2019 10
01/12/2019 5
01/01/2020 32
01/02/2020 65
01/03/2020 23
01/04/2020 12
01/05/2020 64
01/06/2020 54
01/07/2020 78
I am able to get the value only for one month. I want to get it for all months in a single query.
This is my current query:
SELECT COUNT(DISTINCT TWO_MONTHS_AGO.USER_ID), TWO_MONTHS_AGO.MONTH AS INVOICE_MONTH
FROM (
SELECT USER_ID, LAST_DAY(invoice_ct_dt)) AS MONTH
FROM table a AS ID
WHERE invoice_amt > 0
AND LAST_DAY(invoice_ct_dt)) = ADD_MONTHS(LAST_DAY(CURRENT_DATE - 1), - 2)
GROUP BY user_id
) AS TWO_MONTHS_AGO
LEFT JOIN (
SELECT user_id,LAST_DAY(invoice_ct_dt)) AS MONTH
FROM table a AS ID
AND LAST_DAY(invoice_ct_dt)) = ADD_MONTHS(LAST_DAY(CURRENT_DATE - 1), - 1)
GROUP BY USER_ID
) AS ONE_MONTH_AGO ON TWO_MONTHS_AGO.USER_ID = ONE_MONTH_AGO.USER_ID
WHERE ONE_MONTH_AGO.USER_ID IS NULL
GROUP BY INVOICE_MONTH;
Thank you in advance.
Lona
Probably lots of different approaches but the way I would do it is as follows:
Summarise data by user and month for the last 13 months (you need 12 months plus the previous month to that first month
Compare "this" month (that has data) to "next" month and select records where there is no "next" month data
Summarise this dataset by month and distinct userid
For example, assuming a table created as follows:
create table INVOICE_DATA (
USERID varchar(4),
INVOICE_DT date,
INVOICE_AMT NUMBER(10,2)
);
the following query should give you what you want - you may need to adjust it depending on whether you are including this month, or only up to the end of last month, in your calculation, etc.:
--Summarise data by user and month
WITH MONTH_SUMMARY AS
(
SELECT USERID
,TO_CHAR(INVOICE_DT,'YYYY-MM') "INVOICE_MONTH"
,TO_CHAR(ADD_MONTHS(INVOICE_DT,1),'YYYY-MM') "NEXT_MONTH"
,SUM(INVOICE_AMT) "MONTHLY_TOTAL"
FROM INVOICE_DATA
WHERE INVOICE_DT >= TRUNC(ADD_MONTHS(current_date(),-13),'MONTH') -- Last 13 months of data
GROUP BY 1,2,3
),
--Get data for users with invoices in this month but not the next month
USER_DATA AS
(
SELECT USERID, INVOICE_MONTH, MONTHLY_TOTAL
FROM MONTH_SUMMARY MS_THIS
WHERE NOT EXISTS
(
SELECT USERID
FROM MONTH_SUMMARY MS_NEXT
WHERE
MS_THIS.USERID = MS_NEXT.USERID AND
MS_THIS.NEXT_MONTH = MS_NEXT.INVOICE_MONTH
)
AND MS_THIS.INVOICE_MONTH < TO_CHAR(current_date(),'YYYY-MM') -- Don't include this month as obviously no next month to compare to
)
SELECT INVOICE_MONTH, COUNT(DISTINCT USERID) "USER_COUNT"
FROM USER_DATA
GROUP BY INVOICE_MONTH
ORDER BY INVOICE_MONTH
;

Getting attendance of an employee with a date series in a particular range in Postgres

I have a attendance table with employee_id, date and punch-in time.
Emp_Id PunchTime
101 10/10/2016 07:15
101 10/10/2016 12:20
101 10/10/2016 12:50
101 10/10/2016 16:31
102 10/10/2016 07:15
Here I have the date only for the working days. I want to get the attendance list of a employee with series of given date period. I need the day also. Result should look like as follows
date | day |employee_id | Intime | outtime |
2016-10-09 | sunday | 101 | | |
2016-10-10 | monday | 101 | 2016-10-10 7:15AM |2016-10-10 4:31 PM |
You can generate a list of dates and then do an outer join on them:
The following displays all days in October:
select d.date, a.emp_id,
min(punchtime) as intime,
max(punchtime) as outtime
from generate_series(date '2016-10-01', date '2016-11-01' - 1, interval '1' day) as d (date)
left join attendance a on d.date = a.punchtime::date
group by d.date, a.emp_id;
order by d.date, a.emp_id;
As you want the first and last timestamp from each day this can be done using a simple group by query.
This will however not repeat the emp_id for the non_existing days.
Something like the following will generate a list of the range of dates (starting and ending with whatever range is found in your punchtime table), with employees and intime, outtime for each. Check the SQL fiddle here:
http://sqlfiddle.com/#!15/d93bd/1
WITH RECURSIVE minmax AS
(
SELECT MIN(CAST(time AS DATE)) AS min, MAX(CAST(time as DATE)) AS max
FROM emp_time
),
dates AS
(
SELECT m.min as datepart
FROM minmax m
RIGHT JOIN emp_time e ON m.min = CAST(e.time as DATE)
UNION ALL
SELECT d.datepart + 1 FROM dates d, minmax mm
WHERE d.datepart + 1 <= mm.max
)
SELECT d.datepart as date, e.emp, MIN(e.time) as intime, MAX(e.time) as outtime FROM dates d
LEFT JOIN emp_time e ON d.datepart = CAST(e.time as DATE)
GROUP BY d.datepart, e.emp
ORDER BY d.datepart;

Insert subquery date according to day

I would like to insert subquery a date based on it day. Plus, each date can only be used four times. Once it reached fourth times, the fifth value will use another date of same day. In other word, use date of Monday of next week. Example, Monday with 6 JUNE 2016 to Monday with 13 JUNE 2016 (you may check the calendar).
I have a query of getting a list of date based on presentationdatestart and presentationdateend from presentation table:
select a.presentationid,
a.presentationday,
to_char (a.presentationdatestart + delta, 'DD-MM-YYYY', 'NLS_CALENDAR=GREGORIAN') list_date
from presentation a,
(select level - 1 as delta
from dual
connect by level - 1 <= (select max (presentationdateend - presentationdatestart)
from presentation))
where a.presentationdatestart + delta <= a.presentationdateend
and a.presentationday = to_char(a.presentationdatestart + delta, 'fmDay')
order by a.presentationdatestart + delta,
a.presentationid; --IMPORTANT!!!--
For example,
presentationday presentationdatestart presentationdateend
Monday 01-05-2016 04-06-2016
Tuesday 01-05-2016 04-06-2016
Wednesday 01-05-2016 04-06-2016
Thursday 01-05-2016 04-06-2016
The query result will list all possible dates between 01-05-2016 until 04-06-2016:
Monday 02-05-2016
Tuesday 03-05-2016
Wednesday 04-05-2016
Thursday 05-05-2016
....
Monday 30-05-2016
Tuesday 31-05-2016
Wednesday 01-06-2016
Thursday 02-06-2016 (20 rows)
This is my INSERT query :
insert into CSP600_SCHEDULE (studentID,
studentName,
projectTitle,
supervisorID,
supervisorName,
examinerID,
examinerName,
exavailableID,
availableday,
availablestart,
availableend,
availabledate)
select '2013816591',
'mong',
'abc',
'1004',
'Sue',
'1002',
'hazlifah',
2,
'Monday', //BASED ON THIS DAY
'12:00:00',
'2:00:00',
to_char (a.presentationdatestart + delta, 'DD-MM-YYYY', 'NLS_CALENDAR=GREGORIAN') list_date //FOR AVAILABLEDATE
from presentation a,
(select level - 1 as delta
from dual
connect by level - 1 <= (select max (presentationdateend - presentationdatestart)
from presentation))
where a.presentationdatestart + delta <= a.presentationdateend
and a.presentationday = to_char(a.presentationdatestart + delta, 'fmDay')
order by a.presentationdatestart + delta,
a.presentationid;
This query successfully added 20 rows because all possible dates were 20 rows. I would like modify the query to be able to insert based on availableDay and each date can only be used four times for each different studentID.
Possible outcome in CSP600_SCHEDULE (I am removing unrelated columns to ease readability):
StudentID StudentName availableDay availableDate
2013 abc Monday 01-05-2016
2014 def Monday 01-05-2016
2015 ghi Monday 01-05-2016
2016 klm Monday 01-05-2016
2010 nop Tuesday 02-05-2016
2017 qrs Tuesday 02-05-2016
2018 tuv Tuesday 02-05-2016
2019 wxy Tuesday 02-05-2016
.....
2039 rrr Monday 09-05-2016
.....
You may check the calendar :)
I think what you're asking for is to list your students and then batch them up in groups of 4 - each batch is then allocated to a date. Is that right?
In which case something like this should work (I'm using a list of tables as the student names just so I don't need to insert any data into a custom table) :
WITH students AS
(SELECT table_name
FROM all_tables
WHERE rownum < 100
)
SELECT
table_name
,SYSDATE + (CEIL(rownum/4) -1)
FROM
students
;
I hope that helps you
...okay, following your comments, I think this might be a better solution :
WITH students AS
(SELECT table_name student_name
FROM all_tables
WHERE rownum < 100
)
, dates AS
(SELECT TRUNC(sysdate) appointment_date from dual UNION
SELECT TRUNC(sysdate+2) from dual UNION
SELECT TRUNC(sysdate+4) from dual UNION
SELECT TRUNC(sysdate+6) from dual UNION
SELECT TRUNC(sysdate+8) from dual UNION
SELECT TRUNC(sysdate+10) from dual UNION
SELECT TRUNC(sysdate+12) from dual UNION
SELECT TRUNC(sysdate+14) from dual
)
SELECT
s.student_name
,d.appointment_date
FROM
--get a list of students each with a sequential row number, ordered by student name
(SELECT
student_name
,ROW_NUMBER() OVER (ORDER BY student_name) rn
FROM students
) s
--get a list of available dates with a sequential row number, ordered by date
,(SELECT
appointment_date
,ROW_NUMBER() OVER (ORDER BY appointment_date) rn
FROM dates
) d
WHERE 1=1
--allocate the first four students to date rownumber1, next four students to date rownumber 2...
AND CEIL(s.rn/4) = d.rn
;

Postgres: longest streak per developer regardless of Saturdays and Sundays

I got the information I needed from my last post about Postgres: Defining the longest streak (in days) per developer.
However now I want know the longest streak per developer regardless of Saturdays or Sundays. For instance, Bob worked from Thursday 18, Friday 19, Monday 22 and Tuesday 23, hence Bob streak is 4 days.
I understand I can use the DOW window function, which gives me 0 as Sunday , 1 Monday and so on. But
I don’t see how I can apply DOW function in the last solution proposed by Gordon Linoff.
Can some of you help me in this matter? Cheers,
WITH
working_limits AS (
SELECT
MIN(mr_date) AS start_date,
MAX(mr_date) AS end_date
FROM
xxx
),
working_days AS (
SELECT
ROW_NUMBER() OVER () AS day_number,
s.d::date AS date
FROM
GENERATE_SERIES((SELECT start_date FROM working_limits),
(SELECT end_date FROM working_limits),
'1 day') AS s(d)
WHERE
EXTRACT(dow FROM s.d) BETWEEN 1 AND 5),
worked_days AS (
SELECT
ROW_NUMBER() OVER () AS day_number,
developer,
mr_date AS date
FROM
xxx
ORDER BY
developer,
mr_date
)
SELECT
y.developer,
MAX(y.days)
FROM (
SELECT
x.developer,
COUNT(*) AS days
FROM (
SELECT
wngd.date,
wd.developer,
wngd.day_number - wd.day_number AS delta
FROM
working_days wngd INNER JOIN worked_days wd
ON
wngd.date = wd.date) AS x
GROUP BY
x.developer,
x.delta) AS y
GROUP BY
y.developer;