Changes from SQL Query and Update Statement - tsql

I have a query that produces a random departure date from 1 to 28 days after the arrival date field:
--Query--
SELECT ArrivalDate, DATEADD(day, 1 + RAND(checksum(NEWID()))
* LengthOfStay.LengthofStay, ArrivalDate) AS DepartureDate
FROM Bookings, LengthOfStay
However, when I run the Update query the randomising is reduced to 1 or 2 days. Can anyone advise why this is?
--Update Statement--
USE Occupancy
Update B
Set DepartureDate = DATEADD(day, 1 + RAND(checksum(NEWID()))*1.5 * L.LengthofStay, B.ArrivalDate)
FROM LengthOfStay L, Bookings B
Thanks
Wayne

I used the solution below:
UPDATE BOOKINGS
SET DepartureDate =
DATEADD(day,
CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0 and 0.3 THEN 2 ELSE
CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0.3 and 0.5 THEN 3 ELSE
Round(Rand(CHECKSUM(NEWID())) * 28,0) END END,ArrivalDate)

Related

How can I, in T-SQL, examine date intervals to remove overlapping intervals before adding totals together

I am running an analysis on medication prescribing practices. We want to identify whether someone has been on a class of medications for 60 days out of a 90 day quarter. We have a start and end date for each prescription, and the bounds of the quarter (e.g., 4/1/2022 – 6/30/2022). For each prescription I’ve calculated the number of days between the start and end date (only including days that fall within the bounds of the quarter). There are many instances in which multiple drugs within the same class are prescribed: someone might try one antidepressant but not like it, and so be given another in the same class.
My original strategy was just to total up number of days for each class of medication and see if it’s 60 or over. The days don’t have to be consecutive, but if they overlap, days during an overlap period shouldn’t count twice (which they would in a simple sum).
For instance in the data table below, patient 1 in row 1 should be included as they are over 60 days. Patient 2 should also get in (rows 2 and 3) because the non-overlapping total (57+8) within the same med class gets them to over 60 days. However, patient 3 should NOT get in, even though the total of 32 + 32 is over 60 because the intervals overlap. This means that they were really on the medication class for only 32 days – this is an instance where someone might be on two different antidepressants simultaneously.
It’s not sufficient to just sum the days in the interval, but I also have to include some way to examine whether the intervals are overlapping and only add days if an interval for a given medication class falls outside another interval for that same class.
Row num  Patid  Med class  Start date  End date    Interval
1        1      A          2022-04-28  2022-09-12  63
2        2      B          2022-05-03  2022-06-29  57
3        2      B          2022-04-21  2022-04-29  8
4        3      A          2022-01-19  2022-05-03  32
5        3      A          2022-01-19  2022-05-03  32
I’m having a hard time figuring out how to do this. Note, I'm limited to just using SQL for this.
Code that produced the above data. I would embed this in another query to generate a total interval but need to deal with the overlap issue.
DECLARE @startdt DATE;
DECLARE @enddt DATE;
SET @startdt='4/1/2022'
SET @enddt='6/30/2022'
--for q4 fy2022-23 (4/1/2022-6/30/2022)
SELECT DISTINCT
rx.patid, d.medication_category as medcat, start_date, end_date,
-- case statement to capture days within quarter only
CASE WHEN start_date<@startdt and end_date>@enddt then 90
WHEN start_date<@startdt and end_date>=@startdt then datediff(d,@startdt,end_date)
WHEN start_date>=@startdt and end_date>@enddt then datediff(d,start_date,@enddt)
ELSE datediff(d,start_date,end_date)
END as interval
FROM rx
INNER JOIN Drug_names_categories d
ON rx.drugname=d.drugname
WHERE start_date<'7/1/2022' and end_date>'3/30/2022'
AND rx.patid IS NOT NULL
AND d.medication_category IS NOT NULL
AND d.medication_category <>''
You can accomplish what you want by generating a calendar table (using a Common Table Expression) of individual days within the test range, joining those days with the prescriptions with overlapping days, and then counting distinct days for each patient and medication category combination.
Something like:
DECLARE @startdt DATE = '2022-04-01';
DECLARE @enddt DATE = '2022-06-30';
DECLARE @threshold INT = 60;
WITH Days AS (
SELECT @startdt AS Day
UNION ALL
SELECT DATEADD(day, 1, Day)
FROM Days
WHERE Day < @enddt
)
SELECT
rx.patid, d.medication_category as medcat,
COUNT(DISTINCT DD.Day) AS days_medicated,
MIN(DD.Day) AS start_date,
MAX(DD.Day) AS end_date
FROM rx
INNER JOIN Drug_names_categories d
ON rx.drugname = d.drugname
INNER JOIN Days DD
ON DD.Day BETWEEN rx.start_date AND rx.end_date
WHERE rx.start_date <= @enddt AND @startdt <= rx.end_date
GROUP BY rx.patid, d.medication_category
HAVING COUNT(DISTINCT DD.Day) >= @threshold
ORDER BY rx.patid, start_date;
If using SQL Server 2022 or later, the Days generator can be simplified by using the new GENERATE_SERIES() function:
WITH Days AS (
SELECT DATEADD(day, S.value, @startdt) AS Day
FROM GENERATE_SERIES(0, DATEDIFF(day, @startdt, @enddt)) S
)
See this db<>fiddle for an example with some sample data.
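The counting logic behind the calendar-join approach can also be checked outside the database. Below is a minimal Python sketch (Python is used only to illustrate the logic, it is not a replacement for the SQL). It assumes the five sample rows from the question and treats both endpoints as inclusive, as the BETWEEN join does, so its totals run one day higher per prescription than the DATEDIFF-based interval column. The names prescriptions and days_medicated are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical sample rows mirroring the question's table:
# (patid, med class, start date, end date)
prescriptions = [
    (1, "A", date(2022, 4, 28), date(2022, 9, 12)),
    (2, "B", date(2022, 5, 3), date(2022, 6, 29)),
    (2, "B", date(2022, 4, 21), date(2022, 4, 29)),
    (3, "A", date(2022, 1, 19), date(2022, 5, 3)),
    (3, "A", date(2022, 1, 19), date(2022, 5, 3)),
]

def days_medicated(rows, quarter_start, quarter_end):
    """Count distinct in-quarter days per (patient, class), endpoints inclusive."""
    covered = {}
    for patid, medcat, start, end in rows:
        day = max(start, quarter_start)   # clip to the quarter
        last = min(end, quarter_end)
        days = covered.setdefault((patid, medcat), set())
        while day <= last:
            days.add(day)                 # a set ignores days already counted
            day += timedelta(days=1)
    return {key: len(days) for key, days in covered.items()}

totals = days_medicated(prescriptions, date(2022, 4, 1), date(2022, 6, 30))
qualifying = sorted(key for key, n in totals.items() if n >= 60)
print(totals)
print(qualifying)
```

With these rows, patients 1 and 2 reach the 60-day threshold, while patient 3's two identical intervals collapse into a single run of days and do not.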
I would do this using a date/calendar table, then it's pretty easy.
If you don't already have a date table, this link is one of many that describe how to create one easily ( https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server/ )
Here's the script from this link (in case the link dies)
DECLARE @StartDate date = '20100101';
DECLARE @CutoffDate date = DATEADD(DAY, -1, DATEADD(YEAR, 30, @StartDate));
;WITH seq(n) AS
(
SELECT 0 UNION ALL SELECT n + 1 FROM seq
WHERE n < DATEDIFF(DAY, @StartDate, @CutoffDate)
),
d(d) AS
(
SELECT DATEADD(DAY, n, @StartDate) FROM seq
),
src AS
(
SELECT
TheDate = CONVERT(date, d),
TheDay = DATEPART(DAY, d),
TheDayName = DATENAME(WEEKDAY, d),
TheWeek = DATEPART(WEEK, d),
TheISOWeek = DATEPART(ISO_WEEK, d),
TheDayOfWeek = DATEPART(WEEKDAY, d),
TheMonth = DATEPART(MONTH, d),
TheMonthName = DATENAME(MONTH, d),
TheQuarter = DATEPART(Quarter, d),
TheYear = DATEPART(YEAR, d),
TheFirstOfMonth = DATEFROMPARTS(YEAR(d), MONTH(d), 1),
TheLastOfYear = DATEFROMPARTS(YEAR(d), 12, 31),
TheDayOfYear = DATEPART(DAYOFYEAR, d)
FROM d
)
SELECT *
INTO MyDateTable
FROM src
ORDER BY TheDate
OPTION (MAXRECURSION 0);
Now that you have your new date table you can join to it to get the list of dates that are within the start and end date, something like
SELECT COUNT(DISTINCT dt.TheDate)
FROM rx
INNER JOIN MyDateTable dt on dt.TheDate BETWEEN rx.start_date AND rx.end_date
INNER JOIN Drug_names_categories d ON rx.drugname=d.drugname
WHERE start_date<'7/1/2022' and end_date>'3/30/2022'
AND rx.patid IS NOT NULL
AND d.medication_category IS NOT NULL
AND d.medication_category <>''
Obviously this is a simple example but you could extend this easily to include all the details you need. The point is that you now have a list of dates, or a distinct list of dates, which you can work with easily.
You could also simplify the date range applied by referencing the TheQuarter and TheYear columns. If this is a common task, consider extending the date table to contain a compound YearQuarter column (e.g. 2023Q1/202301 etc.).

Get difference between two dates in Years

I am working with a table that has StartDate and EndDate fields. I need to find the difference between them in years.
Example:
StartDate = 1/1/2017
EndDate = 12/31/2017
I expect Result = 1 for the date difference.
Also, I'd like to round it to nearest whole number.
Example:
StartDate = 1/1/2017
EndDate = 11/30/2017
I expect Result = 1 for the date difference.
Using datediff function, I am able to get the result, but it isn't rounding to nearest whole number.
Example query:
I am getting 6 years even though 65 months / 12 would be less than 5.5:
select (DATEDIFF(yy, '01/01/2016', '5/31/2021')
+ CASE WHEN abs(DATEPART(day, '01/01/2016') - DATEPART(day, '05/31/2021')) > 15 THEN 1 ELSE 0 END)
select (DATEDIFF(mm, '01/01/2016', '05/31/2021')
+ CASE WHEN abs(DATEPART(day, '01/01/2016') - DATEPART(day, '05/31/2021')) > 15 THEN 1 ELSE 0 END)
DECLARE @startdate DATETIME = '1-1-2017',
@enddate DATETIME = '12-31-2018'
SELECT @startdate as StartDate, @enddate as EndDate,
DATEDIFF(YEAR, @startdate, @enddate)
-
(CASE
WHEN DATEADD(YEAR,
DATEDIFF(YEAR, @startdate,@enddate), @startdate)
> @enddate THEN 1
ELSE 0 END) 'Date difference in Years'
Use this code, I hope it will help you.
So far the following query seems to be working okay. My mistake was dividing by 12 instead of 12.0; with an integer divisor the fraction is discarded before ROUND ever sees it. Who knew! :
select
Round((DATEDIFF(mm, '01/01/2016', '07/1/2017')
+ CASE WHEN abs(DATEPART(day, '01/01/2016') - DATEPART(day, '06/30/2017')) > 15 THEN 1 ELSE 0 END) / 12.0, 0)
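To see why the divisor matters, here is a small Python sketch of the arithmetic (an illustration only, not T-SQL). It assumes the month expression above evaluates to 19: 18 month boundaries between 01/01/2016 and 07/1/2017, plus 1 from the day adjustment. The helper name years_rounded is hypothetical, and Decimal with ROUND_HALF_UP is used because Python's built-in round() treats exact halves differently.

```python
from decimal import Decimal, ROUND_HALF_UP

def years_rounded(months):
    """Round months / 12 to the nearest whole year, with halves rounding up."""
    years = Decimal(months) / Decimal(12)
    return int(years.quantize(Decimal("1"), rounding=ROUND_HALF_UP))

months = 19  # assumed value of the adjusted month count in the query above

# Integer division keeps only the whole part, so rounding afterwards changes nothing.
print(months // 12)

# Dividing as a decimal keeps the fraction (1.58...), which then rounds to 2.
print(years_rounded(months))
```

The same helper gives 5 for the 65 months mentioned in the question, since 65 / 12 is about 5.42.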
This may be a bit old but when using Oracle SQL Developer you can use the following. Just add your Dates below. I was using DateTime. This was used to get years between 0 and 10.
TRUNC((MONTHS_BETWEEN(<DATE_ONE>, <DATE_TWO>) * 31) / 365) > 0 and TRUNC((MONTHS_BETWEEN(<DATE_ONE>, <DATE_TWO>) * 31) / 365) < 10

Column of counts for time intervals

I want to get a table that constructs a column that tracks how many times an id appears in a given week. If the id appears once it is given a 1, if it appears twice it is given a 2, but if it appears more than two times it is given a 0.
id date
a 2015-11-10
a 2015-11-25
a 2015-11-09
b 2015-11-10
b 2015-11-09
a 2015-11-05
b 2015-11-23
b 2015-11-28
b 2015-12-04
a 2015-11-10
b 2015-12-04
a 2015-12-07
a 2015-12-09
c 2015-11-30
a 2015-12-06
c 2015-10-31
c 2015-11-04
b 2015-12-01
a 2015-10-30
a 2015-12-14
The one-week intervals are as follows:
1 - 2015-10-30 to 2015-11-05
2 - 2015-11-06 to 2015-11-12
3 - 2015-11-13 to 2015-11-19
4 - 2015-11-20 to 2015-11-26
5 - 2015-11-27 to 2015-12-03
6 - 2015-12-04 to 2015-12-10
7 - 2015-12-11 to 2015-12-17
The table should look like this.
id interval count
a 1 2
b 1 0
c 1 2
a 2 0
b 2 2
c 2 0
a 3 0
b 3 0
c 3 0
a 4 1
b 4 1
c 4 0
a 5 0
b 5 2
c 5 1
a 6 0
b 6 2
c 6 0
a 7 1
b 7 0
c 7 0
The interval column doesn't have to be there, I simply added it for clarity.
I am new to sql and am unsure how to break the dates into intervals. The only thing I have is grouping by date and counting.
Select id ,date, count (*) as frequency
from data_1
group by id, date having frequency <= 2;
Looking at just the data you provided, this does the trick:
SELECT v.id,
i.interval,
coalesce((CASE WHEN sub.cnt < 3 THEN sub.cnt ELSE 0 END), 0) AS count
FROM (VALUES('a'), ('b'), ('c')) v(id)
CROSS JOIN generate_series(1, 7) i(interval)
LEFT JOIN (
SELECT id, ((date - '2015-10-30')/7 + 1)::int AS interval, count(*) AS cnt
FROM my_table
GROUP BY 1, 2) sub USING (id, interval)
ORDER BY 2, 1;
A few words of explanation:
You have three id values which are here recreated with a VALUES clause. If you have many more or don't know beforehand which id's to enumerate, you can always replace the VALUES clause with a sub-query.
You provide a specific date range over 7 weeks. Since you might have weeks where a certain id is not present you need to generate a series of the interval values and CROSS JOIN that to the id values above. This yields the 21 rows you are looking for.
Then you calculate the occurrences of ids in intervals. You can subtract a date from another date, which gives the number of days in between. So subtract the earliest date from the date of the row, divide that by 7 to get the interval period, add 1 to make the interval 1-based and convert to integer. You can then convert counts of > 2 to 0 and NULL to 0 with a combination of CASE and coalesce().
The query outputs the interval too, otherwise you will have no clue what the data refers to. Optionally, you can turn this into a column which shows the date range of the interval.
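The bucketing arithmetic described above can be sanity-checked with a short Python sketch (illustrative only; the names raw, rows and weekly_counts are hypothetical). It assumes the 20 sample rows from the question and 2015-10-30 as the first day of interval 1.

```python
from collections import Counter
from datetime import date

# Sample data from the question, one "id date" pair per line.
raw = """a 2015-11-10
a 2015-11-25
a 2015-11-09
b 2015-11-10
b 2015-11-09
a 2015-11-05
b 2015-11-23
b 2015-11-28
b 2015-12-04
a 2015-11-10
b 2015-12-04
a 2015-12-07
a 2015-12-09
c 2015-11-30
a 2015-12-06
c 2015-10-31
c 2015-11-04
b 2015-12-01
a 2015-10-30
a 2015-12-14"""
rows = [(rid, date.fromisoformat(d)) for rid, d in
        (line.split() for line in raw.splitlines())]

def weekly_counts(rows, first_day, n_intervals):
    """Map (id, interval) to the count, with counts above 2 and gaps both as 0."""
    # Days since the first day, divided by 7, plus 1 gives a 1-based interval.
    counts = Counter((rid, (d - first_day).days // 7 + 1) for rid, d in rows)
    ids = sorted({rid for rid, _ in rows})
    return {(rid, i): (counts[(rid, i)] if counts[(rid, i)] < 3 else 0)
            for i in range(1, n_intervals + 1) for rid in ids}

table = weekly_counts(rows, date(2015, 10, 30), 7)
for (rid, interval), count in table.items():
    print(rid, interval, count)
```

This produces the 21 id/interval combinations from the expected table in the question; a Counter returns 0 for missing keys, which plays the role of coalesce() here.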
More flexible solution
If you have more ids and a larger date range, you can use the below version which first determines the distinct ids and the date range. Note that the interval is now 0-based to make calculations easier. Not that it matters much because instead of the interval number, the corresponding date range is displayed.
WITH mi AS (
SELECT min(date) AS min, ((max(date) - min(date))/7)::int AS intv FROM my_table)
SELECT v.id,
to_char((mi.min + i.intv * 7)::timestamp, 'YYYY-mm-dd') || ' - ' ||
to_char((mi.min + i.intv * 7 + 6)::timestamp, 'YYYY-mm-dd') AS period,
coalesce((CASE WHEN sub.cnt < 3 THEN sub.cnt ELSE 0 END), 0) AS count
FROM mi,
(SELECT DISTINCT id FROM my_table) v
CROSS JOIN LATERAL generate_series(0, mi.intv) i(intv)
LEFT JOIN LATERAL (
SELECT id, ((date - mi.min)/7)::int AS intv, count(*) AS cnt
FROM my_table
GROUP BY 1, 2) sub USING (id, intv)
ORDER BY 2, 1;
SQLFiddle with both solutions.
Assuming you have a table of all users, this will do the trick.
select
users.id,
interval_table.id,
CASE
WHEN count(log_table.user_id)>2 THEN 0
ELSE count(log_table.user_id)
END
from users
cross join interval_table
left outer join log_table
on users.id = log_table.user_id
and log_table.event_date >= interval_table.start_interval
and log_table.event_date < interval_table.stop_interval
group by users.id, interval_table.id
order by interval_table.id, users.id
Check it out: http://sqlfiddle.com/#!15/1a822/21

TSQL Fill gaps in months and repeat values in column

Started with TSQL last Wednesday...
I have the following data in tblStage1:
PROJECT USERNAME DATE PERCENTAGE
--------- ----------------- ------------ ----------------------
Project 1 DOMAIN\Chris.User 03/01/2013 0.25
Project 1 DOMAIN\Chris.User 05/01/2013 0.75
Project 1 DOMAIN\Chris.User 07/01/2013 1
Project 1 DOMAIN\John.User 02/01/2013 1
Project 1 DOMAIN\John.User 06/01/2013 0.5
I have the following data in tblRawData
PROJECT START_DATE END_DATE
---------- ----------- ----------
Project 1 01/01/2013 09/01/2013
I would like to get the following data out into tblStage2 (data points are bound by START_DATE and END_DATE):
PROJECT USERNAME DATE PERCENTAGE
--------- ----------------- ------------ ----------------------
Project 1 DOMAIN\Chris.User 01/01/2013 0
Project 1 DOMAIN\Chris.User 02/01/2013 0
Project 1 DOMAIN\Chris.User 03/01/2013 0.25
Project 1 DOMAIN\Chris.User 04/01/2013 0.25
Project 1 DOMAIN\Chris.User 05/01/2013 0.75
Project 1 DOMAIN\Chris.User 06/01/2013 0.75
Project 1 DOMAIN\Chris.User 07/01/2013 1
Project 1 DOMAIN\Chris.User 08/01/2013 1
Project 1 DOMAIN\Chris.User 09/01/2013 1
Project 1 DOMAIN\John.User 01/01/2013 0
Project 1 DOMAIN\John.User 02/01/2013 1
Project 1 DOMAIN\John.User 03/01/2013 1
Project 1 DOMAIN\John.User 04/01/2013 1
Project 1 DOMAIN\John.User 05/01/2013 1
Project 1 DOMAIN\John.User 06/01/2013 0.5
Project 1 DOMAIN\John.User 07/01/2013 0.5
Project 1 DOMAIN\John.User 08/01/2013 0.5
Project 1 DOMAIN\John.User 09/01/2013 0.5
I realize that there are a number of topics that relate to this subject such as this. In my case, I don't have any particular restrictions and I am looking for a clean routine that is relatively easy to understand.
I know there is a DateAdd function, but I haven't seen any INSERT INTO commands in the example statements. I am confused as to how one would iterate through the data set and create the interpolated values. I am still too green to understand the full context of the other examples and would greatly appreciate any help or clarification.
Edit Added additional information to the sample data for a better indication of my final goal. I will have multiple users in this data set. The USERNAME column is placed into the data set by the original source (a people picker on an InfoPath form). All "Percentages" are "0" until the first value is assigned then they retain that value until it is changed or the project reaches its end date. I hope this helps clarify!
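The rule in the edit above (0 until the first value is assigned, then repeat the last value until it changes or the project ends) can be sketched in a few lines of Python. This is only an illustration of the intended result, not T-SQL. It assumes the sample dates are MM/DD/YYYY and always fall on the first of the month; month_starts and fill_forward are hypothetical names.

```python
from datetime import date

def month_starts(start, end):
    """Yield the first day of each month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1

def fill_forward(points, start, end):
    """points maps a month-start date to a percentage; gaps repeat the last value."""
    filled, current = [], 0  # 0 until the first value is assigned
    for day in month_starts(start, end):
        current = points.get(day, current)
        filled.append((day, current))
    return filled

# Chris.User's rows from tblStage1, bounded by the project's start and end dates.
chris = {date(2013, 3, 1): 0.25, date(2013, 5, 1): 0.75, date(2013, 7, 1): 1}
result = fill_forward(chris, date(2013, 1, 1), date(2013, 9, 1))
print([value for _, value in result])
```

Running the same function per user reproduces the desired tblStage2 rows, for example nine monthly rows for Chris.User starting with two zeros.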
It's not clear how USERNAME is populated. I assume you have the same USERNAME on each project here. The CTE below just builds a DATE table; if you have your own date table, you can skip this part.
SQL Fiddle
DECLARE @ENDDate DATETIME
SELECT @ENDDate = MAX(END_DATE) FROM tblRawData
;WITH tblDate AS
( SELECT CAST(MIN(START_DATE) AS DATE) AS [date]
FROM tblRawData
UNION ALL
SELECT DATEADD(month,1,[DATE])
FROM tblDate
WHERE [DATE] < @ENDDate
)
SELECT
d.[date]
,r.[Project]
,UserName = (SELECT MAX(USERNAME) FROM tblStage1 ts WHERE r.PROJECT = ts.PROJECT)
,Percentage = (SELECT ISNULL(MAX(Percentage),0) FROM tblStage1 ts WHERE r.PROJECT = ts.PROJECT AND ts.[date] <= d.[date])
FROM tblDate d
INNER JOIN tblRawData r
ON d.[date] between r.[START_DATE] AND r.[END_DATE]
ORDER BY 2,1
OPTION (Maxrecursion 0)
EDIT: Just found out the date is increased by month, so I updated the CTE query. However, you need to make sure all project start and end dates fall on the first day of the month.
EDIT
Based on your new sample data, the query has become a little ugly, but it works. I cannot think of a better solution right now.
New SQL Fiddler
DECLARE @ENDDate DATETIME
SELECT @ENDDate = MAX(END_DATE) FROM tblRawData
;WITH tblDate AS
( SELECT CAST(MIN(START_DATE) AS DATE) AS [date]
FROM tblRawData
UNION ALL
SELECT DATEADD(month,1,[DATE])
FROM tblDate
WHERE [DATE] < @ENDDate
)
,ProjectList AS (
SELECT Project,UserName
FROM tblStage1
GROUP BY Project,UserName
)
,cte AS (
SELECT
d.[date]
,r.[Project]
,UserName = pl.Username
,CloseDate = (SELECT MAX(ts.[date]) FROM tblStage1 ts WHERE r.PROJECT = ts.PROJECT AND ts.UserName = pl.UserName AND ts.[date] <= d.[date])
FROM tblDate d
INNER JOIN tblRawData r
ON d.[date] between r.[START_DATE] AND r.[END_DATE]
CROSS APPLY ProjectList pl
)
SELECT cte.[date],cte.project,cte.UserName,ISNULL(t.[PERCENTAGE],0) AS PERCENTAGE
FROM cte
LEFT JOIN tblStage1 t
ON cte.PROJECT = t.PROJECT
AND cte.UserName = t.UserName
AND cte.CloseDate = t.[Date]
ORDER BY 2,3,1
You could do this with the APPLY operator as well...
WITH DateList (Project, MonthStart) AS
(
SELECT
project
, Start_Date
FROM
tblRawData
UNION ALL
SELECT
dl.Project
, DATEADD(MONTH, 1, dl.MonthStart)
FROM
DateList dl
JOIN
tblRawData r
ON
r.Project = dl.Project
AND
dl.MonthStart < r.End_Date
)
SELECT
dl.Project
, lastUser.UserName
, dl.monthstart [Date]
, ISNULL(pct.Percentage, 0) Percentage
FROM
DateList dl
CROSS APPLY
(
SELECT
TOP 1
USERNAME
FROM
tblStage1 t1
WHERE
t1.Project = dl.Project
ORDER BY
t1.Date DESC
) lastUser
OUTER APPLY
(
SELECT
TOP 1
Percentage
FROM
tblStage1 t2
WHERE
t2.Project = dl.Project
AND
t2.Date <= dl.MonthStart
ORDER BY
t2.Date DESC
) pct
A really easy way would be to create a temp table with the month values of each month you need then doing a left join on that table.
Out of curiosity - why is August 1 instead of 0?

Departure Date Greater Than Arrival Date

How do I make sure my departure date is greater than the Arrival date in the following code.
SELECT ArrivalDate, DATEADD(day, RAND(checksum(NEWID()))*1.5
* LengthOfStay.LengthofStay, ArrivalDate) AS DepartureDate
FROM Bookings, LengthOfStay
ORDER BY ArrivalDate
Thanks
Wayne
DATEADD takes an integer... and any decimal values returned from your randomization will just be truncated. So you're likely just adding 0 to the ArrivalDate, resulting in the two dates being equal.
You could fix this by just adding a minimum of 1 to your randomization:
SELECT ArrivalDate, DATEADD(day, 1 + RAND(checksum(NEWID()))*1.5
* LengthOfStay.LengthofStay, ArrivalDate) AS DepartureDate
FROM Bookings, LengthOfStay
ORDER BY ArrivalDate
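The truncation effect is easy to reproduce outside SQL Server. The Python sketch below is an illustration only: int() drops the fraction in the same way DATEADD drops it from its number argument. The function name departure_offset and the sample values are hypothetical.

```python
def departure_offset(r, length_of_stay, minimum=0):
    """Whole days added to the arrival date, with the fraction dropped."""
    return int(minimum + r * 1.5 * length_of_stay)

# A small random value with a short stay truncates to 0 days,
# so the departure date equals the arrival date.
print(departure_offset(0.1, 2))

# Adding a minimum of 1 before truncation guarantees at least one day.
print(departure_offset(0.1, 2, minimum=1))
```

Since the random value is never negative, the version with the added 1 always yields an offset of at least one day.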
Try this query
SELECT * FROM
(
SELECT
ArrivalDate,
DATEADD(day, RAND(checksum(NEWID()))*1.5 * LengthOfStay.LengthofStay, ArrivalDate) AS DepartureDate
FROM
Bookings, LengthOfStay
) a
WHERE a.DepartureDate > a.ArrivalDate
ORDER BY a.ArrivalDate