Fill table with two datetime columns with random dates - tsql

I have a table T1 with two datetime columns (StartDate, EndDate) which I must populate with random dates under one condition:
the EndDate value must be later than StartDate by at least one day.
Example:
StartDate EndDate
===========================
2001-04-04 2001-04-06 (2 days)
2001-01-05 2001-01-15 (10 days)
.
.
.
Can I do that in one statement?
P.S. My first idea was to make the EndDate column nullable, populate StartDate in a first step while leaving EndDate as NULL, and then in a second statement update EndDate with a date later than StartDate (by a different number of days for every record).

Here's a solution that populates the table in one step:
insert into T1 (StartDate, EndDate)
select
X.StartDate,
-- 1 + ... keeps EndDate at least one day after StartDate (the modulo alone can yield 0)
dateadd(day, 1 + abs(checksum(newid())) % 10, X.StartDate) EndDate
from (
select top 20
dateadd(day, -abs(checksum(newid())) % 100, convert(date, getDate())) StartDate
from sys.columns c1, sys.columns c2
) X
The query above uses some tricks that I personally often use in ad-hoc SQL queries:
NEWID() generates a different random value for each row, as opposed to RAND(), which would be evaluated once per query. The expression abs(checksum(newid())) % N generates random integer values in the range 0 to N-1.
the TOP X ... FROM sys.columns c1, sys.columns c2 trick allows you to generate X rows whose values can be composed of scalar values, like the ones in our example.
Obviously, you can modify the hardcoded values in the above query to:
generate more rows
increase the range of random start dates
increase the maximum duration of each row.
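Because abs(checksum(newid())) % N can return 0, a duration drawn from it needs an offset of 1 to keep EndDate strictly after StartDate. The sampling logic can be sketched outside the database as follows. This is a minimal Python sketch, not part of the original answer; the function name random_date_pairs and its parameters are hypothetical.

```python
import random
from datetime import date, timedelta


def random_date_pairs(n, today, max_age_days=100, max_duration_days=10, seed=None):
    """Return n (start, end) pairs where end is at least one day after start."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # start: 0 .. max_age_days - 1 days before `today`
        start = today - timedelta(days=rng.randrange(max_age_days))
        # duration: 1 .. max_duration_days, never 0
        end = start + timedelta(days=1 + rng.randrange(max_duration_days))
        pairs.append((start, end))
    return pairs


# Hypothetical usage: 20 rows, mirroring "select top 20" in the query above.
pairs = random_date_pairs(20, date(2024, 1, 1), seed=42)
```

Every generated pair satisfies the "at least one day" requirement by construction, which is the property to check in whichever SQL variant you use.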

INSERT T1 (StartDate, EndDate)
select T1, T1 + add_days
from
(select DATEADD(day, (ABS(CHECKSUM(NEWID())) % 65530), 0) T1,
ROW_NUMBER() OVER(ORDER BY number) add_days
from master..spt_values) X;

Something simple using rand() function:
declare @records int = 100, -- number of records needed
@count int = 0, @start int, @end int
while (@records > @count)
begin
select @start = rand() * 10, @count += 1
-- offset @end from @start by at least one day so that EndDate > StartDate
select @end = @start + 1 + rand() * 100
insert into mytable
select dateadd(day, @start, getdate()), dateadd(day, @end, getdate())
end
select * from mytable

Related

TSQL Get Item Price History from Item Price Changes

I have a table of item price changes, and I want to use it to create a table of item prices for each date (between the item's launch and end dates).
Here's some code to create the data:
declare @Item table (item_id int, item_launch_date date, item_end_date date);
insert into @Item Values (1,'2001-01-01','2016-01-01'), (2,'2001-01-01','2016-01-01')
declare @ItemPriceChanges table (item_id int, item_price money, my_date date);
INSERT INTO @ItemPriceChanges VALUES (1, 123.45, '2001-01-01'), (1, 345.34, '2001-01-03'), (2, 34.34, '2001-01-01'), (2, 23.56, '2005-01-01'), (2, 56.45, '2016-05-01'), (2, 45.45, '2017-05-01');
What I'd like to see is something like this:-
item_id date price
------- ---- -----
1 2001-01-01 123.45
1 2001-01-02 123.45
1 2001-01-03 345.34
1 2001-01-04 345.34
etc.
2 2001-01-01 34.34
2 2001-01-02 34.34
etc.
Any suggestions on how to write the query?
I'm using SQL Server 2016.
Added:
I also have a calendar table called "dim_calendar" with one row per day. I had hoped to use a windowing function, but the nearest I can find is lead() and it doesn't do what I thought it would do:-
select
i.item_id,
c.day_date,
ipc.item_price as item_price_change,
lead(item_price,1,NULL) over (partition by i.item_id ORDER BY c.day_date) as item_price
from dim_calendar c
inner join @Item i
on c.day_date between i.item_launch_date and i.item_end_date
left join @ItemPriceChanges ipc
on i.item_id=ipc.item_id
and ipc.my_date=c.day_date
order by
i.item_id,
c.day_date;
Thanks
I wrote this prior to your edit. Note that your sample output suggests that an item can have two prices on the day of the price change. The following assumes that an item can only have one price on a price change day and that is the new price.
declare @Item table (item_id int, item_launch_date date, item_end_date date);
insert into @Item Values (1,'2001-01-01','2016-01-01'), (2,'2001-01-01','2016-01-01')
declare @ItemPriceChange table (item_id int, item_price money, my_date date);
INSERT INTO @ItemPriceChange VALUES (1, 123.45, '2001-01-01'), (1, 345.34, '2001-01-03'), (2, 34.34, '2001-01-01'), (2, 23.56, '2005-01-01'), (2, 56.45, '2016-05-01'), (2, 45.45, '2017-05-01');
SELECT * FROM @ItemPriceChange
-- We need a table variable holding all possible date points for the output
DECLARE @DatePointList table (DatePoint date);
DECLARE @StartDatePoint date = '01-Jan-2001';
DECLARE @MaxDatePoint date = GETDATE();
DECLARE @DatePoint date = @StartDatePoint;
WHILE @DatePoint <= @MaxDatePoint BEGIN
INSERT INTO @DatePointList (DatePoint)
SELECT @DatePoint;
SET @DatePoint = DATEADD(DAY,1,@DatePoint);
END;
-- We can use a CTE to sequence the price changes
WITH ItemPriceChange AS (
SELECT item_id, item_price, my_date, ROW_NUMBER () OVER (PARTITION BY Item_id ORDER BY my_date ASC) AS SeqNo
FROM @ItemPriceChange
)
-- With the price changes sequenced, we can derive from and to dates for each price and use a join to the table of date points to produce the output. Also, use an inner join back to @Item to only return rows for dates that are within the start/end date of the item
SELECT ItemPriceDate.item_id, DatePointList.DatePoint, ItemPriceDate.item_price
FROM @DatePointList AS DatePointList
INNER JOIN (
SELECT ItemPriceChange.item_id, ItemPriceChange.item_price, ItemPriceChange.my_date AS from_date, ISNULL(ItemPriceChange_Next.my_date,@MaxDatePoint) AS to_date
FROM ItemPriceChange
LEFT OUTER JOIN ItemPriceChange AS ItemPriceChange_Next ON ItemPriceChange_Next.item_id = ItemPriceChange.item_id AND ItemPriceChange.SeqNo = ItemPriceChange_Next.SeqNo - 1
) AS ItemPriceDate ON DatePointList.DatePoint >= ItemPriceDate.from_date AND DatePointList.DatePoint < ItemPriceDate.to_date
INNER JOIN @Item AS item ON item.item_id = ItemPriceDate.item_id AND DatePointList.DatePoint BETWEEN item.item_launch_date AND item.item_end_date
ORDER BY ItemPriceDate.item_id, DatePointList.DatePoint;
@AlphaStarOne Perfect! I've modified it to use a windowing function rather than a self-join, but what you've suggested works. Here's my implementation of that in case anyone else needs it:
SELECT
ipd.item_id,
dc.day_date,
ipd.item_price
FROM dim_calendar dc
INNER JOIN (
SELECT
item_id,
item_price,
my_date AS from_date,
isnull(lead(my_date,1,NULL) over (partition by item_id ORDER BY my_date),getdate()) as to_date
FROM @ItemPriceChange ipc1
) AS ipd
ON dc.day_date >= ipd.from_date
AND dc.day_date < ipd.to_date
INNER JOIN @Item AS i
ON i.item_id = ipd.item_id
AND dc.day_date BETWEEN i.item_launch_date AND i.item_end_date
ORDER BY
ipd.item_id,
dc.day_date;
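The from_date/to_date logic used in these queries (each price applies from its change date up to, but not including, the next change date, clipped to the item's launch and end dates) can be sketched outside the database like this. This is a minimal Python sketch under those assumptions; the function name daily_prices and the fixed today value are hypothetical and stand in for the calendar table and getdate().

```python
from datetime import date, timedelta


def daily_prices(launch, end, changes, today):
    """Expand (change_date, price) pairs into one (day, price) row per day.

    A price applies from its change date up to (not including) the next
    change date; the last price applies until `today`. Rows are limited
    to the item's [launch, end] window.
    """
    changes = sorted(changes)
    rows = []
    for i, (from_date, price) in enumerate(changes):
        to_date = changes[i + 1][0] if i + 1 < len(changes) else today
        day = max(from_date, launch)
        while day < to_date and day <= end:
            rows.append((day, price))
            day += timedelta(days=1)
    return rows


# Item 1 from the sample data above.
item1 = daily_prices(
    launch=date(2001, 1, 1),
    end=date(2016, 1, 1),
    changes=[(date(2001, 1, 1), 123.45), (date(2001, 1, 3), 345.34)],
    today=date(2017, 1, 1),
)
```

The first four rows for item 1 match the desired output in the question: two days at 123.45 followed by 345.34 from 2001-01-03 onward.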

How to rewrite SQL joins into window functions?

Database is HP Vertica 7 or PostgreSQL 9.
create table test (
id int,
card_id int,
tran_dt date,
amount int
);
insert into test values (1, 1, '2017-07-06', 10);
insert into test values (2, 1, '2017-06-01', 20);
insert into test values (3, 1, '2017-05-01', 30);
insert into test values (4, 1, '2017-04-01', 40);
insert into test values (5, 2, '2017-07-04', 10);
The question to answer: of the payment cards used in the last 1 day, what is the maximum amount charged on each card in the last 90 days?
select t.card_id, max(t2.amount) max
from test t
join test t2 on t2.card_id=t.card_id and t2.tran_dt>='2017-04-06'
where t.tran_dt>='2017-07-06'
group by t.card_id
order by t.card_id;
Results are correct
card_id max
------- ---
1 30
I want to rewrite the query into sql window functions.
select card_id, max(amount) over(partition by card_id order by tran_dt range between '60 days' preceding and current row) max
from test
where card_id in (select card_id from test where tran_dt>='2017-07-06')
order by card_id;
But result set does not match, how can this be done?
Test data here:
http://sqlfiddle.com/#!17/db317/1
I can't try PostgreSQL, but in Vertica, you can apply the ANSI standard OLAP window function.
But you'll need to nest two queries: The window function only returns sensible results if it has all rows that need to be evaluated in the result set.
But you only want the row from '2017-07-06' to be displayed.
So you'll have to filter for that date in an outer query:
WITH olap_output AS (
SELECT
card_id
, tran_dt
, MAX(amount) OVER (
PARTITION BY card_id
ORDER BY tran_dt
RANGE BETWEEN '90 DAYS' PRECEDING AND CURRENT ROW
) AS the_max
FROM test
)
SELECT
card_id
, the_max
FROM olap_output
WHERE tran_dt='2017-07-06'
;
card_id|the_max
1| 30
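The expected result can be cross-checked with a small sketch of the same logic outside the database: keep the cards that have a transaction on the reference date, then take the maximum amount over that card's transactions in the trailing 90 days. This is a minimal Python sketch using the sample rows from the question; the function name max_in_window is hypothetical.

```python
from datetime import date, timedelta

# (id, card_id, tran_dt, amount), copied from the question's test data
rows = [
    (1, 1, date(2017, 7, 6), 10),
    (2, 1, date(2017, 6, 1), 20),
    (3, 1, date(2017, 5, 1), 30),
    (4, 1, date(2017, 4, 1), 40),
    (5, 2, date(2017, 7, 4), 10),
]


def max_in_window(rows, as_of, days=90):
    """Max amount per card over [as_of - days, as_of], for cards used on as_of."""
    window_start = as_of - timedelta(days=days)
    active_cards = {card for _, card, dt, _ in rows if dt == as_of}
    result = {}
    for _, card, dt, amount in rows:
        if card in active_cards and window_start <= dt <= as_of:
            result[card] = max(result.get(card, amount), amount)
    return result


print(max_in_window(rows, date(2017, 7, 6)))  # {1: 30}
```

Card 2 is dropped because it has no transaction on 2017-07-06, and the 40 charged on 2017-04-01 falls outside the 90-day window.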
As far as I know, PostgreSQL window functions in 9.x don't support a RANGE frame with an offset, so range between '90 days' preceding won't work (support for this was only added in PostgreSQL 11). 9.x does support a bounded ROWS frame such as rows between 90 preceding, but then you would need to assemble a time-series query similar to the following, so that the window function has one row per day to operate on:
SELECT c.card_id, t.amount, g.d as d_series
FROM generate_series(
'2017-04-06'::timestamp, '2017-07-06'::timestamp, '1 day'::interval
) g(d)
CROSS JOIN ( SELECT distinct card_id from test ) c
LEFT JOIN test t ON t.card_id = c.card_id and t.tran_dt = g.d
ORDER BY c.card_id, d_series
For what you need (based on your question description), I would stick to using group by.

How to calculate the no of iso weeks in a year in t-sql?

I need a simple query to calculate the number of ISO weeks in any given year.
I think this should do the trick.
DECLARE @year smallint = 2015;
SELECT
TheYear = @year,
ISOWeeks= MAX(DATEPART(ISOWK,DATEADD(DD,N,CAST(CAST(@year AS char(4))+'1223' AS date))))
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8)) t(N);
You could include this logic in a function like this:
CREATE FUNCTION dbo.CalculateISOWeeks(@year smallint)
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT ISOWeeks =
MAX(DATEPART(ISOWK,DATEADD(DD,N,CAST(CAST(@year AS char(4))+'1223' AS date))))
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8)) t(N);
and use it like this:
SELECT ISOWeeks FROM dbo.CalculateISOWeeks(2014);
or better yet... because we're calculating a static value, why not just pop those values into a table then index it like this:
SELECT
Yr = ISNULL(CAST(Yr AS smallint),0),
ISOWeeks = ISNULL(CAST(ISOWeeks AS tinyint),0)
INTO dbo.ISOCounts
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))+1949
FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),
(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x)
) Years(Yr)
-- Yr already holds the calendar year (1950 to 2049), so pass it through unchanged
CROSS APPLY dbo.CalculateISOWeeks(Yr);
CREATE UNIQUE CLUSTERED INDEX uci_ISOCounts ON dbo.ISOCounts(Yr);
Now whenever you need to calculate the number of ISO weeks for a given year you can retrieve the pre-calculated value from your table via an index seek.
SELECT * FROM dbo.ISOCounts WHERE yr = 2014;
Results:
Yr ISOWeeks
------ --------
2014 52
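The week counts can be cross-checked outside SQL Server. The sketch below mirrors the T-SQL approach (take the maximum ISO week number over 24 to 31 December, since the last days of December may already belong to week 1 of the following year) using Python's standard date.isocalendar(). The function name iso_weeks_in_year is hypothetical.

```python
from datetime import date, timedelta


def iso_weeks_in_year(year):
    """Number of ISO weeks in `year`: max ISO week number over Dec 24..Dec 31."""
    base = date(year, 12, 23)
    # n = 1..8 mirrors the VALUES (1)..(8) list in the T-SQL query
    return max((base + timedelta(days=n)).isocalendar()[1] for n in range(1, 9))


print(iso_weeks_in_year(2015))  # 53
```

Days that already fall in week 1 of the next year contribute a 1 to the maximum, so they never affect the result.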

Grouping consecutive dates in PostgreSQL

I have two tables which I need to combine, as some dates are found in table A and not in table B and vice versa. My desired result is that ranges which overlap or fall on consecutive days are combined into a single range.
I'm using PostgreSQL.
Table A
id startdate enddate
--------------------------
101 12/28/2013 12/31/2013
Table B
id startdate enddate
--------------------------
101 12/15/2013 12/15/2013
101 12/16/2013 12/16/2013
101 12/28/2013 12/28/2013
101 12/29/2013 12/31/2013
Desired Result
id startdate enddate
-------------------------
101 12/15/2013 12/16/2013
101 12/28/2013 12/31/2013
Right. I have a query that I think works. It certainly works on the sample records you provided. It uses a recursive CTE.
First, you need to merge the two tables. Next, use a recursive CTE to get the sequences of overlapping dates. Finally, get the start and end dates, and join back to the "merged" table to get the id.
with recursive allrecords as -- this merges the input tables. Add a unique row identifier
(
select *, row_number() over (ORDER BY startdate) as rowid from
(select * from table1
UNION
select * from table2) a
),
path as ( -- the recursive CTE. This gets the sequences
select rowid as parent,rowid,startdate,enddate from allrecords a
union
select p.parent,b.rowid,b.startdate,b.enddate from allrecords b join path p on (p.enddate + interval '1 day')>=b.startdate and p.startdate <= b.startdate
)
SELECT id,g.startdate,g.enddate FROM -- outer query to get the id
-- inner query to get the start and end of each sequence
(select parent,min(startdate) as startdate, max(enddate) as enddate from
(
select *, row_number() OVER (partition by rowid order by parent,startdate) as row_number from path
) a
where row_number = 1 -- We only want the first occurrence of each record
group by parent)g
INNER JOIN allrecords a on a.rowid = parent
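For comparison, the desired result can be reproduced with a simple sort-and-sweep over the merged rows: sort by id and start date, then extend the current range whenever the next range starts no later than one day after the current end. This is a minimal Python sketch of that idea, not a translation of the recursive CTE; the function name merge_ranges is hypothetical.

```python
from datetime import date, timedelta


def merge_ranges(rows):
    """Merge overlapping or day-adjacent (id, startdate, enddate) ranges per id."""
    merged = []
    for rid, start, end in sorted(set(rows)):
        if merged and merged[-1][0] == rid and start <= merged[-1][2] + timedelta(days=1):
            # Overlaps or directly follows the previous range: extend it.
            prev_id, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_id, prev_start, max(prev_end, end))
        else:
            merged.append((rid, start, end))
    return merged


table_a = [(101, date(2013, 12, 28), date(2013, 12, 31))]
table_b = [
    (101, date(2013, 12, 15), date(2013, 12, 15)),
    (101, date(2013, 12, 16), date(2013, 12, 16)),
    (101, date(2013, 12, 28), date(2013, 12, 28)),
    (101, date(2013, 12, 29), date(2013, 12, 31)),
]
result = merge_ranges(table_a + table_b)
```

On the sample data this yields the two ranges from the question, 12/15/2013 to 12/16/2013 and 12/28/2013 to 12/31/2013.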
The fragment below does what you intend (but it will probably be very slow). The problem is that detecting (non-)overlapping date ranges is impossible with the standard range operators, since a range could be split into two parts.
So, my code does the following:
split the dateranges from table_A into atomic records, with one date per record
[the same for table_b]
full outer join these two tables (we are only interested in A_not_in_B and B_not_in_A), remembering which side of the outer join each record came from.
re-aggregate the resulting records into date ranges.
-- EXPLAIN ANALYZE
--
WITH RECURSIVE ranges AS (
-- Chop up the a-table into atomic date units
WITH ar AS (
SELECT generate_series(a.startdate,a.enddate , '1day'::interval)::date AS thedate
, 'A'::text AS which
, a.id
FROM a
)
-- Same for the b-table
, br AS (
SELECT generate_series(b.startdate,b.enddate, '1day'::interval)::date AS thedate
, 'B'::text AS which
, b.id
FROM b
)
-- combine the two sets, retaining a_not_in_b plus b_not_in_a
, moments AS (
SELECT COALESCE(ar.id,br.id) AS id
, COALESCE(ar.which, br.which) AS which
, COALESCE(ar.thedate, br.thedate) AS thedate
FROM ar
FULL JOIN br ON br.id = ar.id AND br.thedate = ar.thedate
WHERE ar.id IS NULL OR br.id IS NULL
)
-- use a recursive CTE to re-aggregate the atomic moments into ranges
SELECT m0.id, m0.which
, m0.thedate AS startdate
, m0.thedate AS enddate
FROM moments m0
WHERE NOT EXISTS ( SELECT * FROM moments nx WHERE nx.id = m0.id AND nx.which = m0.which
AND nx.thedate = m0.thedate -1
)
UNION ALL
SELECT rr.id, rr.which
, rr.startdate AS startdate
, m1.thedate AS enddate
FROM ranges rr
JOIN moments m1 ON m1.id = rr.id AND m1.which = rr.which AND m1.thedate = rr.enddate +1
)
SELECT * FROM ranges ra
WHERE NOT EXISTS (SELECT * FROM ranges nx
-- suppress partial subassemblies
WHERE nx.id = ra.id AND nx.which = ra.which
AND nx.startdate = ra.startdate
AND nx.enddate > ra.enddate
)
;

How to Calculate Gap Between two Dates in SQL Server 2005?

I have a data set as shown in the picture.
I am trying to get the date difference between eligenddate (First row) and eligstartdate (second row). I would really appreciate any suggestions.
Thank you
SQL2005:
One solution is to insert the rows, together with a row number, into a table variable (@DateWithRowNum, if the number of rows is small) or into a temp table (#DateWithRowNum, if the number of rows is high). The row number is generated using [elig]startdate as the ORDER BY criterion (also see note #1). A self join then pairs each row with the previous one:
DECLARE @DateWithRowNum TABLE (
memberid VARCHAR(50) NOT NULL,
rownum INT,
PRIMARY KEY(memberid, rownum),
startdate DATETIME NOT NULL,
enddate DATETIME NOT NULL
)
INSERT @DateWithRowNum (memberid, rownum, startdate, enddate)
SELECT memberid,
ROW_NUMBER() OVER(PARTITION BY memberid ORDER By startdate),
startdate,
enddate
FROM dbo.MyTable
-- gap runs from the previous row's enddate to the current row's startdate
SELECT crt.*, DATEDIFF(MONTH, prev.enddate, crt.startdate) AS gap
FROM @DateWithRowNum crt
LEFT JOIN @DateWithRowNum prev ON crt.memberid = prev.memberid AND crt.rownum - 1 = prev.rownum
ORDER BY crt.memberid, crt.rownum
Another solution is to use common table expression instead of table variable / temp table thus:
;WITH DateWithRowNum AS (
SELECT memberid,
ROW_NUMBER() OVER(PARTITION BY memberid ORDER By startdate) AS rownum, -- the column needs a name to be referenced below
startdate,
enddate
FROM dbo.MyTable
)
-- gap runs from the previous row's enddate to the current row's startdate
SELECT crt.*, DATEDIFF(MONTH, prev.enddate, crt.startdate) AS gap
FROM DateWithRowNum crt
LEFT /*HASH*/ JOIN DateWithRowNum prev ON crt.memberid = prev.memberid AND crt.rownum - 1 = prev.rownum
ORDER BY crt.memberid, crt.rownum
Note #1: I assume that you need to calculate these values for every memberid
Note #2: HASH hint forces SQL Server to evaluate just once every data source (crt or prev) of LEFT JOIN.
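The row-pairing idea (order each member's rows by start date, then compare each row with the one before it) can be sketched outside the database as follows. This is a minimal Python sketch that measures the gap from the previous row's enddate to the current row's startdate, which is what the question asks for. The sample rows and the names month_diff and gaps are hypothetical, and month_diff counts month boundaries crossed in the same way as DATEDIFF(MONTH, ...).

```python
from datetime import date
from itertools import groupby


def month_diff(d1, d2):
    """Month boundaries crossed between d1 and d2, like T-SQL DATEDIFF(MONTH, d1, d2)."""
    return (d2.year - d1.year) * 12 + (d2.month - d1.month)


def gaps(rows):
    """rows: (memberid, startdate, enddate). Adds the gap to the previous row per member."""
    out = []
    for member, group in groupby(sorted(rows), key=lambda r: r[0]):
        prev_end = None
        for _, start, end in group:
            # The first row of each member has no previous row, so its gap is None.
            gap = None if prev_end is None else month_diff(prev_end, start)
            out.append((member, start, end, gap))
            prev_end = end
    return out


# Hypothetical sample data; the original picture is not available.
sample = [
    ("A", date(2010, 6, 1), date(2010, 12, 31)),
    ("A", date(2010, 1, 1), date(2010, 3, 31)),
    ("B", date(2011, 1, 1), date(2011, 6, 30)),
]
result = gaps(sample)
```

For member A the second row gets a gap of 3 (March to June), while the first row of each member gets None, matching the NULL produced by the LEFT JOIN.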