SAS Return data from the last available 5 days - date

I have a list of intra-day prices at 9am, 10am, 11am etc. for each day (which are in number formats such as 15011 15012 etc.)
I want to only keep the observations from the last available 5 days and the next 5 available days from the date 't' and delete everything else.
Is there a way to do this?
I tried using
if date < &t - 5 or date > &t + 5 then delete;
However, since there are weekends/holidays I don't get all the observations I want.
Thanks a lot in advance!

Not much info to go on, but here is a possible solution:
/* Invent some data */
data have;
do date=15001 to 15020;
do time='09:00't,'10:00't,'11:00't;
price = ranuni(0) * 10;
output;
end;
end;
run;
/* Your macro variable identifying the target "date" */
%let t=15011;
/* Subset for current and following dates */
proc sort data=have out=temp(where=(date >= &t));
by date;
run;
/* Process to keep only current and following five days */
data current_and_next5;
set temp;
by date;
if first.date then keep_days + 1; /* Set counter for each day */
if keep_days <= 6; /* Days to keep (target and next five) */
drop keep_days; /* Drop this utility variable */
run;
/* Subset for previous and sort by date descending */
proc sort data=have out=temp(where=(date < &t));
by descending date;
run;
/* Process to keep only five previous days */
data prev5;
set temp;
by descending date;
if first.date then keep_days + 1; /* Set counter for each day */
if keep_days <= 5; /* Number of days to keep */
drop keep_days; /* Drop this utility variable */
run;
/* Concatenate together and re-sort by date */
data want;
set current_and_next5
prev5;
run;
proc sort data=want;
by date;
run;
Of course, this solution assumes that your starting data contains observations for all valid "trading days", and it returns everything without doing date arithmetic. A much better solution would use a "trading calendar" dataset listing all valid dates. Weekends are easy to deal with, but holidays and other "non-trading days" are very site-specific; hence using a calendar is almost always preferred.
UPDATE: Joe's comment made me re-read the question more carefully. This should return a total of eleven (11) days of data; five days prior, five days following, and the target date. But still, a better solution would use a calendar reference table.
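A minimal sketch of the calendar idea, under stated assumptions: the `calendar` dataset and the `day_no` and `t_no` names are hypothetical, and the calendar is derived from `have` here only so the example runs against the invented data above. In practice it would come from an authoritative list of trading days. The sketch also assumes that &t is itself a date in the calendar.

```sas
/* Hypothetical trading calendar: one row per valid date.   */
/* Derived from HAVE for illustration only.                 */
proc sort data=have(keep=date) out=calendar nodupkey;
  by date;
run;

/* Number the trading days consecutively */
data calendar;
  set calendar;
  day_no = _n_;
run;

/* Position of the target date in the calendar (assumes &t is a trading day) */
proc sql noprint;
  select day_no into :t_no
  from calendar
  where date = &t;
quit;

/* Keep the target day plus five trading days on either side */
proc sql;
  create table want_cal as
  select h.*
  from have as h
       inner join calendar as c
       on h.date = c.date
  where c.day_no between &t_no - 5 and &t_no + 5
  order by h.date, h.time;
quit;
```

Counting positions in the calendar rather than subtracting from the date value is what makes weekends and holidays irrelevant: a gap in the calendar simply does not get a `day_no`.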

Try this
/* Get distinct dates before and after &T */
proc freq data=mydata noprint ;
table Date /out=before (where=(Date < &T)) ;
table Date /out=after (where=(Date > &T)) ;
run ;
/* Take 5 days before and after */
proc sql outobs=5 ;
create table before2 as
select Date
from before
order by Date descending ;
create table after2 as
select Date
from after
order by Date ;
quit ;
/* Subset to 5 available days before & after */
proc sql ;
create table final as
select *
from mydata
where Date >= (select min(date) from before2)
and Date <= (select max(date) from after2)
order by Date ;
quit ;

Related

SAS Proc SQL function in where clause for only current week

I have some data in week-date-time format, i.e. 14dec2020 00:00:00:0000. I am using SAS and a proc sql query.
This data contains many weeks' worth of data, but I was curious whether there's a way to pull only the data relevant to the current week. I.e. today is 17dec2020, so I would want to pull only data for the week of 14dec2020. If today were 22dec2020, I would want it to pull data for the week of 21dec2020.
Here is one of the many queries I have tried.
data have;
today = today();
wkday = weekday(today);
start = today - (wkday - 1);
end = today + (7 - wkday);
length cstart cend $30;
cstart = put(start, date9.) || ' 00:00:00.0000' ;
cend = put(end, date9.) || ' 00:00:00.0000' ;
call symput('start', cstart);
call symput('end', cend);
run;
Proc Sql;
connect to odbc (environment=x user=y p=z);
create table basic.curweek as select * from connection to odbc
(select year, month, week, store, sales, SKU
from datatable
where (&start. <= week <= &end.)
order by sku);
disconnect from odbc;
quit;
Thanks to the help of the great people below, I have gotten to this state, but I am still facing some syntax errors.
Any help here would be greatly appreciated!!
Use intnx() to align both the datetime of interest and today's datetime to the start of the week.
proc sql;
create table want as
select *
from table
where intnx('dtweek', date, 0, 'B') = intnx('dtweek', datetime(), 0, 'B')
;
quit;
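As a small illustration of what the alignment does, the sketch below applies it to one fixed datetime. 17dec2020 is a Thursday, so its Sunday-based week begins on 13dec2020 and its Monday-based week on 14dec2020. The `'dtweek.2'` shift index (week starting on Monday) is an assumption about what the asker's "week of 14dec2020" means; the variable names are made up for the example.

```sas
data _null_;
  dt = '17dec2020:10:30:00'dt;                 /* a Thursday */
  sun_start = intnx('dtweek',   dt, 0, 'B');   /* default: week starts on Sunday */
  mon_start = intnx('dtweek.2', dt, 0, 'B');   /* shift index 2: week starts on Monday */
  put sun_start= datetime20. mon_start= datetime20.;
run;
```

If the stored week values are Mondays, using the same shifted interval on both sides of the comparison keeps the two alignments consistent.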
As others have pointed out - if you are using SQL pass-thru, you need to use date functions that exist in your "flavor" of SQL. SAS specific functions will not work, and in particular SAS function "today()" has no meaning in the SQL you are working with.
The approach I would take is:
1. In a SAS data step, get today's date.
2. Use today's date to calculate the beginning and end of the week.
3. Convert the beginning/end dates to character strings (the string will depend on how dates are formatted in your SQL database: date or datetime).
4. Use the character strings to create macro variables.
5. Feed the macro variables into the SQL pass-thru query to subset the dates wanted.
Below is some example code. It might not get you all the way there, but could give you some more ideas to try.
data have;
today = today(); *** TODAYs DATE ***;
wkday = weekday(today); *** WEEK DAY NUMBER FOR TODAY, POSSIBLE VALUES ARE 1-7 ***;
start = today - (wkday - 1); *** CALCULATE SUNDAY ***;
end = today + (7 - wkday); *** CALCULATE SATURDAY ***;
*** UNCOMMENT AND USE BELOW IF WEEK START/END IS MON-FRI ***;
*start = today - (wkday - 2); *** CALCULATE MONDAY ***;
*end = today + (6 - wkday); *** CALCULATE FRIDAY ***;
*** REPRESENT DATES AS DATE-TIME CHARACTER STRING - SURROUNDED BY SINGLE QUOTES ***;
cstart = "'" || put(start, date9.) || ' 00:00:00.0000' || "'";
cend = "'" || put(end, date9.) || ' 00:00:00.0000' || "'";
*** USE CHARACTER VARIABLES TO CREATE MACRO VARIABLES ***;
call symput('start', cstart);
call symput('end', cend);
run;
*** IN SQL PASS-THRU, USE MACRO VARIABLES IN WHERE STATEMENT TO SUBSET ONE WEEK ***;
Proc Sql;
connect to odbc (environment=x user=y p=z);
create table basic.curweek as select * from connection to odbc
(select year, month, week, store, sales, SKU
from datatable
where (&start. <= week and week <= &end.)
order by sku);
disconnect from odbc;
quit;

SAS 9.4 How to calculate the number of days until next record

Using SAS I want to be able to calculate the number of days between two dates where the value is the number of days until the next record.
The required output will be:
Date Num Days
10/09/2020 1
11/09/2020 1
12/09/2020 1
14/09/2020 2
15/09/2020 1
16/09/2020 1
17/09/2020 1
18/09/2020 1
20/09/2020 2
I have tried using Lag and Retain but just can't get it to work.
Any advice and suggestions would be really appreciated.
If you sort the data by descending DATE then it is easier because then you just need to look backwards to find the next date. So you can use LAG() or DIF() function.
data want;
set have;
by descending date;
num_days = -dif(date); /* DIF is date minus the previous row's date, which is negative in descending order */
run;
To simulate a "lead" function you can set another copy of the data skipping the first observation.
data want;
set have ;
set have(firstobs=2 keep=date rename=(date=next_date)) have(obs=1 drop=_all_);
num_days = next_date - date;
run;
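To try the look-ahead step on the dates from the question, a sketch like the following can be used. The `have` dataset is built here from the question's sample values; reading them with the `ddmmyy10.` informat is an assumption about how the dates are stored.

```sas
/* Sample dates from the question, read as day/month/year */
data have;
  input date ddmmyy10.;
  format date ddmmyy10.;
  datalines;
10/09/2020
11/09/2020
12/09/2020
14/09/2020
15/09/2020
16/09/2020
17/09/2020
18/09/2020
20/09/2020
;
run;

data want;
  set have;
  /* Second SET reads one row ahead; the extra empty row keeps the step */
  /* from stopping before the last observation of HAVE is written.      */
  set have(firstobs=2 keep=date rename=(date=next_date))
      have(obs=1 drop=_all_);
  num_days = next_date - date;   /* missing on the last row: no next record */
  format next_date ddmmyy10.;
run;
```

Unlike the DIF() version, this keeps the data in ascending order, so no re-sort is needed afterwards.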

SAS trying to get the difference in days from orderdate to shipdate

I'm using adventureworks dataset.
Looking for methods to calculate on average how long in days does it take ADW to deliver products between the order date and shipment date.
format sas date9.
e.g.:
orderdate shipdate
01JUL2005:00:00:00 08JUL2005:00:00:00
Here's an approach that might help. The SAS function intck() is very useful!
/* Generate a dataset as described. */
data have;
do id=1 to 10000;
orderdate = today()-(ceil(ranuni(id)*1000));
shipdate = orderdate + ceil(ranuni(id)*10);
output;
end;
format orderdate shipdate date9.;
run;
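/* Use intck() to count the number of days or weekdays between order and delivery dates. */
/* Note: 'weekday7w' treats only Saturday as a weekend day; use 'weekday' for a Saturday-Sunday weekend. */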
data want;
set have;
weekdays_passed = intck('weekday7w',orderdate,shipdate);
abs_days_passed = intck('day',orderdate,shipdate);
run;
/* Two ways to obtain average delivery time in days. */
/*1*/
proc univariate data=want;
var weekdays_passed
abs_days_passed;
run;
/*2*/
proc sql;
select avg(weekdays_passed) as avg_weekdays_passed,
avg(abs_days_passed) as avg_abs_days_passed
from want;
quit;
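The sample values in the question (01JUL2005:00:00:00) look like datetime values rather than dates. If that is the case, the 'day' interval above would count seconds as days, so the time part has to be handled first. A minimal sketch with two possible ways, assuming both columns are SAS datetimes; the dataset and variable names are made up for the example:

```sas
/* One row using the datetime values shown in the question */
data have_dt;
  orderdate = '01JUL2005:00:00:00'dt;
  shipdate  = '08JUL2005:00:00:00'dt;
  format orderdate shipdate datetime18.;
run;

data want_dt;
  set have_dt;
  /* Option 1: strip the time part, then count day boundaries */
  days_datepart = intck('day', datepart(orderdate), datepart(shipdate));
  /* Option 2: use the datetime version of the DAY interval */
  days_dtday = intck('dtday', orderdate, shipdate);
run;
```

Both variables hold 7 for this row, and either can be averaged with the PROC UNIVARIATE or PROC SQL steps above.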

Finding last Sunday and going 4 weeks backward every week in SAS

I have a SAS job that runs every Thursday, but sometimes it needs to run on Wednesday, or maybe Tuesday evening. The job collects some data in 4-week intervals up until the closest Sunday. For example, today is 19Mar2015, and I need data until 15Mar2015.
data get_some_data;
set all_the_data;
where date >= '16Feb2015'd and date <= '15Mar2015'd;
run;
Next week I have to manually change the date parameters too
data get_some_data;
set all_the_data;
where date >= '23Feb2015'd and date <= '22Mar2015'd;
run;
Is there any way I can automate this?
I'll expand on the suggestion from @user667489, as it could take you a while to work it out. The key is to use the week time interval, which by default starts on a Sunday (you can change this with a shift index, read this for further details).
So your query just needs to be:
where intnx('week',today(),-4)<date<=intnx('week',today(),0);
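To see which bounds that expression produces, here is a quick check with the run date from the question hard-coded in place of today(). The variable names are made up for the example.

```sas
data _null_;
  run_date  = '19Mar2015'd;                  /* the Thursday from the question */
  last_sun  = intnx('week', run_date, 0);    /* start of the current Sunday-based week */
  lower_bnd = intnx('week', run_date, -4);   /* Sunday four weeks earlier */
  put last_sun= date9. lower_bnd= date9.;
run;
```

This writes last_sun=15MAR2015 and lower_bnd=15FEB2015 to the log. Because the lower comparison is a strict `<`, the first date kept is 16Feb2015, which matches the range in the question.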
Use the INTNX function to roll the date back to last Sunday:
data get_some_data;
set all_the_data;
/*where date >= '23Feb2015' and date <= '22Mar2015';*/
/* WHERE only sees variables from the input data set, so last Sunday is computed inline */
where date between intnx('week',today(),0)-27 and intnx('week',today(),0);
run;
You can try getting the last Sunday date using the weekday function and then use INTNX to get the date 4 weeks back from that Sunday. Check the reference code below:
data mydata;
input input_date YYMMDD10.;
/* Loop to get the last sunday date, do the processing
and get out of loop */
do i =0 to 7 until(last_sunday_date>0);
/* Weekday Sunday=1 */
if weekday(sum(input_date,-i))=1 then do;
last_sunday_date=sum(input_date,-i);
/* INTNX to get the 4 week back date */
my_4_week_start=intnx('week',last_sunday_date,-4);
end;
end;
format input_date last_sunday_date my_4_week_start yymmdd10.;
datalines4;
2015-03-01
2015-03-07
2015-03-14
2015-03-21
2015-03-28
2015-04-05
2015-04-13
2015-04-20
;;;;
run;
proc print data=mydata;run;
Let me know if this helps!

removing day portion of date variable for time series SAS

I'm having some frustration with dates in SAS.
I am using proc forecast and am trying to make my dates spread evenly. I did some pre-processing with proc sql to get my counts by month, but my dates are incorrect.
Though my dataset looks good (because I used format MONYY.), the actual value of that variable is wrong.
date year month count
Jan10 2010 1 100
Feb10 2010 2 494
...
..
.
The Date value is actually the full SAS representation of the date (18267), meaning that it includes the day count.
Do I need to convert the variable to a string and back to a date, or is there a quick proc I can run?
My goal is to use the date variable with proc forecast so I only want Month and year.
Thanks for any help!
A SAS date variable is the number of days since 1jan1960, so you can't define one that excludes the day.
What you can do is hide the day with a format like monyy., but the underlying number will always contain that information.
Maybe you can use the interval=month option in proc forecast?
Please add some detail about the problem you're encountering with the forecast procedure.
EDIT: check this example:
data past;
keep date sales;
format date monyy5.;
lu = 0;
n = 25;
do i = -10 to n;
u = .7 * lu + .2 * rannor(1234);
lu = u;
sales = 10 + .10 * i + u;
date = intnx( 'month', '1jul1991'd, i - n );
if i > 0 then output;
end;
run;
proc forecast data=past interval=month lead=10 out=pred;
var sales;
id date;
run;
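If the monthly counts carry arbitrary days of the month (18267 in the question is 05JAN2010), one possible way to make the values evenly spaced is to snap each date to the first of its month before forecasting. A minimal sketch; the `counts` and `monthly` datasets are hypothetical stand-ins for the pre-processed data:

```sas
/* Hypothetical input: one row per month, with an arbitrary day */
data counts;
  input date :date9. count;
  datalines;
05JAN2010 100
17FEB2010 494
;
run;

/* Align every date to the beginning of its month */
data monthly;
  set counts;
  date = intnx('month', date, 0, 'b');   /* 05JAN2010 becomes 01JAN2010 */
  format date monyy5.;
run;
```

The day is still stored, but it is always the 1st, which lines up with the interval=month option used above.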