removing day portion of date variable for time series SAS - date

I'm having some frustration with dates in SAS.
I am using proc forecast and am trying to make my dates evenly spaced. I did some pre-processing with proc sql to get my counts by month, but my dates are incorrect.
Though my dataset looks good (b/c I used format MONYY.) the actual value of that variable is wrong.
date year month count
Jan10 2010 1 100
Feb10 2010 2 494
...
..
.
The Date value is actually the full SAS representation of the date (18267), meaning that it includes the day count.
Do I need to convert the variable to a string and back to a date, or is there a quick proc I can run?
My goal is to use the date variable with proc forecast so I only want Month and year.
Thanks for any help!

A SAS date variable is a number of days counted from 1jan1960, so you can't define one that excludes the day.
What you can do is hide the day with a format like monyy., but the underlying number will always contain that information.
Maybe you can use the interval=month option in proc forecast?
Please add some detail about the problem you're encountering with the forecast procedure.
EDIT: check this example:
data past;
keep date sales;
format date monyy5.;
lu = 0;
n = 25;
do i = -10 to n;
u = .7 * lu + .2 * rannor(1234);
lu = u;
sales = 10 + .10 * i + u;
date = intnx( 'month', '1jul1991'd, i - n );
if i > 0 then output;
end;
run;
proc forecast data=past interval=month lead=10 out=pred;
var sales;
id date;
run;
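If the goal is only to make every observation in a month carry the same date value, one option (not part of the answer above, so treat it as a sketch) is to snap each date to the first day of its month with INTNX. The data set names counts and monthly and the sample values are made up for illustration.

```sas
/* Hypothetical sample data: dates that fall on arbitrary days of the month */
data counts;
    input date :date9. count;
    format date date9.;
    datalines;
05JAN2010 100
17FEB2010 494
;
run;

/* An increment of 0 stays in the same month; 'b' aligns to its first day */
data monthly;
    set counts;
    date = intnx('month', date, 0, 'b');
    format date monyy5.;
run;

/* Print with a full date format to confirm that the day is now 01 */
proc print data=monthly;
    format date date9.;
run;
```

The variable is still a full SAS date, but every row of a given month now holds the same value, which is what interval=month expects.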

Related

how do i write in sas a filter that subtracts dates

Hello Stackoverflow community... hope you can help with this sas question.
I need to create a filter for a table which gives back only those records that are active from last year forward.
I would like to obtain something like :
data want;
set have;
where expire_date >= current(date) - 1year:
run;
The expire_date column uses the date9. format (for example 03MAY2022). I tried converting the date to a number and then subtracting 365, but I guess there is a better solution.
can someone illuminate me?
thanks in advance
I think you are searching for the INTNX() function:
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.2/lefunctionsref/p10v3sa3i4kfxfn1sovhi5xzxh8n.htm#n1lchasgjah7ran0z2wlmsbfwdx2
For example:
data a;
format num_date num_date_minus_year DATE9.;
char_date="03MAY2022";
num_date = inputn (char_date, "DATE9.");
num_date_minus_year = intnx ('YEAR', num_date, -1, "SAME");
put num_date= num_date_minus_year=;
run;
Output:
num_date=03MAY2022 num_date_minus_year=03MAY2021
You can get the current date using the DATE() function.
Do you want one YEAR or 365 days?
To use a date interval use the INTNX() function.
where expire_date >= intnx('year',date(),-1,'same') ;
To use a fixed number of days just subtract the number.
where expire_date >= date() - 365 ;
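The two filters can differ by one day when a leap day falls inside the window. A small check, using 01MAR2024 as an arbitrary reference date in place of DATE() (the variable names are made up):

```sas
data _null_;
    ref = '01MAR2024'd;                              /* arbitrary reference date for illustration */
    one_year_back = intnx('year', ref, -1, 'same');  /* same calendar day one year earlier */
    days_365_back = ref - 365;                       /* fixed count of days */
    put one_year_back= date9. days_365_back= date9.;
run;
```

Because 29FEB2024 lies between the two dates, the log should show 01MAR2023 for the interval version and 02MAR2023 for the 365-day version.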

SAS 9.4: How to calculate the number of days until next record

Using SAS I want to be able to calculate the number of days between two dates where the value is the number of days until the next record.
The required output will be:
Date Num Days
10/09/2020 1
11/09/2020 1
12/09/2020 1
14/09/2020 2
15/09/2020 1
16/09/2020 1
17/09/2020 1
18/09/2020 1
20/09/2020 2
I have tried using Lag and Retain but just can't get it to work.
Any advice and suggestions would be really appreciated.
If you sort the data by descending DATE then it is easier because then you just need to look backwards to find the next date. So you can use LAG() or DIF() function.
data want;
set have;
by descending date;
num_days = lag(date) - date; /* previous row holds the later date, so the result is positive */
run;
To simulate a "lead" function you can set another copy of the data skipping the first observation.
data want;
set have ;
set have(firstobs=2 keep=date rename=(date=next_date)) have(obs=1 drop=_all_);
num_days = next_date - date;
run;
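A self-contained version of the look-ahead step, as a sketch with a few of the dates from the question typed in as sample data:

```sas
/* Sample data in ascending date order */
data have;
    input date :ddmmyy10.;
    format date ddmmyy10.;
    datalines;
10/09/2020
11/09/2020
12/09/2020
14/09/2020
;
run;

data want;
    set have;
    /* Second SET reads one observation ahead; the extra zero-variable
       observation keeps the step from ending one row early */
    set have(firstobs=2 keep=date rename=(date=next_date)) have(obs=1 drop=_all_);
    format next_date ddmmyy10.;
    num_days = next_date - date;
run;

proc print data=want;
run;
```

The last observation has no following record, so its next_date and num_days should come out missing.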

Reading excel file in sas with date columns

I am facing an issue while importing an Excel file into SAS. The Excel file has a few columns named as follows:
Geography
AR_NO
31-Jan-18
28-Feb-18
31-Mar-18
30-Apr-18
31-May-18
30-Jun-18
After running the code below:
%macro FX_Lkup(sheet);
FILENAME FXFILE "/idn/home/Module2/excel.xlsx";
PROC IMPORT DATAFILE=FXFILE
DBMS=XLSX
OUT=&sheet.
REPLACE
;
SHEET="&sheet.";
RUN;
%mend FX_Lkup;
%FX_Lkup(LENDING_TEMPLATE);
%FX_Lkup(2018FXRates);
SAS prints the column names as
Geography
AR_NO
43131
43159
43190
43220
and so on.
Does anyone have a solution for this? Any lead would be really appreciated : )
Thanks !
It is correctly imported; SAS uses numbers to store dates. To show a date in your final table, you need to attach a date format, for instance format = AFRDFDE7.
If you have mixed character and numeric values in the same column then SAS will be forced to create the variable as character. When it does that it stores the number that Excel uses for the date as a string of digits. To convert it to a date in SAS first convert the string of digits to a number and then adjust the number to account for the difference in how SAS and Excel count days.
data want ;
set LENDING_TEMPLATE ;
date = input(geography,??32.) + '30DEC1899'd;
format date date9.;
run;
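As a quick sanity check of the offset, a minimal sketch (the variable names are made up) that converts the first header value from the question:

```sas
data _null_;
    excel_text = '43131';                                 /* digit string as imported from Excel */
    sas_date = input(excel_text, ?? 32.) + '30DEC1899'd;  /* shift from the Excel epoch to the SAS epoch */
    put sas_date= date9.;
run;
```

'30DEC1899'd is -21916, so 43131 becomes 21215, which is 31JAN2018 and matches the 31-Jan-18 header in the workbook.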
Dates as headers
If your Excel file is using dates as column headers then SAS will also convert them to digit strings, since variable names are always character strings, not numbers. One quick way to fix it is to use PROC TRANSPOSE. This will be easy when each row is uniquely identified by the other variables and when all of the "date" variables are numeric.
proc transpose data=LENDING_TEMPLATE out=tall ;
by geography ar_no ;
run;
data tall ;
set tall ;
date = input(_name_ , 32.) + '30DEC1899'd ;
format date date9. ;
drop _name_;
run;
You could stop here as now you have a useful dataset where the date values are in a variable instead of hiding in the metadata (variable name).
To get back to your original wide layout just add another PROC TRANSPOSE and tell it to use DATE as the ID variable.
proc transpose data=tall out=wide ;
by geography ar_no ;
id date;
var col1;
run;
IMPORT is using Excel date valued column headers as Excel epoch date numbers.
Use Proc DATASETS to change the column label, and possibly rename the columns to something generic such as DATE1-DATE6. Or, continue on, and further unpivot the data into a categorical form with columns GEO, AR_NO, DATE, VALUE
You might be asking yourself "Where do those numbers, such as 43131, come from?", or "What is an Excel epoch date number?"
They are unformatted Excel date values. The human-readable date represented by a number is determined by a system's epoch, or the date represented by the number 0.
Different systems of time keeping have different epochs (starting points) and time units. Some examples:
30DEC1899 Excel datetime number 0 (effective epoch, because Excel counts a nonexistent 29FEB1900), 1 = 1 day
01JAN1960 SAS date number 0, 1 = 1 day
01JAN1960 SAS datetime number 0, 1 = 1 second
01JAN1970 Unix OS datetime number 0, 1 = 1 second
To convert an Excel date number to a SAS date number you need to subtract 21916, which is the number of days from 30DEC1899 to 01JAN1960
This understanding of date epochs will be used when setting the label of a SAS column and renaming the column.
For others fiddling with code, the following will create an Excel worksheet having Date valued column headers. I speculate such a situation can otherwise arise when importing a worksheet containing an Excel pivot table.
First create some sample SAS data
data demo_tall;
do Geography = 'Mountains', 'Plains';
do AR_NO = 1 to 3;
_n_ = 0;
length label $200;
do label = '31-Jan-18', '28-Feb-18', '31-Mar-18',
'30-Apr-18', '31-May-18', '30-Jun-18'
;
_n_ + 1;
name = cats('Date',_n_);
value + 1;
output;
end;
end;
end;
run;
proc transpose data=demo_tall out=demo_wide(drop=_name_);
by Geography AR_NO;
var value;
id name;
idlabel label;
run;
Sample SAS data set (pivoted with transpose)
Then create Excel sheet with Excel date valued and formatted column headers
ods noresults;
ods excel file='%TEMP%\across.xlsx' options(sheet_name='Sample');
data _null_;
declare odsout xl();
if 0 then set demo_wide;
length _name_ $32;
xl.table_start();
* header;
xl.row_start();
do _n_ = 1 to 100; * 100 column guard;
call vnext(_name_);
if _name_ = '_name_' then leave;
_label_ = vlabelx(_name_);
_date_ = input(_label_, ?? date9.);
* make some header cells an Excel date formatted value;
if missing(_date_) then
xl.format_cell(data:_label_);
else
xl.format_cell(
data:_date_,
style_attr:"width=9em tagattr='type:DateTime format:dd-mmm-yy'"
);
end;
xl.row_end();
* data rows;
do _n_ = 1 by 1 while (not lastrow);
set demo_wide end=lastrow;
xl.row_start();
call missing(_name_);
do _index_ = 1 to 100; * 100 column guard;
call vnext(_name_);
if _name_ = '_name_' then leave;
xl.format_cell(data:vvaluex(_name_));
end;
xl.row_end();
end;
xl.table_end();
stop;
run;
ods excel close;
ods results;
Excel file created
IMPORT Excel worksheet
Log will show the 'funkiness' of date valued column headers
options msglevel=I;
proc import datafile='%temp%\across.xlsx' dbms=xlsx replace out=want;
sheet = "Sample";
run;
proc contents noprint data=want out=want_meta(keep=name label varnum);
run;
----- LOG -----
1380 proc import datafile='%temp%\across.xlsx' dbms=xlsx replace out=want;
1381 sheet = "Sample";
1382 run;
NOTE: Variable Name Change. 43131 -> _43131
NOTE: Variable Name Change. 43159 -> _43159
NOTE: Variable Name Change. 43190 -> _43190
NOTE: Variable Name Change. 43220 -> _43220
NOTE: Variable Name Change. 43251 -> _43251
NOTE: Variable Name Change. 43281 -> _43281
NOTE: VARCHAR data type is not supported by the V9 engine. Variable Geography has been converted
to CHAR data type.
NOTE: The import data set has 6 observations and 8 variables.
NOTE: WORK.WANT data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
Modify the header (metadata) of the imported data set
Date valued column names will be renamed DATE1-DATE6 and the label will be changed to be the corresponding date in SAS format DATE11. (dd-mon-yyyy)
%let renames=;
%let labels=;
data _null_;
length newname $32;
length renames labels $32767;
retain renames labels;
set want_meta end=lastvar; /* metadata data set created by PROC CONTENTS above */
date = coalesce(input(label, ?? 5.),1e9) + '30dec1899'd; /* Excel date number to SAS date */
if '01jan1980'd < date < today() then do;
index + 1;
newname = cats('DATE',index);
label = quote(trim(put(date,date11.)));
labels = catx(' ', labels, catx('=',name,label));
renames = catx(' ', renames, catx('=',name,newname));
end;
if lastvar;
if not missing(labels) then call symput('labels', trim('LABEL ' || labels));
if not missing(renames) then call symput('renames', trim('RENAME ' || renames));
run;
proc datasets nolist lib=work;
modify want;
&labels;
&renames;
run;
quit;
%symdel labels renames;
%let syslast = want;
The result, when printed.
Optional
Unpivot to a categorical form (tall layout)
proc transpose data=want out=stage1(rename=(col1=value _label_=date_string));
by geography ar_no;
var date:;
label _name_ = ' ';
label date_string = ' ';
run;
data want_tall;
set stage1;
date = input (date_string, date11.);
format date date11.;
keep geography ar_no _name_ date value;
run;

intck() giving negative value

I am new to SAS and I am having trouble with finding the difference between 2 dates.
I have 2 columns: checkin_date and checkout_date
the dates are in mmddyy10. format (mm/dd/yyyy).
I have used the following code:
stay_days= intck('day', checkin_day, checkout_day);
I am getting the right values for dates in the same month but wrong values for days that are across 2 months. For example, the difference between 02/06/2014 and 02/11/2014 is 5. But the difference between 1/31/2014 and 2/13/2014 is -18 which is incorrect.
I have also simply tried to subtract them both:
stay_day = checkout_day - checkin_day;
I am getting the same result for that too.
My entire code:
data hotel;
infile "XXXX\Hotel.dat";
input room_no num_guests checkin_month checkin_day checkin_year checkout_month checkout_day checkout_year internet_used $ days_used room_type $16. room_rate;
checkin_date = mdy(checkin_month,checkin_day,checkin_year);
informat checkin_date mmddyy.;
format checkin_date mmddyy10.;
checkout_date = mdy(checkout_month,checkout_day,checkout_year);
informat checkout_date mmddyy.;
format checkout_date mmddyy10.;
stay_day= intck('day', checkin_day, checkout_day);
Your problem is a typo - using wrong variables in intck() function. You are using variables "xxx_DAY" which is the DAY of month instead of the full DATE. Change to stay_day= intck('day', checkin_date, checkout_date);
Your data probably has the date values in the wrong variables. When using subtraction the order should be ENDDATE - STARTDATE. When using the INTCK() function the order should be from STARTDATE to ENDDATE. In either case, if the value in the STARTDATE variable is AFTER the value in the ENDDATE variable then the difference will be a negative number.
Perhaps you need to clean the data?
The only way to get -18 comparing 2014-01-31 and 2014-02-13 would be if you extracted the day of the month and subtracted them.
diff3 = day(end) - day(start);
which would be the same as subtracting 31 from 13.
Example using your dates:
data check;
input start end ;
informat start end mmddyy.;
format start end yymmdd10.;
diff1=intck('day',start,end);
diff2=end-start;
cards;
02/06/2014 02/11/2014
1/31/2014 2/13/2014
;
Results:
Obs start end diff1 diff2
1 2014-02-06 2014-02-11 5 5
2 2014-01-31 2014-02-13 13 13

Changing the values of Dates in SAS

Complete novice with SAS and I'm trying to convert a yearly range of dates to just "2014", "2015" & "2016". So for example I have an Orders column with lots of dates in 2014, 2015 and 2016 and want to convert the values in each year to just the name of the year. The code I was trying to use is below.
Data SortingDates;
set work.ClaraData;
if OrderDate <='31Dec2014'd then OrderDate = "2014";
if '01Jan2015'd <= OrderDate <= '31Dec2015'd then OrderDate= "2015";
if '01Jan2016'd <= OrderDate <= '31Dec2016'd then OrderDate = "2016";
run;
However this message comes: Character values have been converted to numeric values at the places given by...
Plus when printing the data, the dates all come out as 09/07/1965
The OrderDate column is properly formatted as "OrderDate Num 8 DDMMYY10. DDMMYY10."
Thanks!
You are getting the message because you tried to assign the character string "2014" to the numeric variable OrderDate. SAS probably successfully converted "2014" into the number 2,014 for you, but since you didn't change the format it displays as 07/07/1965, which is the date 2,014 days after 01JAN1960. Likewise, 2,016 displays as 09/07/1965.
It is probably easiest if you use the YEAR() function to get the year of a date value.
OrderYear = year(OrderDate);
But you could also just try using the YEAR. format on your existing OrderDate variable.
proc freq data=ClaraData ;
tables OrderDate ;
format OrderDate year. ;
run;
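A minimal runnable sketch of the YEAR() approach. The three sample dates are made up, and ClaraData here is only a stand-in for the real data set:

```sas
/* Hypothetical stand-in for the asker's data */
data ClaraData;
    input OrderDate :ddmmyy10.;
    format OrderDate ddmmyy10.;
    datalines;
15/03/2014
02/11/2015
09/07/2016
;
run;

data SortingDates;
    set ClaraData;
    OrderYear = year(OrderDate);   /* numeric 2014, 2015, 2016; OrderDate is left untouched */
run;

proc print data=SortingDates;
run;
```

Keeping the year in a separate variable avoids the display problem from the question, because the DDMMYY10. format stays attached only to the real date.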
Try the year function (page 15 of this PDF): https://www.sas.com/storefront/aux/en/spfunctionxexample/62857_excerpt.pdf
Data SortingDates;
set work.ClaraData;
OrderYear = YEAR(OrderDate); /* new variable, so the DDMMYY10. format on OrderDate does not apply to it */
run;
Or keeping it as a date, try the year format (like page 8 of the same pdf)
Data SortingDates;
set work.ClaraData;
format OrderDate year4.;
run;