I need to separate every date created by the code below so that I can create a variable for each date.
The following loop builds all the dates into a single variable.
data _null_;
End_date= /*today()-1;*/ '30JUN2021'd;
call symput('days',intck('day',intnx("year",End_date,-1,'b'), End_date));
End_month=intnx("month",End_date,0,'e');
format End_month End_date ddmmyy10.;
run;
data _null_;
length varnew $30000.;
End_date= /*today()-1;*/ '30JUN2021'd;
do i=0 to &days.;
if i=0 then do;
varnew=put(intnx("year",End_date,-1,'b'),ddmmyy10.);
end;
else do;
if intnx("year",End_date,-1,'b')+i ne intnx("month",intnx("year",End_date,-1,'b')+i,0,'e') then do;
varnew= trim(varnew)||" "||put(intnx("year",End_date,-1,'b')+i,ddmmyy10.);
end;
else do;
varnew= trim(varnew)||" "||put(intnx("year",End_date,-1,'b')+i,ddmmyy10.)||" "||put(intnx("year",End_date,-1,'b')+i,mmyys7.);
end;
end;
end;
if End_date ne intnx("month",end_date,0,'e') then do;
call symput('varnew',trim(varnew)||" "||put(intnx("year",End_date,-1,'b')+&days.,mmyys7.)||" "||put(intnx("year",End_date,-1,'b')+&days.,mmyys7.)||"*");
end;
else do;
call symput('varnew',trim(varnew)||" "||put(intnx("year",End_date,-1,'b')+&days.,mmyys7.)||"*");
end;
run;
%put --&varnew.--;
The result looks like this:
--01/01/2020 02/01/2020 03/01/2020 04/01/2020 05/01/2020 06/01/2020 07/01/2020 08/01/2020 09/01/2020 10/01/2020 11/01/2020
12/01/2020 13/01/2020 14/01/2020 15/01/2020 16/01/2020 17/01/2020 18/01/2020 19/01/2020 20/01/2020 21/01/2020 22/01/2020 23/01/2020
24/01/2020 25/01/2020 26/01/2020 27/01/2020 28/01/2020 29/01/2020 30/01/2020 31/01/2020 01/2020 01/02/2020 02/02/2020 03/02/2020
04/02/2020 05/02/2020 06/02/2020
What I want is one variable per date, like this:
| VAR1 | VAR2 | VAR3 |
|-------------| ------------|-------------|
| 01/01/2020 | 02/01/2020 | 03/01/2020 |
You can do this all in one step. Set your end date and start date, then loop through and create a macro variable for each one. The first argument of call symputx() accepts functions as well. We'll create a counter for each variable, then append it to the name var.
The output statement and dataset created are not needed and are here for descriptive purposes only.
data dates;
end_date = '30JUN2021'd;
start_date = intnx('year', end_date, -1, 'B');
do date = start_date to end_date;
i = date-start_date+1;
output;
/* Dynamically create new macro variables named var1, var2, etc. */
call symputx(cats('var', i), put(date, ddmmyys10.) );
end;
format end_date start_date date ddmmyys10.;
run;
%put var1: &var1;
%put var2: &var2;
%put var3: &var3;
Output:
var1: 01/01/2020
var2: 02/01/2020
var3: 03/01/2020
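Once the numbered macro variables exist, they can be read back in a loop by resolving the name indirectly with a double ampersand. This is a minimal sketch, assuming var1 through var3 were created by the step above; the macro name show_dates and its parameter n are hypothetical and not part of the original answer.

```sas
/* Hypothetical helper: write var1..var&n to the log */
%macro show_dates(n);
  %local i;
  %do i = 1 %to &n;
    /* &&var&i becomes &var1, &var2, ... on the second resolution pass */
    %put var&i: &&var&i;
  %end;
%mend show_dates;

%show_dates(3)
```

Called with 3, this should write the same three lines as the individual %put statements above.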
If you're trying to create data step variables where you have one variable for each date, you can do a similar thing with arrays. You can't dynamically assign an array size, so we'll set our end/start dates and array size using macro variables.
/* Create date constants for array */
%let end_date = '30JUN2021'd;
%let start_date = %sysfunc(intnx(year, %sysfunc(putn(&end_date, 8.)), -1, B));
%let n_dates = %eval(%sysfunc(putn(&end_date., 8.)) - &start_date.+1);
/* Generate dates for each array variable */
data dates;
end_date = &end_date.;
start_date = &start_date.;
array var[&n_dates.];
do date = start_date to end_date;
i = date-start_date+1;
var[i] = date;
end;
format start_date end_date var: ddmmyys10.;
run;
Output:
var1 var2 var3
01/01/2020 02/01/2020 03/01/2020
I hope you can assist.
I have a SAS data set which has two columns, ID and Date which looks like this:
In some instances, the date column skips a month. I need code that will create the missing date for each ID, e.g. 2022/11/20 for AY273 and 2022/12/15 for WG163.
You can merge the data with itself shifted one observation forward (to get a lead value) and loop across that range.
Example:
data have;
input id $ date yymmdd10.;
format date yymmdd10.;
datalines;
AAAAA 2021-11-20
AY273 2022-10-20
AY273 2022-12-20
AY273 2023-01-20
WG163 2022-10-15
WG163 2022-11-15
WG163 2023-01-15
ZZZZZ 2022-01-15
;
data want(keep=id date fillflag);
merge have have(rename=(date=leaddate id=leadid) firstobs=2);
if id eq leadid then
do while (intck('month',date,leaddate) > 0);
output;
date = intnx('month',date,1,'sameday');
fillflag = 1;
end;
else
output;
run;
Try this
data WANT (drop = this_date last_date);
set HAVE(rename=(date = this_date));
by id;
last_date = lag(this_date);
if first.id then do;
date = this_date;
output;
end;
else do date = this_date to last_date + 16 by -30;
output;
end;
format date yymmdd10.;
proc sort;
by id date;
run;
If it does not work, I will correct it.
I am facing an issue while importing an Excel file into the SAS environment. The Excel file has a few columns named as follows:
Geography
AR_NO
31-Jan-18
28-Feb-18
31-Mar-18
30-Apr-18
31-May-18
30-Jun-18
After running the code below:
%macro FX_Lkup(sheet);
FILENAME FXFILE "/idn/home/Module2/excel.xlsx";
PROC IMPORT DATAFILE=FXFILE
DBMS=XLSX
OUT=&sheet.
REPLACE
;
SHEET="&sheet.";
RUN;
%mend FX_Lkup;
%FX_Lkup(LENDING_TEMPLATE);
%FX_Lkup(2018FXRates);
the SAS data set shows the columns as
Geography
AR_NO
43131
43159
43190
43220
and so on.
Does anyone have a solution for this? Any lead would be really appreciated : )
Thanks !
It is imported correctly; SAS stores dates as numbers. To display a date in your final table, you need to assign a date format, for instance format = AFRDFDE7.
If you have mixed character and numeric values in the same column then SAS will be forced to create the variable as character. When it does that it stores the number that Excel uses for the date as a string of digits. To convert it to a date in SAS first convert the string of digits to a number and then adjust the number to account for the difference in how SAS and Excel count days.
data want ;
set LENDING_TEMPLATE ;
date = input(geography,??32.) + '30DEC1899'd;
format date date9.;
run;
Dates as headers
If your Excel file is using dates as column headers then SAS will also convert them to digit strings, since variable names are always character strings, not numbers. One quick way to fix it is to use PROC TRANSPOSE. This will be easy when each row is uniquely identified by the other variables and when all of the "date" variables are numeric.
proc transpose data=LENDING_TEMPLATE out=tall ;
by geography ar_no ;
run;
data tall ;
set tall ;
date = input(_name_ , 32.) + '30DEC1899'd ;
format date date9. ;
drop _name_;
run;
You could stop here as now you have a useful dataset where the date values are in a variable instead of hiding in the metadata (variable name).
To get back to your original wide layout just add another PROC TRANSPOSE and tell it to use DATE as the ID variable.
proc transpose data=tall out=wide ;
by geography ar_no ;
id date;
var col1;
run;
IMPORT is using Excel date valued column headers as Excel epoch date numbers.
Use Proc DATASETS to change the column label, and possibly rename the columns to something generic such as DATE1-DATE6. Or, continue on, and further unpivot the data into a categorical form with columns GEO, AR_NO, DATE, VALUE
You might be asking yourself "Where do those numbers, such as 43131, come from?", or "What is an Excel epoch date number?"
They are unformatted Excel date values. The human-readable date represented by a number is determined by the system's epoch, that is, the date represented by the number 0.
Different systems of time keeping have different epochs (starting points) and time units. Some examples:
30DEC1899 Excel datetime number 0, 1 = 1 day
01JAN1960 SAS date number 0, 1 = 1 day
01JAN1960 SAS datetime number 0, 1 = 1 second
01JAN1970 Unix OS datetime number 0, 1 = 1 second
To convert an Excel date number to a SAS date number you need to subtract 21916, which is the number of days from 30DEC1899 to 01JAN1960.
This understanding of date epochs will be used when setting the label of a SAS column and renaming the column.
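The arithmetic can be checked on one of the header values from the question. This is a small sketch, not part of the original answer; the variable names excel_date, offset and sas_date are made up for illustration. It uses 30DEC1899 as the effective Excel day 0, which holds for dates from March 1900 onward because Excel treats 1900 as a leap year.

```sas
data _null_;
  excel_date = 43131;               /* header value shown in the question   */
  offset     = '30DEC1899'd;        /* SAS date number of Excel day 0       */
  sas_date   = excel_date + offset; /* adding -21916 is subtracting 21916   */
  put offset= sas_date= date9.;     /* offset unformatted, sas_date as date */
run;
```

Here offset is -21916, so sas_date is 43131 - 21916 = 21215, which SAS displays as 31JAN2018 and matches the 31-Jan-18 header in the workbook.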
For others fiddling with code, the following will create an Excel worksheet having Date valued column headers. I speculate such a situation can otherwise arise when importing a worksheet containing an Excel pivot table.
First create some sample SAS data
data demo_tall;
do Geography = 'Mountains', 'Plains';
do AR_NO = 1 to 3;
_n_ = 0;
length label $200;
do label = '31-Jan-18', '28-Feb-18', '31-Mar-18',
'30-Apr-18', '31-May-18', '30-Jun-18'
;
_n_ + 1;
name = cats('Date',_n_);
value + 1;
output;
end;
end;
end;
run;
proc transpose data=demo_tall out=demo_wide(drop=_name_);
by Geography AR_NO;
var value;
id name;
idlabel label;
run;
Sample SAS data set (pivoted with transpose)
Then create Excel sheet with Excel date valued and formatted column headers
ods noresults;
ods excel file='%TEMP%\across.xlsx' options(sheet_name='Sample');
data _null_;
declare odsout xl();
if 0 then set demo_wide;
length _name_ $32;
xl.table_start();
* header;
xl.row_start();
do _n_ = 1 to 100; * 100 column guard;
call vnext(_name_);
if _name_ = '_name_' then leave;
_label_ = vlabelx(_name_);
_date_ = input(_label_, ?? date9.);
* make some header cells an Excel date formatted value;
if missing(_date_) then
xl.format_cell(data:_label_);
else
xl.format_cell(
data:_date_,
style_attr:"width=9em tagattr='type:DateTime format:dd-mmm-yy'"
);
end;
xl.row_end();
* data rows;
do _n_ = 1 by 1 while (not lastrow);
set demo_wide end=lastrow;
xl.row_start();
call missing(_name_);
do _index_ = 1 to 100; * 100 column guard;
call vnext(_name_);
if _name_ = '_name_' then leave;
xl.format_cell(data:vvaluex(_name_));
end;
xl.row_end();
end;
xl.table_end();
stop;
run;
ods excel close;
ods results;
Excel file created
IMPORT Excel worksheet
Log will show the 'funkiness' of date valued column headers
options msglevel=I;
proc import datafile='%temp%\across.xlsx' dbms=xlsx replace out=want;
sheet = "Sample";
run;
proc contents noprint data=want out=want_meta(keep=name label varnum);
run;
----- LOG -----
1380 proc import datafile='%temp%\across.xlsx' dbms=xlsx replace out=want;
1381 sheet = "Sample";
1382 run;
NOTE: Variable Name Change. 43131 -> _43131
NOTE: Variable Name Change. 43159 -> _43159
NOTE: Variable Name Change. 43190 -> _43190
NOTE: Variable Name Change. 43220 -> _43220
NOTE: Variable Name Change. 43251 -> _43251
NOTE: Variable Name Change. 43281 -> _43281
NOTE: VARCHAR data type is not supported by the V9 engine. Variable Geography has been converted
to CHAR data type.
NOTE: The import data set has 6 observations and 8 variables.
NOTE: WORK.WANT data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
Modify the header (metadata) of the imported data set
Date valued column names will be renamed DATE1-DATE6 and the label will be changed to be the corresponding date in SAS format DATE11. (dd-mon-yyyy)
%let renames=;
%let labels=;
data _null_;
length newname $32;
length renames labels $32767;
retain renames labels;
set want_meta end=lastvar;
date = coalesce(input(label, ?? 5.),1e9) + '30dec1899'd;
if '01jan1980'd < date < today() then do;
index + 1;
newname = cats('DATE',index);
label = quote(trim(put(date,date11.)));
labels = catx(' ', labels, catx('=',name,label));
renames = catx(' ', renames, catx('=',name,newname));
end;
if lastvar;
if not missing(labels) then call symput('labels', trim('LABEL ' || labels));
if not missing(renames) then call symput('renames', trim('RENAME ' || renames));
run;
proc datasets nolist lib=work;
modify want;
&labels;
&renames;
run;
quit;
%symdel labels renames;
%let syslast = want;
The result, when printed.
Optional
Unpivot to a categorical form (tall layout)
proc transpose data=want out=stage1(rename=(col1=value _label_=date_string));
by geography ar_no;
var date:;
label _name_ = ' ';
label date_string = ' ';
run;
data want_tall;
set stage1;
date = input (date_string, date11.);
format date date11.;
keep geography ar_no _name_ date value;
run;
I am given 2 dates, a start date and an end date.
I would like to know the date of the first 35 day period, then each subsequent 30 day period.
I have;
start end
22-Jun-15 22-Oct-15
9-Jan-15 15-May-15
I want;
start end tik1 tik2 tik3 tik4
22-Jun-15 22-Oct-15 27-Jul-15 26-Aug-15 25-Sep-15
9-Jan-15 15-May-15 13-Feb-15 15-Mar-15 14-Apr-15 14-May-15
I am fine with the dates calculations but my real issue is creating a variable and incrementing its name. I decided to include my whole problem because I thought it might be easier to explain in its context.
You can solve the problem with the following logic:
1) Determine the number of columns to be added.
2) Calculate the values for the columns based on the requirement.
data test;
input start end;
informat start date9. end date9.;
format start date9. end date9.;
datalines;
22-Jun-15 22-Oct-15
09-Jan-15 15-May-15
;
run;
/*******Determining number of columns*******/
data noc_cal;
set test;
no_of_col = floor((end-start)/30);
run;
proc sql;
select max(no_of_col) into: number_of_columns from noc_cal;
quit;
/*******Making an array where 1st iteration(tik1) is increased by 35days whereas others are incremented by 30days*******/
data test1;
set test;
array tik tik1-tik%sysfunc(COMPRESS(&number_of_columns.));
format tik: date9.;
tik1 = intnx('DAYS',START,35);
do i= 2 to %sysfunc(COMPRESS(&number_of_columns.));
tik[i]= intnx('DAYS',tik[i-1],30);
if tik[i] > end then tik[i]=.;
end;
drop i;
run;
Alternate way (in case you don't want to use proc sql)
data test;
input start end;
informat start date9. end date9.;
format start date9. end date9.;
datalines;
22-Jun-15 22-Oct-15
09-Jan-15 15-May-15
;
run;
/*******Determining number of columns*******/
data noc_cal;
set test;
no_of_col = floor((end-start)/30);
run;
proc sort data=noc_cal;
by no_of_col;
run;
data _null_;
set noc_cal;
by no_of_col;
if last.no_of_col;
call symputx('number_of_columns',no_of_col);
run;
/*******Making an array where 1st iteration(tik1) is increased by 35days whereas others are incremented by 30days*******/
data test1;
set test;
array tik tik1-tik%sysfunc(COMPRESS(&number_of_columns.));
format tik: date9.;
tik1 = intnx('DAYS',START,35);
do i= 2 to %sysfunc(COMPRESS(&number_of_columns.));
tik[i]= intnx('DAYS',tik[i-1],30);
if tik[i] > end then tik[i]=.;
end;
drop i;
run;
My output:
> **start |end |tik1 | tik2 |tik3 |tik4**
> 22Jun2015 |22Oct2015 |27Jul2015| 26Aug2015|25Sep2015|
> 09Jan2015 |15May2015 |13Feb2015| 15Mar2015|14Apr2015|14May2015
I tend to prefer long vertical structures. I would approach it like:
data want;
set have;
tik=start+35;
do while(tik<=end);
output;
tik=tik+30;
end;
format tik mmddyy10.;
run;
If you really need it wide, you could transpose that dataset in a second step.
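The second step could look like the sketch below. It assumes the data set want from the step above still carries start and end on every row, and that all rows for one start/end pair sit together; the output name want_wide is hypothetical.

```sas
/* Turn the tall tik rows into tik1, tik2, ... columns */
proc transpose data=want out=want_wide(drop=_name_) prefix=tik;
  by start end notsorted;  /* notsorted: groups are contiguous but not in sorted order */
  var tik;
run;
```

NOTSORTED is used because the sample rows are not ordered by start. If two adjacent input rows shared the same start and end, they would fall into one BY group, so a unique row identifier would be a safer BY variable in that case.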
I have the following two macro variables:
%let start_date = 29MAY2014;
%let end_date = 15JUL2014;
I would like to create a dataset which is a series of dates between these (inclusive.) I cannot change the input format of the macro variables &start_date and &end_date.
I have tried many variations of the following, but SAS spits out an error for each:
data base_dates;
do date = put("&start_date",date9.) to put("&end_date",date9.);
output;
end;
format date date11.;
run;
Any help in this would be much appreciated
Use them as date literals: enclose each in double quotes (so the macro variable resolves) and add a d at the end.
Do date = "&start_date"d to "&end_date"d;
It was simple; input() instead of put()
data base_dates;
do date = input("&start_date",date9.) to input("&end_date",date9.);
output;
end;
format date date11.;
run;
I want to convert a character date variable (categorical) in the format 9/12/1990, 10/1/1990, etc. into this format: 09/12/1990, 10/01/1990, etc. (mmddyy10.) using SAS.
format date_new mmddyy10.;
date_new =input(trim(VAR1),mmddyy10.);
The code is not working.
Try this:
data new;
set old;
/* Parse character date components */
array dt[*] month day year;
do i = 1 to 3;
dt[i] = input(scan(var1, i, "/"), best.);
end;
/* Reconstruct date */
date_new = mdy(month, day, year);
format date_new mmddyy10.;
run;
Documentation links:
mdy
scan
SAS dates
Arrays