SAS, define macro variable - macros

I need to define the following macro variables. The user needs to specify the input variables in group; there can be 1, 2, or 3 variables, like A, or A, B, or A, B, C. Currently they also need to specify group_2 and group_3 manually. But as you can see, once group is fixed, group_2 and group_3 are fixed as well.
Is there any way to make the macro variable input more concise (the user just needs to input group, and group_2 and group_3 are generated automatically)?
%let group = A B;
%let group_2 = A, B;
%let group_3 = A trimmed, : B trimmed;
%let group = A B C;
%let group_2 = A, B, C;
%let group_3 = A trimmed, : B trimmed, : C trimmed;

Presuming the SAS variable names are standard names, you can use TRANWRD to transform a space-separated list of items into a more complicated form.
COMPBL replaces repeated spaces with a single space.
TRANWRD then replaces the single space separating the items with whatever text should appear between them (a comma for the CSV list, or " trimmed, :" for the INTO list).
%let list = %sysfunc(compbl(&list));
%let list_csv = %sysfunc(tranwrd(&list,%str( ),%str(,)));
%let list_into = :%sysfunc(tranwrd(&list,%str( ),%str( trimmed, :))) trimmed;
%put NOTE: &=list;
%put NOTE: &=list_csv;
%put NOTE: &=list_into;
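Applied to the names used in the question, the same idea might look like the sketch below. It assumes the user supplies only group, and that the intended form of group_3 is the ":A trimmed, :B trimmed" pattern produced by the code above; the sample value of group is made up for illustration.

```sas
/* User input: a space-separated list (extra blanks are tolerated) */
%let group = A   B C;

/* Collapse repeated blanks so exactly one blank separates the items */
%let group   = %sysfunc(compbl(&group));

/* Derive the comma-separated list */
%let group_2 = %sysfunc(tranwrd(&group,%str( ),%str(, )));

/* Derive the INTO-style list (assumed target form, see lead-in) */
%let group_3 = :%sysfunc(tranwrd(&group,%str( ),%str( trimmed, :))) trimmed;

%put NOTE: &=group;
%put NOTE: &=group_2;
%put NOTE: &=group_3;
```

Because group_2 and group_3 are derived from group, the user only ever edits the first %let.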

Related

Calling the column values when the column names are date macro variables

In SAS, I have a dataset which has 5 columns and 4 rows. The column names are date macro variables.
I want to subtract the values in one column from another. (Date of column 4 - date of column 3) doesn't work. This subtracts the date itself and not the values in those columns.
How do I call the values of the columns?
Please help.
Example: there are five columns (12/1/2019, 12/1/2020, 12/1/2021, 12/1/2022, 12/1/2023) and four rows (A, B, C, D), with some values stored in them.
In the above table, I want to add a column which prints the difference between the values on all dates for all the rows (A,B,C,D).
Also sim_date= 12/1/20, f_starting=12/1/2019, f_1=12/1/2021, f_2=12/1/2022, f_3=12/1/2023. These dates are all macro variables.
But when I write the code as
data test;
set test;
format g0 g1 g2 g3 percent5.2 ;
g0 = (&sim_date - &f_starting)/&f_starting;
g1 = (&f_1 - &sim_date)/&sim_date ;
g2 = (&f_2 - &f_1)/&f_1 ;
g3 = (&f_3- &f_2)/&f_2 ;
run;
This code subtracts the two dates instead of the values stored in the dates. How do I call the values?
Use the "varname"n syntax so that SAS knows you are referring to the variable instead of the value.
data test;
set test;
format g0 g1 g2 g3 percent5.2 ;
g0 = ("&sim_date"n - "&f_starting"n)/"&f_starting"n;
g1 = ("&f_1"n - "&sim_date"n)/"&sim_date"n ;
g2 = ("&f_2"n - "&f_1"n)/"&f_1"n ;
g3 = ("&f_3"n - "&f_2"n)/"&f_2"n ;
run;
You reference a variable by its name. So if your variables are named sim_date and f_starting then your code might be:
g0 = (sim_date - f_starting)/f_starting;
If your variables are actually using those non-standard names that start with digits or have slashes or other non standard characters in them then you need to use a name literal. That is a quoted string suffixed with the letter n. So if the variables are named 2022/01/01 and 2022/02/01 for the first days of the first two months of 2022 then your code needs to look like:
g0 = ("2022/02/01"n - "2022/01/01"n)/"2022/01/01"n;
So either set the macro variables to name literals:
%let sim_date="2022/02/01"n;
%let f_starting="2022/01/01"n;
and then your current code will work.
g0 = (&sim_date - &f_starting)/&f_starting;
Or leave your macro variables with strings that match the actual variable names and convert them to name literals when you use them in your code:
%let sim_date = 2022/02/01;
%let f_starting = 2022/01/01;
g0 = ("&sim_date"n - "&f_starting"n)/"&f_starting"n;
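For a self-contained illustration of the second option, here is a minimal sketch. The dataset, the two column names, and the values are invented for the example, and it assumes VALIDVARNAME=ANY is acceptable in your session (non-standard names require it).

```sas
/* Allow variable names that are not standard SAS names */
options validvarname=any;

/* Hypothetical test data: two columns named after dates */
data test;
  "12/1/2019"n = 100;
  "12/1/2020"n = 110;
run;

/* Macro variables hold the plain column names, not name literals */
%let f_starting = 12/1/2019;
%let sim_date   = 12/1/2020;

data want;
  set test;
  /* "&name"n turns the resolved text into a variable reference */
  g0 = ("&sim_date"n - "&f_starting"n) / "&f_starting"n;
  format g0 percent8.2;
run;

proc print data=want;
run;
```

Without the quotes and the trailing n, the resolved text 12/1/2020 would be evaluated as an arithmetic expression, which is the behavior described in the question.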

Macro Do Until Loop in SAS

My loop is only making 1 iteration. I am supposed to create three macro variables: var1 = Month1, var2 = Month2, and var3 = Month3 if qtr = qtr1. My loop is only creating var1 = Month1 and I = 1 when I checked it with a Put statement. It is only making one iteration, so I'm not sure what I am doing wrong.
%Let qtr = qtr1;
%Macro Firstqtr(qtr);
%Let I = 1;
%If &qtr = qtr1 %then %do %until (&I > 3);
%Let var&I = Month&I;
%let I = %eval(&I + 1);
%end;
%Mend Firstqtr;
%Firstqtr(qtr);
Your %DO loop will never run given the input you made for the QTR parameter to your macro. You can turn on MLOGIC to see this.
1228 options mlogic;
1229 %Firstqtr(qtr);
MLOGIC(FIRSTQTR): Beginning execution.
MLOGIC(FIRSTQTR): Parameter QTR has value qtr
MLOGIC(FIRSTQTR): %LET (variable name is I)
MLOGIC(FIRSTQTR): %IF condition &qtr = qtr1 is FALSE
MLOGIC(FIRSTQTR): Ending execution.
If you want to pass in qtr1 as the value either hard code it in the macro call.
%Firstqtr(qtr1);
Or you could make your call pass in the macro variable you defined earlier.
%let qtr=qtr1;
%Firstqtr(&qtr);
It might make this distinction between the parameter's value and the value of an external macro variable with the same name clearer if you call the macro using named parameters. Note: you can use parameter names in the macro call even for parameters that were defined as positional in the macro definition.
%Firstqtr(qtr=&qtr);
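To see the difference between the three call styles without the loop getting in the way, a small sketch like the following can help. The macro name show_qtr is made up for this illustration.

```sas
%let qtr = qtr1;

/* Hypothetical helper: just report what the parameter resolves to */
%macro show_qtr(qtr);
  %put NOTE: inside the macro, qtr resolves to: &qtr;
%mend show_qtr;

%show_qtr(qtr)        /* passes the literal text qtr            */
%show_qtr(&qtr)       /* passes the value of the global, qtr1   */
%show_qtr(qtr=&qtr)   /* same value, using the parameter name   */
```

Inside the macro, &qtr always refers to the parameter; the global macro variable with the same name only matters at the point of the call, where &qtr is resolved before the macro starts.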
option mprint;
%global qtr;
%Let qtr = qtr1;
%Macro Firstqtr(qtr);
%Let I = 1;
%If &qtr = &qtr %then %do %until (&I > 3);
%Let var&I = Month&I;
%let I = %eval(&I + 1);
%end;
%put &var1 &var2 &var3;
%Mend Firstqtr;
%Firstqtr(qtr);
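This version declares qtr as a global variable and the %IF condition passes. Note, however, that &qtr = &qtr compares the parameter with itself, so the condition is true for any value passed in.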
The issue is one of macro variable scope. qtr is defined both globally (line 1) and locally (as a macro parameter), so inside the macro the local one is used, and its value is the literal text qtr that was passed in the call.
Try passing it through in your parameter as follows:
%Let qtr = qtr1;
%Macro Firstqtr(qtr);
%Let I = 1;
%If &qtr = qtr1 %then %do %until (&I > 3);
%global var&i;
%Let var&I = Month&I;
%put var&i=&&var&i;
%let I = %eval(&I + 1);
%end;
%mend Firstqtr;
%Firstqtr(&qtr);
Be aware that the variables you are creating would have local scope - to make them global, you declare them as such (%global statement).

recode and add prefix to sas variables

Let's say I have a bunch of variables named the same way and I'd like to recode them and add a prefix to each (the variables are all numeric).
In Stata I would do something like (let's say the variables start with eq)
foreach var of varlist eq* {
recode `var' (1/4=1) (else=0), pre(r_)
}
How can I do this in SAS? I'd like to use the %DO macros, but I'm not familiar with them (I want to avoid SQL). I'd appreciate if you could include comments explaining each step!
SAS syntax for this would be easier if your variables are named using numeric suffix. That is, if you had ten variables with names of eq1, eq2, .... , eq10, then you could just use variable lists to define both sets of variables.
There are a number of ways to translate your recode logic. If we assume you have clean variables then we can just use a boolean expression to generate a 0/1 result. So if 4 and 5 map to 1 and the rest map to 0 you could use x in (4,5) or x > 3 as the boolean expression.
data want;
set have;
array old eq1-eq10 ;
array new r_eq1-r_eq10 ;
do i=1 to dim(old);
new(i) = old(i) in (4,5);
end;
run;
If you have missing values or other complications you might want to use IF/THEN logic or a SELECT statement or you could define a format you could use to convert the values.
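As a rough sketch of the IF/THEN variant, using the mapping from the question (1 through 4 become 1, everything else 0): it assumes variables eq1-eq10 as above, and it assumes you want missing values to stay missing rather than fall into the "else" group, which is a choice you may want to change.

```sas
data want;
  set have;
  array old eq1-eq10;
  array new r_eq1-r_eq10;
  do i = 1 to dim(old);
    /* Assumption: keep missing as missing instead of recoding to 0 */
    if missing(old(i)) then new(i) = .;
    /* 1 when the value lies in 1..4, otherwise 0 */
    else new(i) = (1 <= old(i) <= 4);
  end;
  drop i;
run;
```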
If your list of names is more random then you might need to use some code generation, such as macro code, to generate the new variable names.
Here is one method that uses the eq: variable list syntax in SAS, which is similar to the syntax of your variable selection before. Use PROC TRANSPOSE on an empty (obs=0) version of your source dataset to get a dataset with the variable names that match your name pattern.
proc transpose data=have(obs=0) out=names;
var eq: ;
run;
Then generate two macro variables with the list of old and new names.
proc sql noprint ;
select _name_
, cats('r_',_name_)
into :old_list separated by ' '
, :new_list separated by ' '
from names
;
quit;
You can then use the two macro variables in your ARRAY statements.
array old &old_list ;
array new &new_list ;
You can do this with rename and a dash indicating which variables you want to rename. Note the following only renames the col variables, and not the other one:
data have;
col1=1;
col2=2;
col3=3;
col5=5;
other=99;
col12=12;
run;
%macro recoder(dsn = , varname = , prefix = );
/*select all variables that include the string "varname"*/
/*(you can change this if you want to be more specific on the conditions that need to be met to be renamed)*/
proc sql noprint;
select distinct name into: varnames
separated by " "
from dictionary.columns where memname = upcase("&dsn.") and index(name, "&varname.") > 0;
quit;
data want;
set &dsn.;
/*loop through that list of variables to recode*/
%do i = 1 %to %sysfunc(countw(&varnames.));
%let this_varname = %scan(&varnames., &i.);
/*create a new variable with desired prefix based on value of old variable*/
if &this_varname. in (1 2 3) then &prefix.&this_varname. = 0;
else if &this_varname. in (4 5) then &prefix.&this_varname. = 1;
%end;
run;
%mend recoder;
%recoder(dsn = have, varname = col, prefix = r_);
PROC TRANSPOSE will give you good flexibility with regards to the way your variables are named.
proc transpose data=have(obs=0) out=vars;
var col1-numeric-col12;
copy col1;
run;
proc transpose data=vars out=revars(drop=_:) prefix=RE_;
id _name_;
run;
data recode;
set have;
if 0 then set revars;
array c[*] col1-numeric-col12;
array r[*] re_:;
call missing(of r[*]);
do _n_ = 1 to dim(c);
if c[_n_] in(1 2 3) then r[_n_] = 0;
else if c[_n_] in(4 5) then r[_n_] = 1;
else r[_n_] = c[_n_];
end;
run;
proc print;
run;
It would be nearly trivial to write a macro to parse almost that exact syntax.
I wouldn't necessarily use this - I like both the transpose and the array methods better, both are more 'SASsy' (think 'pythonic' but for SAS) - but this is more or less exactly what you're doing above.
First set up a dataset:
data class;
set sashelp.class;
age_ly = age-1;
age_ny = age+1;
run;
Then the macro:
%macro do_count(data=, out=, prefix=, condition=, recode=, else=, var_start=);
%local dsid varcount varname rc; *declare local for safety;
%let dsid = %sysfunc(open(&data.,i)); *open the dataset;
%let varcount = %sysfunc(attrn(&dsid,nvars)); *get the count of variables to access;
data &out.; *now start the main data step;
set &data.; *set the original data set;
%do i = 1 %to &varcount; *iterate over the variables;
%let varname= %sysfunc(varname(&dsid.,&i.)); *determine the variable name;
%if %upcase(%substr(&varname.,1,%length(&var_start.))) = %upcase(&var_start.) %then %do; *if it matches your pattern then recode it;
&prefix.&varname. = ifn(&varname. &condition., &recode., &else.); *this uses IFN - only recodes numerics. More complicated code would work if this could be character.;
%end;
%end;
%let rc = %sysfunc(close(&dsid)); *clean up after yourself;
run;
%mend do_count;
%do_count(data=class, out=class_r, var_start=age, condition= > 14, recode=1, else=0, prefix=p_);
The expression (1/4=1) means values {1,2,3,4} should be recoded into 1.
Perhaps you do not need to make new variables at all? If you have variables with values 1,2,3,4,5 and you want to treat them as if they have only two groups you could do it with a format.
First define your grouping using a format.
proc format ;
value newgrp 1-4='Group 1' 5='Group 2' ;
run;
Then you can just use a FORMAT statement in your analysis step to have SAS treat your five-level variable as if it had only two levels.
proc freq ;
tables eq: ;
format eq: NEWGRP. ;
run;

Get the ith word in a macro variable list

%let TableList = TableA TableH TableB TableG;
Words in &TableList are separated by ' '.
How can I retrieve certain word to do the following?
I do not know the number of words in the tablelist and would like to get the nth word from the list.
Given i = 4,
data &&table&i.; /* &&table&i. will resolve to TableG */
set have;
[..];
run;
I would have done the same %sysfunc(scan) trick as @mjsqu. As for your remaining question, getting the last word when you don't know the number of words in the list, the easiest way I can think of is to use an array, as below:
%let all=word1 word2 word3 word4 word5;
%macro test;
data _NULL_;
array x[*] &all.;
Num=dim(x);
call symputx("Num_of_words",num);
run;
%mend;
%test;
Now you know the total number of words so can find out the last word as well.
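If you would rather stay in pure macro code, a possible alternative (not part of the answer above) is to count the words with COUNTW through %sysfunc, or to ask %scan for the last word directly with a negative position. The macro variable names n_words and last_word are made up for this sketch.

```sas
%let TableList = TableA TableH TableB TableG;

/* Count the blank-delimited words in the list */
%let n_words = %sysfunc(countw(&TableList, %str( )));

/* Last word, via the count ... */
%let last_word = %scan(&TableList, &n_words, %str( ));
%put NOTE: &=n_words &=last_word;

/* ... or directly: a negative position makes %scan count from the right */
%put NOTE: last word via -1: %scan(&TableList, -1, %str( ));
```

This works in open code, so no wrapper macro or DATA step is needed.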
The short answer is to use the %scan function:
%put %scan(&tablelist,4,%str( ));
The third argument specifies that %scan should count only spaces as delimiters. Otherwise, it will also treat all of the following characters as delimiters by default:
. < ( + & ! $ * ) ; ^ - / , % |
Given the list you have, you can use a %do loop to add the macro variables to a list:
/* initialise a counter macro variable */
%let k = 1;
/* iterate through tablelist until a value is not found */
%do %until (%scan(&tablelist,&k,%str( )) = );
%let table&k = %scan(&tablelist,&k,%str( ));
%let k = %eval(&k + 1);
%end;
%let i = 4;
%put &&table&i;
N.B. this code only works inside a macro definition (that is, a block of code delimited by %macro and %mend statements).
If you're doing this for the purpose of selecting on the fly one word from the list, you should just make a macro, not try to set up macro variables. Too much extra work to do all that business to make the various macro variables versus a one-line macro.
%let tableList=TableA TableB TableC TableD;
%macro selectTable(k=);
%scan(&tablelist,&k)
%mend selectTable;
data %selectTable(k=4);
set sashelp.class;
run;

SAS: put format in macro

I am trying to create a new variable by assigning a format to an existing variable. I'm doing this from within a macro. I'm getting the following error: ": Expecting a format name." Any thoughts on how to resolve? Thanks!
/* macro to loop thru a list of vars and execute a code block on each. This is working fine. */
%macro iterlist
(
code =
,list =
)
;
%*** ASSIGN EACH ITEM IN THE LIST TO AN INDEXED MACRO VARIABLE &&ITEM&I ;
%let i = 1;
%do %while (%cmpres(%scan(&list., &i.)) ne );
%let item&i. = %cmpres(%scan(&list., &i.));
%let i = %eval(&i. + 1);
%end;
%*** STORE THE COUNT OF THE NUMBER OF ITEMS IN A MACRO VARIABLE: &CNTITEM;
%let cntitem = %eval(&i. - 1);
%*** EXPRESS CODE, REPLACING TOKENS WITH ELEMENTS OF THE LIST, IN SEQUENCE;
%do i = 1 %to &cntitem.;
%let codeprp = %qsysfunc(tranwrd(&code.,?,%nrstr(&&item&i..)));
%unquote(&codeprp.)
%end;
%mend iterlist;
/* set the list of variables to iterate thru */
%let mylist = v1 v2 v3 v4;
/* create a contents table to look up format info to assign in macro below*/
proc contents data=a.recode1 noprint out=contents;
run;
/* macro to create freq and chisq tables for each var */
%macro runfreqs (variabl = );
proc freq data=a.recode1 noprint ;
tables &variabl.*improved /out=&variabl._1 chisq;
output out=&variabl.chisq n pchi ;
run;
/* do some more stuff with the freq tables, then grab format for variable from contents */
data _null_;
set contents;
if name="&variabl." then CALL SYMPUT("classformat", format);
run;
data &variabl._3;
length classvalue $ 30 ;
set &variabl._2;
/* output a new var using the macro variable for format that we pulled from contents above. Here's where the error occurs. */
classvalue=put(class, %quote(&classformat.));
run;
%mend runfreqs;
* run the macro, iterating thru var list and creating freq tables;
%ITERLIST(list = &mylist., code = %nrstr(%runfreqs(variabl = ?);));
Just guessing, the line
classvalue=put(class, %quote(&classformat.));
should be
classvalue=put(class, &classformat..);
Two periods are needed: the first is "eaten" by the macro processor to mark the end of the macro variable name, and the second completes the format name.
I believe you don't need %quote() in your case, since a format name cannot contain any characters that would need macro quoting.
EDIT: Again not tried, just based on the code I see you also need to change CALL SYMPUT("classformat", format);
to CALL SYMPUTX("classformat", format);
CALL SYMPUTX() is an enhanced version of CALL SYMPUT(): it removes leading and trailing blanks from the macro variable value, whereas the original version keeps them. Effectively this is the same as your solution, just simpler.
So the problem is indeed with extra blanks between format name and the period.
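The effect of the trailing blanks can be seen in isolation with a small sketch like this one. The macro variable names fmt_raw and fmt_trim and the padded DOLLAR value are invented for the illustration; they stand in for the blank-padded FORMAT column read from the contents table.

```sas
data _null_;
  /* CALL SYMPUT keeps the trailing blanks of the character value */
  call symput ("fmt_raw",  "DOLLAR    ");
  /* CALL SYMPUTX strips leading and trailing blanks */
  call symputx("fmt_trim", "DOLLAR    ");
run;

/* Brackets make any trailing blanks visible in the log */
%put NOTE: [&fmt_raw.];
%put NOTE: [&fmt_trim.];

data _null_;
  /* Resolves to put(1234, DOLLAR.) : first period ends the macro */
  /* variable name, second period completes the format name       */
  x = put(1234, &fmt_trim..);
  put x=;
  /* Using &fmt_raw.. here would leave blanks before the period,  */
  /* which is the situation that triggers the error above         */
run;
```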
No idea why this works and vasja's idea wouldn't, but the problem was clearly with the period on the end of the format name (or perhaps some extra white space?). I changed the data step to add the period before the SYMPUT call:
data _null_;
set contents;
myformat=catt(format,'.');
if name="&variabl." then CALL SYMPUT("classformat", myformat);
run;