I have three lists stored in macro variables, created by a program, and I'm looking for an easy way to eliminate any values that appear in more than one list without looping back through the process that created them.
%let a=(1, 2, 3, 4);
%let b=(2, 8, 12);
%let c=(1, 3, 5, 7);
What I want is three new macro variables with every overlapping value removed, like this:
%let a_mod=(4);
%let b_mod=(8, 12);
%let c_mod=(5, 7);
I know there is probably a fairly straightforward way to do this in SAS. Any thoughts? Thanks.
What @Joe said in his comment is correct. Macro variables shouldn't be used to hold data; that's what datasets are for. Any solution that directly addresses the problem using macro functions and logic is likely to be fragile and error prone. Here is an approach that uses datasets. It's not entirely inelegant, at least in my opinion.
In your comment you mentioned that it's possible for you to store the values as datasets prior to creating the macro lists. That's great! I'll begin by assuming you have three datasets, a, b, and c, containing the values in a common variable called v.
First you want to join all three datasets.
proc sql;
create table abc as
select a.v as a,
b.v as b,
c.v as c
from a
full join b
on a.v = b.v
full join c
on coalesce(a.v, b.v) = c.v; /* coalesce also matches values found only in b and c */
quit;
In the resulting dataset, here called abc, you have columns a, b, and c, each containing the values from the dataset of the same name. Because the datasets were full joined, every value is present. In any given row, if exactly one of a, b, and c is non-missing, that value is not a duplicate.
Now you can separate the values back out, omitting duplicates, akin to your a_mod, etc. macro lists.
data a_mod(keep=a) b_mod(keep=b) c_mod(keep=c);
set abc;
if n(a, b, c) = 1 then do;
if a ne . then output a_mod;
else if b ne . then output b_mod;
else if c ne . then output c_mod;
end;
run;
The n() function counts the number of non-missing arguments.
Now if you need lists again for whatever the final use may be, you can recreate them using the into clause in proc sql (note that separated by belongs to the into clause, so it goes before from), for example:
proc sql noprint;
select a
into :a_mod separated by ', '
from a_mod;
quit;
%let a_mod = (&a_mod); /* To get the surrounding parentheses back */
While this will get you what you want in the case of just a few lists like in your question, it's good to note that a simple approach such as this one gets ugly and/or annoying if you have a ton of lists.
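If there are many lists, one possible way to keep this manageable is to stack every value into a single long dataset with a column identifying its list, and then keep the values that occur in exactly one list. The sketch below is only an illustration: the dataset names all_lists and unique_vals and the column name list are assumptions, not part of the original answer.

```sas
/* Hypothetical long-format input: one row per (list, value) pair */
data all_lists;
  input list $ v;
  datalines;
a 1
a 2
a 3
a 4
b 2
b 8
b 12
c 1
c 3
c 5
c 7
;
run;

/* Keep only the values that occur in exactly one list */
proc sql;
  create table unique_vals as
  select list, v
  from all_lists
  where v in (select v
              from all_lists
              group by v
              having count(distinct list) = 1)
  order by list, v;
quit;

/* Rebuild one macro list; repeat for the other list names as needed */
proc sql noprint;
  select v into :a_mod separated by ', '
  from unique_vals
  where list = 'a';
quit;
%let a_mod = (&a_mod);
%put a_mod=&a_mod;
```

The subquery does not care how many lists exist, so adding a fourth or fortieth list only means adding rows to all_lists.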
I have a directory of csv files, each with names that begin with the letter m and end with a number. There are twelve files - m6 to m17.
I'd like to read them in and process them as separate data sets. I've written two macros attempting to do so. Macro1 works. Macro2 breaks. I would prefer Macro2 if I can get it to work, to avoid unnecessary bits like my creation of &rawfiles, invocation of %sysfunc, etc.
Macro 1:
%let rawcsv = C:\ALL\dat\;
%let rawfiles = m6 m7 m8 m9 m10 m11 m12 m13 m14 m15 m16 m17;
%macro macro1;
%do i = 1 %to %sysfunc(countw(&rawfiles));
%let rawfile = %scan(&rawfiles, &i);
proc import datafile="&rawcsv.&rawfile..csv"
out=&rawfile replace
dbms=csv;
guessingrows=500;
run;
%end;
%mend;
%macro1;
Macro 2:
%let rawcsv = C:\ALL\dat\;
%macro macro2(first=6, last=19);
%do i=&first. %to &last. %by 1;
proc import datafile="&&rawcsv..m&&i.csv"
out=m&i replace
dbms=csv;
guessingrows=500;
run;
%end;
%mend;
%macro2;
%macro2 is my bad imitation of this solution. It returns the following errors:
MPRINT(MACRO2): proc import datafile="C:\ALL\dat\m.6.csv" out=m.6 replace
dbms=csv;
MPRINT(MACRO2): ADLM;
MPRINT(MACRO2): guessingrows=500;
MPRINT(MACRO2): run;
ERROR: Library name is not assigned. /*repeats this error 14 times, once per file*/
Two questions:
What am I missing in %macro2?
Do you see a better solution that I am not using? The files are structured differently and not stackable, just a heads up.
From your log we can see a period is being inserted into the output dataset name. Just remove that extra period in your macro definition.
MPRINT(MACRO2): proc import datafile="C:\ALL\dat\m.6.csv" out=m.6 replace dbms=csv;
The extra & in the code is probably confusing you. When the macro processor sees two & it converts them to one and then reprocesses the string to further resolve the resulting macro variable references.
The period after a macro variable name is not required when the macro processor can tell that the name has ended. But the periods are needed in some places.
One place in your code is where it is required to make sure the macro processor knows where the name ends (the macro variable is named rawcsv, not rawcsvm). Another is where you want to place an actual period after the value of a macro variable. You will need to place two periods there, since the first will be consumed by the macro processor when it resolves the macro variable reference.
In this version of macro2 I have removed the periods after the macro variable names in the places where they are not required just to emphasize the places where the period is required.
%let rawcsv = C:\ALL\dat\;
%macro macro2(first, last);
%local i ;
%do i=&first %to &last ;
proc import dbms=csv
datafile="&rawcsv.m&i..csv"
out=m&i replace
;
guessingrows=500;
run;
%end;
%mend macro2;
%macro2(first=6, last=19)
Small typo here, you need to use an & in front of LAST not the %.
%do i=&first. %to %last. %by 1;
Should be:
%do i=&first. %to &last. %by 1;
Unless you're using a separate macro called last to determine your end of the loop. But in that case you likely wouldn't also have a parameter called last.
If you're looking for alternate options I usually recommend reading all at once using a data step or CALL EXECUTE instead of macro loops as they're infinitely easier to debug in my opinion.
https://communities.sas.com/t5/SAS-Communities-Library/How-do-I-write-a-macro-to-import-multiple-text-files-that-have/ta-p/223627
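As a rough sketch of the CALL EXECUTE alternative mentioned above, a data step can build one PROC IMPORT step per file as text and queue it for execution. This assumes the files are named m6.csv through m17.csv in the folder held in &rawcsv, as described in the question; the variable name cmd is made up for the example.

```sas
%let rawcsv = C:\ALL\dat\;

data _null_;
  length cmd $400;
  do i = 6 to 17;
    /* Build one PROC IMPORT step per file, e.g. m6.csv -> dataset m6 */
    cmd = catx(' ',
      'proc import',
      cats('datafile="', "&rawcsv", 'm', i, '.csv"'),
      cats('out=m', i),
      'dbms=csv replace; guessingrows=500; run;');
    put cmd=;            /* echo the generated code to the log */
    call execute(cmd);   /* queued and run after this data step ends */
  end;
run;
```

Because the generated code is an ordinary character value, it can be inspected in the log (or written to a dataset) before anything runs, which tends to make this easier to debug than a macro loop.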
I can't find the solution to this simple problem: I want to add a column/variable to my data set. This variable will always have the same value, stored in the macro variable &value. I am doing this inside a macro, in case that changes anything. This is the step before merging two data sets.
So far, here's what I have:
%do i=1 %to 10;
data &new_data_set;
set &new_data_set;
Nom_controle=&Nom_Controle;
Partenaire=&Partenaire;
run;
%end;
I'm trying to add to my data-set (which was previously defined in the macro as &new_data_set) a column/variable named "Nom_Controle" which always takes the value stored in the macro variable &Nom_controle (previously defined too). I'm also trying to add a second column/variable named "Partenaire" which always takes the value stored in the macro variable &Partenaire (previously defined too).
Of course, as I'm posting here, my code doesn't work. Can you help me?
EDIT: as some of you asked, here is the full macro this code comes from:
%macro presence_mouvement (data_set_detail_mouvement, data_set_mouvement);
%if %sysfunc(exist(&data_set_mouvement)) AND %sysfunc(exist(&data_set_detail_mouvement)) %then %do; *Check if my data set actually exist;
%let suffix=_2;
%let new_data_set=&data_set_detail_mouvement&suffix; *Create the name of the new data set I'm going to save the result of the next proc sql in;
proc SQL noprint; *Proc to look for errors in a previous data set and print it in the new data set;
create table &new_data_set as
insert into &new_data_set
SELECT num_mouvement
FROM &data_set_detail_mouvement
EXCEPT
SELECT num_mouvement
FROM &data_set_mouvement);
%let Nom_controle=Presence_mouvement; *Creation of a new variable;
%if %sysfunc(length(&data_set_detail_mouvement))=29 %then %do; *Creation of a second variable (value conditional to the size of a previous variable);
%let Partenaire=%sysfunc(substr(&data_set_detail_mouvement, 9, 3)); %end;
%else %if %sysfunc(length(&data_set_detail_mouvement))=30 %then %do;
%let Partenaire=%sysfunc(substr(&data_set_detail_mouvement, 9, 4)); %end;
%else %do;
%let Partenaire=%sysfunc(substr(&data_set_detail_mouvement, 9, 6)); %end;
%do i=1 %to 10;
data &new_data_set;
set &new_data_set;
Nom_controle=&Nom_Controle;
Partenaire=&Partenaire;
run;
%end;
%end;*End of the actions to do in case the two data set in parameters exist;
%else %do; *Actions to do in case the two data set in parameters don't exist;
data _null_;
file print;
put #3 @10 "At least one of the data set does not exist";
run;
%end;
*This macro is aiming at pointing error in a previous data set, print them in a new data set and add two new variables/columns to this new data set (indicating their origin). The next set is going to be to merge this new data set to another one;
%mend presence_mouvement;
%presence_mouvement (sasuser.bgpi__detail_mouvement, sasuser.bgpi__mouvement);
I also wanted to say that I tested the rest of the macro before trying to add new variable so the rest of the macro shouldn't have any problem. But who knows...
Run a single data step, setting the new variables to the values held in the macro variables. If those values are character in nature, the data step must resolve the macro variables within double quotes.
data &new_data_set;
set &new_data_set;
retain
Nom_controle "&Nom_Controle"
Partenaire "&Partenaire"
;
* also works;
* Nom_controle = "&Nom_Controle";
* Partenaire = "&Partenaire";
run;
Note: The new data set variables lengths will be set to the length of the values stored in the macro variables.
A data set is a rectangle of values. It will have a certain number of rows and columns of numeric and / or character types. The SET statement in a DATA step reads one row of the table's column values into the running program data vector -- which are essentially the variables in the DATA step. A DATA step loops automatically and halts automatically on various conditions, such as the last row of a SET table being read.
I don't know why you have a macro loop %DO I=1 %TO 10. I might speculate you think you need to do this in order to 'update' 10 rows in &new_data_set.
What is it really doing? Running the same code 10 times! Without macro, the code actually run is akin to the following:
data x; do r = 1 to 10; output; end; run; %* an original new_data_set;
data x; set x; z=1; run;
data x; set x; z=1; run;
data x; set x; z=1; run;
...
One additional concern is the code such as
%if %sysfunc(length(&data_set_detail_mouvement))=29 %then %do; *Creation of a second variable (value conditional to the size of a previous variable);
%let Partenaire=%sysfunc(substr(&data_set_detail_mouvement, 9, 3)); %end;
It appears you are extracting a 3-, 4-, or 6-character prefix of the data set name from a fully qualified libname.dataset, where libname is presumed to be sasuser (so the name starts at position 9). A safer and more robust version could be
%let syslast = &data_set_detail_mouvement;
%let libpart = %scan(&syslast,1,.);
%let datapart = %scan(&syslast,2,.);
… extract 3, 4, or 6 preface of datapart …
%* this might be helpful;
%let Partenaire = %scan(&datapart,1,_);
Nothing seems to be wrong with the part of the code that creates the variables. There might be other issues that are difficult to tell from this extract without seeing the entire code or log. For example, if Nom_controle and Partenaire are meant to be character variables because the macro variables are characters but without quotes then there will definitely be errors. You should use symbolgen and mprint options and then post the log to help solve the problem.
I'm creating a macro variable, but when I use it in my PROC REPORT the value has a space in front of it.
Select COUNT(DISTINCT USUBJID) into: N1 from DMDD where ARMN=1;
How do I fix this within the same code?
This is actually 'working as designed' for PROC SQL SELECT INTO. While all of the other answers are, in some ways, correct, this is a special case as opposed to normal macro variables, such as
%let var= 5 ;
%put [&var];
where that will return just
[5]
while this is not doing that. That is a behavior of PROC SQL SELECT INTO, and is intentional.
These two statements:
proc sql;
select name into :name from sashelp.class where name='Alfred';
select name into :shortname separated by ' ' from sashelp.class where name='Alfred';
quit;
%put `&name` `&shortname`;
are non-identical. separated by ' ' (or any other separated by) will always trim automatically unless notrim is included, and if you have 9.3 or newer, you have a new option, trimmed, which you can use if you intend to select a single variable. I think this behavior was introduced in 9.2 (the non-trimming of select into without a separated by, by default).
If you are only selecting a single value, adding separated by ' ' will have no impact on your result other than to cause the trimming to occur.
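A small sketch of the two trimming options described above, using sashelp.class so it can be run as is. The macro variable names n_padded, n_sep, and n_trim are made up for the example, and the trimmed keyword assumes SAS 9.3 or newer.

```sas
proc sql noprint;
  /* Plain INTO: the numeric result is converted to text and keeps its padding */
  select count(distinct name) into :n_padded
    from sashelp.class where sex = 'F';

  /* SEPARATED BY trims automatically, even when only one value is returned */
  select count(distinct name) into :n_sep separated by ' '
    from sashelp.class where sex = 'F';

  /* TRIMMED (SAS 9.3+) states the intent directly for a single value */
  select count(distinct name) into :n_trim trimmed
    from sashelp.class where sex = 'F';
quit;

/* The brackets make any leading blanks visible in the log */
%put [&n_padded] [&n_sep] [&n_trim];
```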
This is because any macro variable is stored as a character. If the source data is numeric then SAS uses the best12. format to convert to character, therefore the result is padded with leading blanks. I get around this by using the CATS function which strips out leading and trailing blanks. You can't use the LEFT or STRIP functions as these only work against character variables.
Select cats(COUNT(DISTINCT USUBJID)) into: N1 from DMDD where ARMN=1;
Use the %cmpres() macro to remove blanks.
http://support.sas.com/documentation/cdl/en/mcrolref/64754/HTML/default/viewer.htm#n0tvdbcgr9xc6dn14wmx9hpd6h51.htm
Select trim(put(COUNT(DISTINCT USUBJID), 16. -L)) into: N1 from DMDD where ARMN=1;
Use PUT() to format the output string with -L (left) alignment.
I am calling a macro inside a data step and assigning the macro variable to a data step variable as below.
The input for the macro goes from the input dataset which has some 500 records.
%macro test(inp_var);
%global macro_var;
--- using inp_var variable here---
%if --some condition-- %then call symput('macro_var',-- some value--);
%mend;
data output;
set input;
%test(inp_var);
new_data_step_var = symget('macro_var');
run;
But it's showing the error message pointing the variable new_data_step_var - ERROR 180-322: Statement is not valid or it is used out of proper order.
No SAS macro actually executes "inside" a data step. The macro language processor and the data step compiler are two different subsystems that share the code input stream. They hand off to one another as they "eat" chunks of SAS code. In the case of the original program, the language processor in SAS sees the "data" statement and hands off to the data step compiler. The embedded %test macro call is detected and the code input stream is handed to the macro processor FIRST! The macro processor expands all of the code and macro logic inside of the %test macro, and then the whole stream of code is handed back to the SAS data step compiler to compile.
So %test is going to run to completion BEFORE the data step even compiles.
If you are looking to make your own subroutines in data step try proc fcmp. Otherwise, just implement your conditional logic inside of the data step as was suggested.
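For illustration only, a PROC FCMP subroutine-style function might look like the sketch below. The function name flag_value, the package location work.funcs.demo, and the rule x > 10 are placeholders standing in for the question's "some condition" and "some value"; the datasets input and output and the variable inp_var come from the question.

```sas
/* Define a hypothetical function and store it in work.funcs */
proc fcmp outlib=work.funcs.demo;
  function flag_value(x);
    /* placeholder logic standing in for "some condition" */
    if x > 10 then return (1);
    return (0);
  endsub;
run;

/* Tell the data step where to look for compiled functions */
options cmplib=work.funcs;

data output;
  set input;
  /* evaluated once per row, at data step execution time */
  new_data_step_var = flag_value(inp_var);
run;
```

Unlike a macro, the function is called while the data step is executing, so it sees the value of inp_var in each row.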
Re-write it using datastep if/then, not macro if/then, and don't create a macro variable, simply use a datastep variable.
%MACRO TEST(var) ;
call missing(tempvar) ;
if --some condition-- then tempvar = --some value-- ;
%MEND ;
data output ;
set input ;
%TEST(inp_var) ;
new_var = tempvar ;
drop tempvar ;
run ;
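You cannot resolve a macro variable reference (&macro_var) in the same data step that sets it with call symput.
Such references are resolved while the step is being compiled, but call symput only runs when the step executes.
symget does read the value at execution time, but the %if inside %test is evaluated by the macro processor before the step even compiles, so it can never react to the values in each row.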
Also, it seems rather pointless, why don't you use a retain statement to save the value you want?
e.g.:
data output;
set input;
retain new_data_step_var;
if --some condition -- then new_data_step_var = --some value--;
run;
Macros that contain a proc or a data step are not executable inside of a data step. Macros are not functions or subroutines; they are text, just as if you'd typed it out (just saving some time with loops and conditionals). So the contents of your macro need to either be text that could be executed inside a data step:
%macro mymacro(numiters);
*this macro would be easier to do in an array, but it is an example;
%local t;
%do t = 1 %to &numiters.;
x&t. = mean(y&t.,z&t.);
%end;
%mend mymacro;
data output;
set input;
%mymacro(5);
run;
In that case, it is easier (and more stylistically correct) to not store a value in a macro variable. Simply contain the result in a data step variable, and if needed pass that variable's name as one of your arguments.
There are also function-style macros, that actually return a value to the data step (or in this case, return text that equates to a value). They can be used on the right side of an equal sign.
%macro xtothey(in,power);
%local t;
&in.
%do t = 1 %to &power-1;
*&in.
%end;
%mend xtothey;
data output;
set input;
y = %xtothey(x,4);
run;
That would actually be more easily done in PROC FCMP (which compiles functions and subroutines), but sometimes macros are better for this (or you might not know FCMP well).
Finally, some macros require procs or data steps of their own. In those cases, unless you're using some FCMP elements such as DOSUBL, you will need to store the value somewhere, whether it is in a dataset or a macro. In those cases, you must run the macro prior to the datastep where you want the value - but you only get one (or a finite number of) return values. You don't get one per row unless you go to some extreme lengths, which usually can be done better without using macro variables. I would argue the below is bad form, as you almost always can do it better without using macro variables - but this is how you would do it if you needed to. FCMP with DOSUBL would probably be the superior choice.
%macro findmode(dset,var,outvar);
proc means data=&dset;
var &var.;
output out=_tempset mode(&var.)=&var._mode;
run;
data _null_;
set _tempset;
call symputx("&outvar.",&var._mode);
run;
%mend findmode;
%findmode(sashelp.class,weight,wtmode);
data output;
set input;
mode=&wtmode;
run;
I need to repeat this code many times. It is part of system-tester.
testFvB=@(fBE,fMCS,CI)
{
d='FV';
dF=strcat('testing/systemTestFiles/D_', fBE, '_', fMCS, '_', d, '.txt');
bepo(fBE,CI,fMCS,d,dF,oF);
d='B';
oF=strcat('testing/systemTestFiles/O_', fBE, '_', fMCS, '_', d, '.txt');
bepo(fBE,CI,fMCS,d,dF,oF);
};
but
Error: File: systemTester.m Line: 3 Column: 6
The expression to the left of the equals sign is not a valid target for an
assignment.
I don't know, but it looks like Matlab does not accept anonymous functions of this size. So how can I use anonymous functions to encapsulate larger pieces of code, not just things like doIt=@(x) x+1? Is creating a new file the only way to encapsulate this?
[Update] not working, possible to make this into an execution?
test=@(fBE,fMCS)for d=1:2
for CI=0:0.25:1
if d==1
d='FV';
else
d='B';
end
oF=strcat('testing/systemTestFiles/O_', fBE, '_', fMCS, '_', d, '.txt');
bepo(fBE,CI,fMCS,d,dF,oF);
end
end;
fBE='TestCase1 BE Evendist v2.txt';
fMCS='TestCase1 MCS.txt';
test(fBE,fMCS)
Anonymous functions can only contain a single executable statement.
So in your case, just create a regular M-file function.
If you are interested, there is a series of articles on Loren Shure's blog introducing functional programming style, using anonymous functions to do non-simple tasks.