SAS macro loop to read csv files with proc import

I have a directory of csv files, each with names that begin with the letter m and end with a number. There are twelve files - m6 to m17.
I'd like to read them in and process them as separate data sets. I've written two macros attempting to do so. Macro1 works. Macro2 breaks. I would prefer Macro2 if I can get it to work, to avoid unnecessary bits like my creation of %rawfiles, invocation of %sysfunc, etc.
Macro 1:
%let rawcsv = C:\ALL\dat\;
%let rawfiles = m6 m7 m8 m9 m10 m11 m12 m13 m14 m15 m16 m17;
%macro macro1;
%do i = 1 %to %sysfunc(countw(&rawfiles));
%let rawfile = %scan(&rawfiles, &i);
proc import datafile="&&rawcsv.&&rawfile.csv"
out=&rawfile replace
dbms=csv;
guessingrows=500;
run;
%end;
%mend;
%macro1;
Macro 2:
%let rawcsv = C:\ALL\dat\;
%macro macro2(first=6, last=19);
%do i=&first. %to &last. %by 1;
proc import datafile="&&rawcsv..m&&i.csv"
out=m&i replace
dbms=csv;
guessingrows=500;
run;
%end;
%mend;
%macro2;
%macro2 is my bad imitation of this solution. It returns the following errors:
MPRINT(MACRO2): proc import datafile="C:\ALL\dat\m.6.csv" out=m.6 replace
dbms=csv;
MPRINT(MACRO2): ADLM;
MPRINT(MACRO2): guessingrows=500;
MPRINT(MACRO2): run;
ERROR: Library name is not assigned. /*repeats this error 14 times, once per file*/
Two questions:
What am I missing in %macro2?
Do you see a better solution that I am not using? The files are structured differently and not stackable, just a heads up.

From your log we can see a period is being inserted into the output dataset name. Just remove that extra period in your macro definition.
MPRINT(MACRO2): proc import datafile="C:\ALL\dat\m.6.csv" out=m.6 replace dbms=csv;
The extra & in the code is probably confusing you. When the macro processor sees two & it converts them to one and then reprocesses the string to further resolve the resulting macro variable references.
The period after a macro variable name is not required when the macro processor can tell that the name has ended. But the periods are needed in some places.
One place in your code is where the period is required to make sure the macro processor knows where the name ends (the macro variable is named rawcsv, not rawcsvm). Another is where you want to place an actual period after the value of a macro variable. You will need to place two periods there, since the first will be used by the macro processor when it evaluates the macro variable value.
In this version of macro2 I have removed the periods after the macro variable names in the places where they are not required just to emphasize the places where the period is required.
%let rawcsv = C:\ALL\dat\;
%macro macro2(first, last);
%local i ;
%do i=&first %to &last ;
proc import dbms=csv
datafile="&rawcsv.m&i..csv"
out=m&i replace
;
guessingrows=500;
run;
%end;
%mend macro2;
%macro2(first=6, last=19)
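The period rules described above can be checked in isolation with %put before running any import. This is a minimal sketch that reuses the rawcsv value from the question and sets i by hand; it only writes the resolved text to the log.

```sas
%let rawcsv = C:\ALL\dat\;
%let i = 6;

/* One period ends the macro variable name and is consumed. */
%put &rawcsv.m&i;         /* C:\ALL\dat\m6     */

/* Two periods: the first ends the name, the second stays in the text. */
%put &rawcsv.m&i..csv;    /* C:\ALL\dat\m6.csv */

/* With a single period the dot is consumed and the extension loses it. */
%put &rawcsv.m&i.csv;     /* C:\ALL\dat\m6csv  */
```

Checking the resolved string this way makes it easy to spot a stray or missing period before it reaches PROC IMPORT.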

Small typo here, you need to use an & in front of LAST not the %.
%do i=&first. %to %last. %by 1;
Should be:
%do i=&first. %to &last. %by 1;
Unless you're using a separate macro called last to determine your end of the loop. But in that case you likely wouldn't also have a parameter called last.
If you're looking for alternate options I usually recommend reading all at once using a data step or CALL EXECUTE instead of macro loops as they're infinitely easier to debug in my opinion.
https://communities.sas.com/t5/SAS-Communities-Library/How-do-I-write-a-macro-to-import-multiple-text-files-that-have/ta-p/223627
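A rough sketch of the CALL EXECUTE route, not taken from the linked article. It assumes the files are named m6.csv through m17.csv and live in the folder held in rawcsv, as in the question; the data step variables name and code are hypothetical.

```sas
%let rawcsv = C:\ALL\dat\;

data _null_;
  length name $8 code $400;
  do i = 6 to 17;
    name = cats('m', i);                        /* m6, m7, ..., m17 */
    /* Single quotes keep &rawcsv unresolved until CALL EXECUTE runs. */
    code = catx(' ',
      'proc import',
      cats('datafile="&rawcsv.', name, '.csv"'),
      cats('out=', name),
      'replace dbms=csv; guessingrows=500; run;');
    put code=;                                  /* show the generated step in the log */
    call execute(code);
  end;
run;
```

Because each generated step is written to the log by the PUT statement, a wrong file name shows up as plain text rather than as a macro resolution puzzle.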

Related

Add a variable (column) in data set (SAS)

I can't find the solution for this simple problem: I want to add a column/variable to my data set. This variable will always have the same value, stored in the macro variable &value. I am inside a macro, in case that changes anything. This is the step before merging two data sets.
So far, here's what I have:
%do i=1 %to 10;
data &new_data_set;
set &new_data_set;
Nom_controle=&Nom_Controle;
Partenaire=&Partenaire;
run;
%end;
I'm trying to add to my data-set (which was previously defined in the macro as &new_data_set) a column/variable named "Nom_Controle" which always takes the value stored in the macro variable &Nom_controle (previously defined too). I'm also trying to add a second column/variable named "Partenaire" which always takes the value stored in the macro variable &Partenaire (previously defined too).
Of course, as I'm posting here, my code doesn't work. Can you help me?
EDIT: since some of you asked for it in order to help me, here is the full macro this code comes from:
%macro presence_mouvement (data_set_detail_mouvement, data_set_mouvement);
%if %sysfunc(exist(&data_set_mouvement)) AND %sysfunc(exist(&data_set_detail_mouvement)) %then %do; *Check if my data set actually exist;
%let suffix=_2;
%let new_data_set=&data_set_detail_mouvement&suffix; *Create the name of the new data set I'm going to save the result of the next proc sql in;
proc SQL noprint; *Proc to look for errors in a previous data set and print it in the new data set;
create table &new_data_set as
insert into &new_data_set
SELECT num_mouvement
FROM &data_set_detail_mouvement
EXCEPT
SELECT num_mouvement
FROM &data_set_mouvement);
%let Nom_controle=Presence_mouvement; *Creation of a new variable;
%if %sysfunc(length(&data_set_detail_mouvement))=29 %then %do; *Creation of a second variable (value conditional to the size of a previous variable);
%let Partenaire=%sysfunc(substr(&data_set_detail_mouvement, 9, 3)); %end;
%else %if %sysfunc(length(&data_set_detail_mouvement))=30 %then %do;
%let Partenaire=%sysfunc(substr(&data_set_detail_mouvement, 9, 4)); %end;
%else %do;
%let Partenaire=%sysfunc(substr(&data_set_detail_mouvement, 9, 6)); %end;
%do i=1 %to 10;
data &new_data_set;
set &new_data_set;
Nom_controle=&Nom_Controle;
Partenaire=&Partenaire;
run;
%end;
%end;*End of the actions to do in case the two data set in parameters exist;
%else %do; *Actions to do in case the two data set in parameters don't exist;
data _null_;
file print;
put #3 #10 "At least one of the data set does not exist";
run;
%end;
*This macro is aiming at pointing error in a previous data set, print them in a new data set and add two new variables/columns to this new data set (indicating their origin). The next set is going to be to merge this new data set to another one;
%mend presence_mouvement;
%presence_mouvement (sasuser.bgpi__detail_mouvement, sasuser.bgpi__mouvement);
I also wanted to say that I tested the rest of the macro before trying to add new variable so the rest of the macro shouldn't have any problem. But who knows...
Run a single data step, setting the new variables to the values set up in the macro variables. If the values are character in nature, the data step must resolve those macro variables within double quotes.
data &new_data_set;
set &new_data_set;
retain
Nom_controle "&Nom_Controle"
Partenaire "&Partenaire"
;
* also works;
* Nom_controle = "&Nom_Controle";
* Partenaire = "&Partenaire";
run;
Note: The lengths of the new data set variables will be set to the lengths of the values stored in the macro variables.
A data set is a rectangle of values. It will have a certain number of rows and columns of numeric and / or character types. The SET statement in a DATA step reads one row of the table's column values into the running program data vector -- which are essentially the variables in the DATA step. A DATA step loops automatically and halts automatically on various conditions, such as the last row of a SET table being read.
I don't know why you have a macro loop %DO I=1 %TO 10. I might speculate you think you need to do this in order to 'update' 10 rows in &new_data_set.
What is it really doing ? Running the same code 10 times! Without macro the actual code run is akin to the following
data x; do r = 1 to 10; output; end; run; %* an original new_data_set;
data x; set x; z=1; run;
data x; set x; z=1; run;
data x; set x; z=1; run;
...
One additional concern is the code such as
%if %sysfunc(length(&data_set_detail_mouvement))=29 %then %do; *Creation of a second variable (value conditional to the size of a previous variable);
%let Partenaire=%sysfunc(substr(&data_set_detail_mouvement, 9, 3)); %end;
It appears you are grabbing the first 3, 4, or 6 letters of the data set name from a fully qualified libname.dataset, where libname is presumed to be sasuser. A safer and more robust version could be
%let syslast = &data_set_detail_mouvement;
%let libpart = %scan(&syslast,1,.);
%let datapart = %scan(&syslast,2,.);
… extract 3, 4, or 6 preface of datapart …
%* this might be helpful;
%let Partenaire = %scan(&datapart,1,_);
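As a quick check of the %scan idea, here is a sketch using the data set name from the question's macro call. It scans the name directly and assumes the partner code is the part of the member name before the first underscore, which is a guess about the naming scheme.

```sas
%let data_set_detail_mouvement = sasuser.bgpi__detail_mouvement;

/* Second dot-delimited piece is the member name. */
%let datapart   = %scan(&data_set_detail_mouvement,2,.);
/* First underscore-delimited piece of the member name. */
%let Partenaire = %scan(&datapart,1,_);

%put datapart=&datapart Partenaire=&Partenaire;
/* datapart=bgpi__detail_mouvement Partenaire=bgpi */
```

Unlike the substr approach, this does not depend on the length of the libref or on the total length of the name.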
Nothing seems to be wrong with the part of the code that creates the variables. There might be other issues that are difficult to tell from this extract without seeing the entire code or log. For example, if Nom_controle and Partenaire are meant to be character variables because the macro variables are characters but without quotes then there will definitely be errors. You should use symbolgen and mprint options and then post the log to help solve the problem.
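A small sketch of turning those options on around a macro call. The macro demo and its parameter are hypothetical stand-ins; with MPRINT the log shows the generated data step, and with SYMBOLGEN it shows what each macro variable resolved to.

```sas
options mprint symbolgen;      /* echo generated code and macro variable resolution */

%macro demo(value);
  data _null_;
    x = "&value";              /* double quotes so the macro variable resolves */
    put x=;
  run;
%mend demo;

%demo(Presence_mouvement)

options nomprint nosymbolgen;  /* switch the extra logging off again */
```

Removing the double quotes around &value in this sketch reproduces the kind of problem described above: the value would be read as a variable name instead of a character string.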

Passing SAS dataset column as macro parameter

I have a SAS dataset with values
yyyymm
201605
201606
201607
201608
201609
I am trying to find a way to pass these values one at a time to macro such that
do while dataset still has value
%macro passdata(yyyymm);
end
How can I do this in SAS. Can someone please help with a sample/code snippet.
As mentioned by the prior comment, a way to pass parameters is through call execute routine. Note that this must be done in datastep environment. The lines are read from the set you input.
You can pass multiple variables. Just add more variables with '||' separators. Note that the variables may contain a lot of whitespace, so do comparisons with care.
Here is a small sample code. Tested.
data start_data;
input date_var ;
datalines;
201605
201606
201607
201608
201609
;
run;
%macro Do_stuff(input_var);
%put 'Line generates value ' &input_var;
%mend do_stuff;
data _null_;
set start_data;
call execute('%do_stuff('||date_var||')' );
run;
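The whitespace caveat can be sidestepped by building the call with CATS, which strips blanks and converts the numeric value without a conversion note. The sketch below repeats the sample data and macro so it runs on its own; wrapping the macro name in %nrstr is an optional addition that defers the macro's execution until after the data step finishes.

```sas
data start_data;
  input date_var;
  datalines;
201605
201606
201607
;
run;

%macro do_stuff(input_var);
  %put Line generates value &input_var;
%mend do_stuff;

data _null_;
  set start_data;
  /* cats() removes leading and trailing blanks from each piece. */
  call execute(cats('%nrstr(%do_stuff)(', date_var, ')'));
run;
```

Deferring execution matters mainly when the called macro creates macro variables from data and then tests them, since otherwise the macro logic would run before the generated steps do.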
Try this example and modify it to meet your needs. From the "source" dataset we can use call symput() to assign a macro variable to each observation (differentiated by the SAS automatic data step variable _n_, so My_token1, My_token2, etc.). Once you have a set of macro variables defined, just loop through them! This program will print all the individual records from source to the SAS log:
data source;
do var=1000 to 1010;
output;
end;
run;
data _null_;
set source;
call symput(compress("My_token"||_n_),var);
run;
%put &my_token1 &my_token4;
%Macro neat;
%do this=1 %to 11;
*Check the log.;
%put &&My_token&this;
%end;
%mend;
%neat;
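A variation on the same idea that avoids hard-coding the loop bound. It is a sketch, with n_tokens and neat2 as made-up names; CALL SYMPUTX is used because it strips blanks from the stored values.

```sas
data source;
  do var = 1000 to 1010;
    output;
  end;
run;

data _null_;
  set source end=last;
  call symputx(cats('My_token', _n_), var);     /* My_token1, My_token2, ... */
  if last then call symputx('n_tokens', _n_);   /* number of rows read */
run;

%macro neat2;
  %local this;
  %do this = 1 %to &n_tokens;
    %put &&My_token&this;
  %end;
%mend neat2;

%neat2
```

With the count stored alongside the values, the loop keeps working when the number of rows in source changes.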

coalesce does not work in sas macro?

I was trying to generate a list of variables, whose names are stored in a macro &varsnew. The value of the 1st (2nd, 3rd, etc.) of these variables equals (a) the 1st (2nd, 3rd, etc.) of another list of variables, whose names are stored in another macro &varsold, if the variable in &varsold is not missing, or (b) 0, if the variable in &varsold is missing.
The following code works fine, where I use if-then clause to define variables in &varsnew.
%macro coal;
data DS;
set DS;
%do i=1 %to %sysfunc(countw(&varsold.));
if %scan(&varsold.,&i.)<=.z then %scan(&varsnew.,&i.)=0 ;
else %scan(&varsnew.,&i.)=%scan(&varsold.,&i.);
%end;
run;
%mend;
%coal;
However, if I use the coalesce function to define variables in &varsnew, as in the following, then the code does not work. I am puzzled.
%macro coal;
data DS;
set DS;
%do i=1 %to %sysfunc(countw(&varsold.));
%scan(&varsnew.,&i.)= %sysfunc(coalesce(%scan(&varsold.,&i.),0));
%end;
run;
%mend;
%coal;
Your two loops are doing two different things. The first one is checking if a data set variable is missing and the second is checking if the lists of variable names have the same number of entries. Turn on MPRINT option so you can see what SAS code your macro is generating.
The first one will generate code like:
if OLD1<= .Z then NEW1=0; ELSE NEW1=OLD1;
The second one will generate code like:
NEW1=OLD1;
NEW2=OLD2;
NEW3=0;
You probably want this instead.
%do i=1 %to %sysfunc(countw(&varsold));
%scan(&varsnew,&i)=coalesce(%scan(&varsold,&i),0);
%end;
Or just
%do i=1 %to %sysfunc(countw(&varsold));
%scan(&varsnew,&i)=sum(%scan(&varsold,&i),0);
%end;
Or even better forget the macro logic and just write the whole thing using simple SAS statements.
array _new &varsnew ;
array _old &varsold ;
do i=1 to dim(_new);
_new(i)=sum(_old(i),0);
end;
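For reference, the array statements above fit into a complete data step along these lines. The variable lists and the sample rows are invented for illustration, and coalesce is used here in place of sum; both return 0 when the old value is missing.

```sas
%let varsold = old1 old2 old3;
%let varsnew = new1 new2 new3;

/* Hypothetical input with some missing values. */
data DS;
  input old1 old2 old3;
  datalines;
1 . 3
. 5 .
;
run;

data DS;
  set DS;
  array _old &varsold;
  array _new &varsnew;
  do i = 1 to dim(_new);
    _new(i) = coalesce(_old(i), 0);   /* 0 when the old value is missing */
  end;
  drop i;                             /* the loop counter is not needed in the output */
run;
```

Here coalesce is an ordinary data step function applied to each row, which is the difference from calling it through %sysfunc at macro time.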

resolve macro variables for saving/naming a dataset in SAS

I have problem saving a dataset using macro variables to a desired directory.
Basically, I want to save the dataset "_est" to library "sret" according to the values of &var and &age. I wrote the following code:
%let var=k;
%let age=2;
...
...
data sret.est_&var&age._b3;
set _est;
run;
What I want is a dataset named as "est_k2_b3.sas7bdat" in "sret". But what the code gives me is a dataset "est_k2.sas7bdat" saved in the folder I want and another dataset "_b3" in the working library. Both datasets are identical. I'm quite puzzled how to solve this.
As itzy pointed out you have a space after "2" that splits your dataset name in two.
I can replicate the issue only by defining the macro variable age with a call symput:
data _null_;
age='2 ';
call symput('age',age);
run;
If this is the case, you can solve it by removing the space in the data step with strip(), by using call symputx() (which strips leading and trailing blanks and also accepts numeric values), or by re-declaring your variable after the data step with a %let, which automatically removes leading and trailing spaces:
%let age= &age.;
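The difference between the two routines can be seen side by side. In this sketch age_padded and age_clean are hypothetical names, and the brackets in the %put only make any trailing blank visible in the log.

```sas
data _null_;
  age = '2 ';
  call symput('age_padded', age);   /* stores the value with its trailing blank */
  call symputx('age_clean', age);   /* strips leading and trailing blanks */
run;

/* The first pair of brackets should show a trailing blank, the second should not. */
%put [&age_padded] [&age_clean];

/* With the clean value the name resolves to a single token: sret.est_k2_b3 */
%let var = k;
%put sret.est_&var&age_clean._b3;
```

A blank inside the resolved name is what makes the DATA statement see two data set names instead of one.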
Had a very similar problem. Somehow a space is added until you use strip(). Here's the example below.
data test;
input numdays;
datalines;
31
;
%macro monthly(months);
%let count=%sysfunc(countw(&months.));
%do i=1 %to &count.;
%let value=%qscan(&months.,&i,%str(,));
%let month=%sysfunc(strip(&value.));
%put &value.;
%put &month.;
data value_&value.;
set test;
run;
data month_&month.;
set test;
run;
%end;
%mend;
%monthly(%str(oct,jan));

SAS macro do loop-- import multiple flat files

I have a list of 17 flat files that I'm trying to import into different data sets. All of the files have the same data step, so I'm trying to write a do while loop to import all the files.
I've been trying to adapt some code from here without success:
http://www.sas.com/offices/europe/uk/support/sas-hints-tips/tips-enterprise-csv.html
http://support.sas.com/documentation/cdl/en/mcrolref/61885/HTML/default/viewer.htm#a000543785.htm
I'm getting an error that says the %do statement is not valid in open code. Here is my code:
% let path1 = 'c:\path1'
% let path2 = 'c:\path2'
...
% let pathN = 'c:\pathN'
%let n=1;
%do %while (&n <= 17);
%let pathin = &path&n;
data retention&n;
infile &pathin;
<data step-->
run;
%let n=%eval(&n+1);
%end;
I've tested the data step outside of the do-while loop and it works fine for 1 file at a time using the %let pathin = &path&n code. The code still writes the datafile for the 1st data set; but, I need to be able to loop through all the files and can't figure out how. Sorry if this is a novice question; I'm just learning SAS.
Thanks,
-Alex
Welcome to SAS programming! The error message you got is a clue. "Open code" refers to statements that are executed directly by the SAS system. A %do statement is part of the SAS Macro Language, not "normal" SAS. A %let statement can be executed in open code and is used to create a macro variable (distinct from a compiled macro).
Compiled SAS macros are created by code that appears between the %macro and %mend statements. For example, using your code:
%macro run_me;
%let n=1;
%do %while (&n <= 17);
%let pathin = &&path&n; /* the double ampersand delays resolution, so this becomes the value of path1, path2, etc. */
data retention&n;
infile &pathin;
<data step-->
run;
%let n=%eval(&n+1);
%end;
%mend;
But all that does is define/compile the macro. To execute it, you must issue the statement %run_me;. Note that the name run_me was just a name I made up.
For more info, please consult the SAS Macro Reference, especially the introductory section.
To convert your program to a macro, turn the macro variables declared by %LET statements into macro parameters:
%macro readfile(n, pathin);
data retention&n;
infile &pathin;
<data step-->
run;
%mend;
Then use a data step to call your macro repeatedly.
Here the data is included in a CARDS statement, but it could also be read from a table via a SET statement.
The macro call is performed via the call execute routine.
data _null_;
length path $200 stmt $250;
input path;
/* builds a call such as %readfile(1,"c:\file1.txt"); the path is quoted for INFILE */
stmt = cats('%readfile(', _N_, ',', quote(strip(path)), ')');
call execute(stmt);
cards;
c:\file1.txt
c:\file2.txt
c:\file3.txt
;
run;