I need to import multiple Excel files and simply stack them. I wrote a macro to do that, but at the end of the code, when it gets to the stacking step, multiple error messages appear saying:
ERROR: Variable XX has been defined as both character and numeric.
I tried to modify format in the Macro, but it doesn't work.
(I tried "format F1=DDMMYY10, F2=12.8, F7=$12")
I also need to keep only F1-F7, but the log says "The DROP and KEEP statements are not supported in procedure steps in this release of the SAS System. Therefore, these statements are ignored."
Will you please take a look at my code and let me know how I should modify it?
Here's the code:
%LET TOTAL=4;
%LET PATH=H:\test\;
%LET INFILE1=a (1).xlsx;
%LET INFILE2=a (2).xlsx;
%LET INFILE3=a (3).xlsx;
%LET INFILE4=a (4).xlsx;
%MACRO EXCELREAD(I,INFILE);
PROC IMPORT OUT=TEST_&i
DATAFILE="&PATH.&INFILE"
DBMS=EXCEL REPLACE;
GETNAMES=NO;
MIXED=YES;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
/*KEEP F1 F2 F3 F4 F5 F6;
FORMAT F1=DDMMYY10, F2=12.8, F7=$12;*/
RUN;
%MEND;
%EXCELREAD(1,&INFILE1);
%EXCELREAD(2,&INFILE2);
%EXCELREAD(3,&INFILE3);
%EXCELREAD(4,&INFILE4);
data pilot_bond;
set test_1
test_2(firstobs=2)
test_3(firstobs=2)
test_4(firstobs=2)
run;
Here's the error message:
3712 data pilot_bond;
3713 set test_1
3714 test_2(firstobs=2)
3715 test_3(firstobs=2)
3716 test_4(firstobs=2)
ERROR: Variable F1 has been defined as both character and numeric.
ERROR: Variable F2 has been defined as both character and numeric.
ERROR: Variable F6 has been defined as both character and numeric.
ERROR: Variable F7 has been defined as both character and numeric.
ERROR: Variable F6 has been defined as both character and numeric.
ERROR: Variable F6 has been defined as both character and numeric.
3730 run;
Any help will be greatly appreciated!
Suggestions for further simplifying the code are also very welcome.
I am not sure what your worksheet looks like, but judging by the error, column F1 is stored as a character value in Excel in at least one of the files. If you open the file in Excel and see a green marker in an F1 data cell, that cell holds a character value. In that case you need to:
1. Export the data to .xlsx files using the "Import" wizard first, and
2. Then run the above program.
This should solve the issue you have.
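Another way to get past the type conflict, sketched here under the assumption that it is acceptable to convert every column to character before stacking: VVALUE returns the formatted value of a variable as text whether the variable is numeric or character, so each imported data set can be brought to the same types first. The macro name to_char, the _c suffix, and the length of 40 are hypothetical choices, not part of the original code. The sketch also shows KEEP= used as a data set option, which is allowed in places where the KEEP statement is not.

```sas
/* Sketch only: convert F1-F7 to character so the data sets can be stacked. */
%macro to_char(ds);
  %local k;
  data &ds._c;
    /* KEEP= is applied before RENAME=, so it uses the original names */
    set &ds(keep=F1-F7 rename=(F1-F7=_F1-_F7));
    length F1-F7 $40;               /* assumed length, adjust as needed */
    %do k = 1 %to 7;
      F&k = strip(vvalue(_F&k));    /* formatted value as text, any type */
    %end;
    drop _F1-_F7;
  run;
%mend to_char;

%to_char(test_1)
%to_char(test_2)
%to_char(test_3)
%to_char(test_4)

data pilot_bond;
  set test_1_c
      test_2_c(firstobs=2)
      test_3_c(firstobs=2)
      test_4_c(firstobs=2);
run;
```

After stacking, the columns that should be numeric or dates would still need to be converted back with INPUT and a suitable informat.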
Related
I have a directory of csv files, each with names that begin with the letter m and end with a number. There are twelve files - m6 to m17.
I'd like to read them in and process them as separate data sets. I've written two macros attempting to do so. Macro1 works. Macro2 breaks. I would prefer Macro2 if I can get it to work, to avoid unnecessary bits like my creation of %rawfiles, invocation of %sysfunc, etc.
Macro 1:
%let rawcsv = C:\ALL\dat\;
%let rawfiles = m6 m7 m8 m9 m10 m11 m12 m13 m14 m15 m16 m17;
%macro macro1;
%do i = 1 %to %sysfunc(countw(&rawfiles));
%let rawfile = %scan(&rawfiles, &i);
proc import datafile="&&rawcsv.&&rawfile..csv"
out=&rawfile replace
dbms=csv;
guessingrows=500;
run;
%end;
%mend;
%macro1;
Macro 2:
%let rawcsv = C:\ALL\dat\;
%macro macro2(first=6, last=19);
%do i=&first. %to &last. %by 1;
proc import datafile="&&rawcsv..m&&i.csv"
out=m&i replace
dbms=csv;
guessingrows=500;
run;
%end;
%mend;
%macro2;
%macro2 is my bad imitation of this solution. It returns the following errors:
MPRINT(MACRO2): proc import datafile="C:\ALL\dat\m.6.csv" out=m.6 replace
dbms=csv;
MPRINT(MACRO2): ADLM;
MPRINT(MACRO2): guessingrows=500;
MPRINT(MACRO2): run;
ERROR: Library name is not assigned. /*repeats this error 14 times, once per file*/
Two questions:
What am I missing in %macro2?
Do you see a better solution that I am not using? The files are structured differently and not stackable, just a heads up.
From your log we can see a period is being inserted into the output dataset name. Just remove that extra period in your macro definition.
MPRINT(MACRO2): proc import datafile="C:\ALL\dat\m.6.csv" out=m.6 replace dbms=csv;
The extra & in the code is probably confusing you. When the macro processor sees two & it converts them to one and then reprocesses the string to further resolve the resulting macro variable references.
The period after a macro variable name is not required when the macro processor can tell that the name has ended. But the periods are needed in some places.
One place in your code is where the period is required to make sure the macro processor knows where the name ends (the macro variable is named rawcsv, not rawcsvm). Another is where you want to place an actual period after the value of a macro variable. You will need to place two periods there, since the first will be consumed by the macro processor when it evaluates the macro variable reference.
In this version of macro2 I have removed the periods after the macro variable names in the places where they are not required just to emphasize the places where the period is required.
%let rawcsv = C:\ALL\dat\;
%macro macro2(first, last);
%local i ;
%do i=&first %to &last ;
proc import dbms=csv
datafile="&rawcsv.m&i..csv"
out=m&i replace
;
guessingrows=500;
run;
%end;
%mend macro2;
%macro2(first=6, last=19)
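The resolution rules described above can be checked quickly with %PUT before running any import. This is a small illustrative sketch; the value assigned to i is just an example.

```sas
%let rawcsv = C:\ALL\dat\;
%let i = 6;

/* One period ends the name rawcsv; two periods after &i leave one literal period. */
%put &rawcsv.m&i..csv;     /* writes C:\ALL\dat\m6.csv */

/* With a single period after &i, the period is consumed as a delimiter. */
%put &rawcsv.m&i.csv;      /* writes C:\ALL\dat\m6csv */

/* A period placed before the reference is plain text and stays in the result. */
%put out=m.&i;             /* writes out=m.6 */
```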
Small typo here, you need to use an & in front of LAST not the %.
%do i=&first. %to %last. %by 1;
Should be:
%do i=&first. %to &last. %by 1;
Unless you're using a separate macro called last to determine your end of the loop. But in that case you likely wouldn't also have a parameter called last.
If you're looking for alternate options I usually recommend reading all at once using a data step or CALL EXECUTE instead of macro loops as they're infinitely easier to debug in my opinion.
https://communities.sas.com/t5/SAS-Communities-Library/How-do-I-write-a-macro-to-import-multiple-text-files-that-have/ta-p/223627
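As a rough sketch of the CALL EXECUTE idea mentioned above (not the code from the linked article), a data step can build one PROC IMPORT step per file as a string and queue it for execution. The loop bounds 6 to 17 come from the file names in the question; the variable name cmd and its length are assumptions.

```sas
%let rawcsv = C:\ALL\dat\;

data _null_;
  length cmd $400;
  do i = 6 to 17;
    /* Build the full PROC IMPORT step as text */
    cmd = 'proc import datafile="' || "&rawcsv" || 'm' || strip(put(i, 8.)) || '.csv"'
       || ' out=m' || strip(put(i, 8.)) || ' replace dbms=csv;'
       || ' guessingrows=500; run;';
    put cmd=;            /* write the generated code to the log for checking */
    call execute(cmd);   /* queue it to run after this data step finishes */
  end;
run;
```

Because the generated code is an ordinary character value, it can be inspected in the log (or written to a data set) before anything runs, which is what makes this style easier to debug than a macro loop.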
I tried to import text file in sas with the following code
PROC IMPORT DATAFILE= '/home/u44418748/MSc Biostatistics with SAS/Datasets/school.txt'
OUT= outdata
DBMS=dlm
REPLACE;
delimiter='09'x;
GETNAMES=YES;
RUN;
But the import is unsuccessful, I think because the text file has a period for missing data.
This is what I got in the SAS log:
NOTE: Invalid data for class_size in line 455 16-17.
455 CHAR 454.34.8.32.17.NA.23.125.12.188 31
ZONE 3330330303303304403323330332333
NUMR 454934989329179E1923E125912E188
sl_no=454 school=34 iq=8 test=32 ses=17 class_size=. meanses=23.125 meaniq=12.188 _ERROR_=1 _N_=454
How can I load this text file in SAS?
Did you create that text file from R? That package has a nasty habit of putting text values of NA for numeric values into text files. If you are the one that created the file, then you might check whether the system you are using has a way to not put the NA into the file to begin with. In a delimited file, missing values are normally represented by having nothing for the field, so the delimiters are right next to each other. For SAS you can also use a period to represent a missing value.
I wouldn't bother to use PROC IMPORT to read a delimited file. Just write a data step to read the file. Since it looks like your file only has eight variables and they are all numeric, the code is trivial.
data outdata;
infile '/home/u44418748/MSc Biostatistics with SAS/Datasets/school.txt'
dsd dlm='09'x firstobs=2 truncover
;
input sl_no school iq test ses class_size meanses meaniq ;
run;
One way to deal with the NA text in the input file is to replace it with periods. Since all of the fields are numeric you can do that easily, because you don't have to worry about replacing real text that just happens to have the letter A after the letter N. Here is a trick using the _INFILE_ automatic variable that you can use to make the change on the fly while reading the file.
data outdata;
infile '/home/u44418748/MSc Biostatistics with SAS/Datasets/school.txt'
dsd dlm='09'x firstobs=2 truncover
;
input @;
_infile_=tranwrd(_infile_,'NA','.');
input sl_no school iq test ses class_size meanses meaniq ;
run;
You are getting the NOTE: because of the NA value in the class_size field.
What you presume are periods (.) are actually tabs (hex code 09). Look under the period to confirm, the ZONE is 0 and NUMR 9. 09 is the tab character.
PROC IMPORT guesses each field's data type by looking at the first few rows of a text file (the default is 20 rows). Your file contained only numbers in the first 20 rows, so the procedure guessed that class_size was numeric.
There are a few courses of action:
1. Do nothing. Read your log NOTEs and know that wherever NA occurred you will have a missing value in your data set.
2. Read the file as-is, but add a GUESSINGROWS=MAX; statement to your import code. The mixed-type column class_size will then be guessed as character, and you might need another step to convert the values to numeric (a step in which the non-numeric values get converted to missing values).
3. Edit the text file, replacing every NA with a period (.). The dot marks a missing value during IMPORT, so the IMPORT step will have no incongruities to write to the log.
Converting a field
PROC IMPORT DATAFILE= '/home/u44418748/MSc Biostatistics with SAS/Datasets/school.txt'
DBMS=dlm REPLACE OUT=work.outdata;
delimiter='09'x;
GETNAMES=YES;
GUESSINGROWS=MAX;
RUN;
data want;
set outdata (rename=(class_size=class_size_char));
class_size = input (class_size_char, ?? best12.);
drop class_size_char;
run;
I have a text file, "Macro definition", which contains two SAS macro definitions. I would like to include them and apply them to the HTWT.csv data set. This data set has 20 observations and 6 variables: ID, Gender, Age, Height, Weight, Year. All are numeric except Gender. I have the code below to include the macros from the text file and apply them to the imported data. Running this code gives the error messages shown below.
outcsvv is the name of the HTWT data set imported into SAS.
%include "C:\Users\komal\Desktop\Advanced SAS\Macro definition.txt";
%contents_of(outcsvv)
%print_data(outcsvv)
Warning:Apparent Invocation of macro "contents_of" not resolved
Error: Unable to complete processing of INCLUDE. Expected a filename or fileref
Expected a statement keyword: found "("
The second error I am getting is probably due to the macro definition(s) from the text file which are as follows.
%macro contents_of(name);
proc contents data=&name;
run;
%mend;
%macro print_data(name);
proc print data=&name;
run;
%mend;
Please let me know your advice on how to solve it. Thank you for your time.
You can set up your own macro autocall library. Just split your file and save one macro per file, with each file named after the macro it contains (for example contents_of.sas).
Or you can add this code to the beginning of your program:
options insert=(sasautos="C:\Users\komal\Desktop\Advanced SAS") ;
SAS will then search this directory for the macros.
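Put together, the autocall approach might look like the sketch below. The two .sas file names are assumptions that follow the autocall convention of one file per macro, named after the macro; MAUTOSOURCE is normally on by default and is set here only to make the dependency explicit.

```sas
/* Assumed layout (hypothetical file names, one macro definition per file):
     C:\Users\komal\Desktop\Advanced SAS\contents_of.sas
     C:\Users\komal\Desktop\Advanced SAS\print_data.sas                  */

/* Add the folder to the autocall search path */
options mautosource insert=(sasautos="C:\Users\komal\Desktop\Advanced SAS");

/* The first call to each macro makes SAS look for <macro name>.sas in that folder */
%contents_of(outcsvv)
%print_data(outcsvv)
```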
I need to create the sum of 4 variables multiple times, each time with a new set of variables, e.g. A1=sum(a1,a2,a3,a4), B1=sum(b1,b2,b3,b4), and so on. So I am trying to write a macro that will help me do it easily. Following is the code:
%macro SUM2(VAR1,var2,var3,VAR4);
data Subs_60_new;
set Subs_60;
substr(&var1,1,10)=sum(&var1,&var2,&var3,&var4);
run;
%mend sum2;
options mprint mlogic;sum2(ADDITIONAL_INFO_Q1,ADDITIONAL_INFO_Q2,ADDITIONAL_INFO_Q3,ADDITIONAL_INFO_Q4);
I am using SAS EG for the same & when I run the macro I get the following note:
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
And obviously, when I try to execute the macro, it throws an error.
Can someone help me out?
When calling a macro, you need to precede the macro name with a % symbol, as follows:
%macro SUM2(VAR1,var2,var3,VAR4);
data Subs_60_new;
set Subs_60;
substr(&var1,1,10)=sum(&var1,&var2,&var3,&var4);
run;
%mend sum2;
options mprint mlogic;
%sum2(ADDITIONAL_INFO_Q1,ADDITIONAL_INFO_Q2,ADDITIONAL_INFO_Q3,ADDITIONAL_INFO_Q4);
The NOTE is harmless. It is ERRORs and WARNINGs in general that you should be concerned with.
I'd point out that this will probably still throw an error, as you are trying to replace characters in a variable (&var1) that appears as though it should contain a numeric field (being part of a sum function). Given your description of what you are trying to achieve, I'd suggest adding the new variable name as another macro parameter - as follows:
%macro SUM2(VAR1,var2,var3,VAR4,varname);
data Subs_60_new;
set Subs_60;
&varname=sum(&var1,&var2,&var3,&var4);
run;
%mend sum2;
options mprint mlogic;
%sum2(ADDITIONAL_INFO_Q1,ADDITIONAL_INFO_Q2
,ADDITIONAL_INFO_Q3,ADDITIONAL_INFO_Q4
,MyNewVariable);
CMS released a SAS macro that checks for the existence of a file:
** check existance of dataset**;
%macro CHECKDS(FILE,LONGFILE);
%if %sysfunc(exist(&FILE)) %then;
%else %do;
data _null_;
file print ls=255;
&MSG30 put "ERROR : [Msg30] Program halted, file &LONGFILE does not exist";
abort; run;
%end;
%mend CHECKDS;
Now when I call it with this:
LIBNAME IN1 "/folders/myfolders/";
%CHECKDS(&STPERSON.TXT,PERSON)
run;
I get this error: ERROR : [Msg30] Program halted, file PERSON does not exist.
I know that the file exists and is in that location. Any ideas?
The first argument to the exist function should be in the format libname.memname. The second argument specifies the member type; since you didn't specify one, the default, DATA, is assumed. This implies a SAS data file with a sas7bdat extension.
See here for a list of member types.
Since your file is a .txt file, I don't think it can be considered a library member. Anyone is welcome to correct me if I'm wrong.
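To illustrate the distinction, here is a minimal sketch that checks an external file with FILEEXIST and a SAS data set with EXIST. It is not part of the CMS code; the macro name check_both is hypothetical, and the path and data set name are taken from elsewhere in this thread as assumptions.

```sas
/* Sketch only: EXIST is for SAS library members, FILEEXIST is for external files. */
%macro check_both(path, dsname);
  %if %sysfunc(fileexist(&path)) %then
    %put NOTE: External file &path exists.;
  %else
    %put NOTE: External file &path was not found.;

  %if %sysfunc(exist(&dsname)) %then
    %put NOTE: Data set &dsname exists.;
  %else
    %put NOTE: Data set &dsname was not found.;
%mend check_both;

libname IN1 "/folders/myfolders/";
%check_both(/folders/myfolders/PERSON, IN1.person)
```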
DWal got it right above. Here's a more complete answer for what I had to do. I had to convert the .txt file to a sas data set (particularly .sas7bdat).
proc import datafile="/folders/myfolders/PERSON"
dbms=dlm
out=person
replace;
delimiter=' ';
getnames=yes;
run;
Then that had to be written to my library and I was able to use it.
data IN1.person ;
set person;
run;