I am using PROC IMPORT to read a file whose column headers include a $ sign (e.g. "Sales $"). The import renames such a column to something like "VAR11".
proc import out = raw
datafile="example.xlsx"
dbms=xlsx replace;
range = "Sheet1$A1:B50";
getnames = yes;
run;
Is there a way to still read in the name of the column, but just drop the $ sign so it is a meaningful header?
If the only problem names are in that format then you should be able to read it and then rename the variable using the label. So get the names and labels into a dataset. You can query dictionary tables, use proc contents, or you could use PROC TRANSPOSE.
proc transpose data=raw (obs=0) out=names ;
var _all_ ;
run;
Now build a list of oldname=newname pairs in a macro variable.
proc sql noprint ;
select catx('=',_name_,translate(trim(compress(_label_,'$')),'_',' '))
into :renames separated by ' '
from names
where upcase(_name_) ne upcase(substrn(_label_,1,32))
;
quit;
You can then use that in a RENAME statement or RENAME= dataset option.
data renamed;
set raw(rename=(&renames)) ;
run;
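The dictionary-table route mentioned above could look roughly like this. It is only a sketch: it assumes the imported dataset is WORK.RAW and that PROC IMPORT stored the original header text in each variable's label. If no variable qualifies, the macro variable RENAMES is not created and the RENAME= option would fail, so check for that case first.

```sas
/* Build the oldname=newname pairs straight from DICTIONARY.COLUMNS. */
/* LIBNAME and MEMNAME values are stored in upper case.              */
proc sql noprint;
select catx('=', name, translate(trim(compress(label, '$')), '_', ' '))
into :renames separated by ' '
from dictionary.columns
where libname = 'WORK' and memname = 'RAW'
and label is not missing
and upcase(name) ne upcase(substrn(label, 1, 32))
;
quit;
```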
I'm working on a table in SAS with a column that contains dates (FENTREGA in the picture). I want to fill the empty cells of this column with "Empty". Can you help me with the code?
This is the structure of my table, the column i need to use is FENTREGA.
Since your column is numeric you cannot put text in the same column. However, you can make the period appear as the text EMPTY if you use a custom format.
Or you can make the whole column text, but then you cannot do date operations/calculations on the column without converting it back.
proc format;
value empty_dates
. = 'Empty'
Other = [mmddyyd10.];
run;
proc sql;
....
t1.FENTREGA format=empty_dates.,
....
EDIT: Fully tested solution, works as expected
DATA have;
informat FENTREGA mmddyy10.;
format FENTREGA date9.;
input FENTREGA;
datalines;
12/10/2003
10/15/2006
07/20/2010
05/11/2006
10/01/2006
07/03/2012
05/08/2015
.
.
.
.
;
RUN;
proc format;
value empty_dates
. = 'Empty'
Other = [mmddyyd10.];
run;
proc sql;
select
FENTREGA format=empty_dates.
from have;
quit;
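For comparison, the other option mentioned above (making the whole column text) might be sketched as follows. WANT and FENTREGA_TXT are made-up names for illustration, and the new column can no longer be used in date calculations without converting it back.

```sas
/* Keep the numeric date and add a character copy that shows 'Empty' */
/* where the date is missing.                                        */
data want;
set have;
length FENTREGA_TXT $10;
if missing(FENTREGA) then FENTREGA_TXT = 'Empty';
else FENTREGA_TXT = put(FENTREGA, mmddyyd10.);
run;
```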
I have customer survey data like this:
data feedback;
length customer score comment $50.;
input customer $ score comment & $;
datalines;
A 3 The is no parking
A 5 The food is expensive
B . I like the food
C 5 It tastes good
C . blank
C 3 I like the drink
D 4 The dessert is tasty
D 2 I don't like the service
;
run;
There is a macro like this:
%macro subset( cust=);
proc print data= feedback;
where customer = "&cust";
run;
%mend;
I am trying to write a program that calls %subset for each customer value in the feedback data. Note that we do not know how many unique values of customer there are in the data set. Also, we can't change the %subset code.
I tried to do this by using PROC SQL to create a unique list of customers to pass into the macro, but I don't think you can pass a list to a macro.
Is there a way to do that? P.S. I am a beginner with macros.
I like to keep things simple. Take a look at the following:
data feedback;
length customer score comment $50.;
input customer $ score comment & $;
datalines;
A 3 The is no parking
A 5 The food is expensive
B . I like the food
C 5 It tastes good
C . blank
C 3 I like the drink
D 4 The dessert is tasty
D 2 I don't like the service
;
run;
%macro subset( cust=);
proc print data= feedback;
where customer = "&cust";
run;
%mend subset;
%macro test;
/* first get the count of distinct customers */
proc sql noprint;
select count(distinct customer) into : cnt
from feedback;quit;
/* do this to remove leading spaces */
%let cnt = &cnt;
/* now get each of the customer names into macro variables */
proc sql noprint;
select distinct customer into: cust1 - :cust&cnt
from feedback;quit;
/* use a loop to call other macro program, notice the use of &&cust&i */
%do i = 1 %to &cnt;
%subset(cust=&&cust&i);
%end;
%mend test;
%test;
Of course, if you want short and sweet, you can use the following (just make sure your data is sorted by customer):
data _null_;
set feedback;
by customer;
if(first.customer)then call execute('%subset(cust='||customer||')');
run;
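A slightly more defensive variant of the same idea, offered as a sketch rather than a required change: CATS() strips the trailing blanks from customer, and wrapping the macro name in %NRSTR makes CALL EXECUTE queue the macro call itself, so the macro runs after the data step ends instead of while it is executing. For a macro as simple as %subset both forms should behave the same. It still assumes the data is sorted by customer and that the values contain no commas or parentheses.

```sas
data _null_;
set feedback;
by customer;
/* Pushes e.g. %subset(cust=A) onto the input stack, unresolved */
if first.customer then
call execute(cats('%nrstr(%subset)(cust=', customer, ')'));
run;
```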
First fix the SAS code. To test whether a value is in a list, use the IN operator, not the = operator.
where customer in ('A' 'B')
Then you can pass that list into your macro and use it in your code.
%macro subset(custlist);
proc print data= feedback;
where customer in (&custlist);
run;
%mend;
%subset(custlist='A' 'B')
Notice a few things:
Use quotes around the values since the variable is character.
Use spaces between the values. The IN operator in SAS accepts either spaces or comma (or both) as the delimiter in the list. It is a pain to pass in comma delimited lists in a macro call since the comma is used to delimit the parameters.
You can define a macro parameter as positional and still call it by name in the macro call.
If the list is in a dataset you can easily generate the list of values into a macro variable using PROC SQL. Just make sure the resulting list is not too long for a macro variable (maximum of 64K bytes).
proc sql noprint;
select distinct quote(trim(customer))
into :custlist separated by ' '
from my_subset
;
quit;
%subset(&custlist)
I have a dataset with 20 observations and 6 variables: ID, Gender, Age, Height, Weight, Year. All are numeric except Gender. I would like to extract 10 observations starting from the fifth observation using SAS macros.
I have the code below to import and extract the selected rows from the table.
I want to extract the selected rows using macros as part of an exercise. Please let me know your advice how to use macros to extract specific observations.
Thank you for your time.
%macro one (a, b, c);
proc import out=&a
datafile= "C:\Users\komal\Desktop\&b"
dbms=&c replace;
getnames=yes;
run;
%mend one;
%one (outcsv, Sample.csv, csv);
data test;
set outcsv;
if _N_ in (5,6,7,8,9,10,11,12,13,14) then output;
run;
You could do something like this:
%macro one (a, b, c,strtpt,endpt);
proc import out=&a
datafile= "C:\Users\komal\Desktop\&b"
dbms=&c replace;
getnames=yes;
run;
data test;
set &a;
if _n_ >= &strtpt and _n_ <= &endpt;
run;
%mend one;
%one (outcsv, Sample.csv, csv,5,14);
There is no need to use PROC IMPORT to read from a CSV file. Especially if you already know the names/types of the variables. So something like this should work.
data want ;
infile "C:\Users\komal\Desktop\Sample.csv" dsd firstobs=5 obs=14 truncover ;
input ID Gender $ Age Height Weight Year ;
run;
You might need to use 6 to 15 instead if the file has a header row.
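Since the exercise asks for a macro, the same data step could be wrapped in one. This is only a sketch: the macro name and the FILE=, FIRST= and LAST= parameters are invented here, and the call uses 6 to 15 on the assumption that the file has a header row.

```sas
%macro read_rows(file=, first=, last=);
/* FIRSTOBS= and OBS= are line numbers in the raw file */
data want;
infile "&file" dsd firstobs=&first obs=&last truncover;
input ID Gender $ Age Height Weight Year;
run;
%mend read_rows;

%read_rows(file=C:\Users\komal\Desktop\Sample.csv, first=6, last=15)
```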
I have a program that should merge any number of tables numbered consecutively. I tried to use macro variables but to no avail. The error " Missing numeric suffix on a numbered data set list" keeps popping up.
Here is the defective code:
DATA INPUTF;
INPUT DSN $;
CARDS;
forum1
forum2
forum3
;
RUN;
DATA forum1;
INPUT contact $ forum1 $;
CARDS;
Mash HERE
Greg HERE
Bob HERE
;
PROC SORT DATA=forum1;
BY contact;
RUN;
DATA forum2;
INPUT contact $ forum2 $;
CARDS;
Mash HERE
Sid HERE
Bob HERE
;
RUN;
PROC SORT DATA=forum2;
BY contact;
RUN;
DATA forum3;
INPUT contact $ forum3 $;
CARDS;
Mash HERE
Sid HERE
Jim HERE
;
RUN;
PROC SORT DATA=forum3;
BY contact;
RUN;
PROC SQL NOPRINT;
SELECT COUNT(*) INTO :n FROM INPUTF;
QUIT;
%MACRO COMBINE(N);
DATA ALLIN;
MERGE forum1-forum&n.;
BY contact;
RUN;
%MEND COMBINE;
%COMBINE;
PROC PRINT DATA=ALLIN;
The code, however, works fine when I use a %LET statement as follows:
%let n=3;
DATA ALLIN;
MERGE forum1-forum&n.;
BY contact;
RUN;
PROC PRINT DATA=ALLIN;
The problem is I won't know how many forums are there, and I prefer that the number be based on the input file.
Any help is appreciated! Thanks!
Macro variable scope.
You've created a macro variable N in the global symbol table. The macro, however, declares a parameter that is also called N. Inside the macro that parameter is local, and it is empty because you called %COMBINE without passing a value.
Call your macro with the global N as the parameter value, or move the PROC SQL into the macro.
%COMBINE(&N);
OR
%MACRO COMBINE;
PROC SQL NOPRINT;
SELECT COUNT(*) INTO :n TRIMMED FROM INPUTF;
QUIT;
DATA ALLIN;
MERGE forum1-forum&n.;
BY contact;
RUN;
%MEND COMBINE;
%COMBINE;
OR
If you only have tables that start with FORUM that you're trying to merge:
DATA ALLIN;
MERGE FORUM: ;
BY contact;
RUN;
So if your dataset InputF has the list of datasets that you want to merge, then put that list into a macro variable. If you always have at least two datasets then no macro logic is required.
proc sql noprint ;
select dsn into :dsnlist separated by ' '
from inputf;
quit;
data allin;
merge &dsnlist ;
by contact;
run;
To handle the case when you have 0 or 1 dataset name in the list you would need to add macro logic. When there is just one you need to use SET instead of MERGE. You could handle that with the IFC() function.
data allin;
%sysfunc(ifc(1=&sqlobs,set,merge)) &dsnlist ;
by contact;
run;
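If the list can also be empty, the macro logic mentioned above might look like the sketch below. The macro name COMBINE_ALL and the wording of the note are made up for illustration. It relies on the automatic macro variable SQLOBS holding the number of rows selected by the last PROC SQL query.

```sas
%macro combine_all;
%local dsnlist n;
proc sql noprint;
select dsn into :dsnlist separated by ' '
from inputf;
quit;
%let n = &sqlobs;
%if &n = 0 %then %do;
%put NOTE: INPUTF lists no datasets, nothing to merge.;
%end;
%else %do;
data allin;
/* SET for a single dataset, MERGE for two or more */
%if &n = 1 %then set; %else merge; &dsnlist;
by contact;
run;
%end;
%mend combine_all;

%combine_all
```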
I have a report that is generated once a year. Each report has the year in its name: report-2011.xls, report-2012.xls, etc. Each report contains the following variables: ID, SAL (average monthly salary for that year), Gender (0=male, 1=female), and Married (0=not married, 1=married). I need to create a macro that calculates the mean, std, min and max of the salary per year, by gender and by marital status. The macro needs to include a parameter for the relevant year.
How do I refer to each type separately when calculating these statistics?
And how do I create a separate parameter for the year?
proc summary allows you to control exactly which ways you want to cross the data.
%macro report(year);
proc import datafile="/path/to/report-&year..xls"
out= salary_data
dbms=xls replace ;
run;
proc summary data = salary_data;
class married gender;
types married gender married*gender;
var sal;
output out = salary_results mean(sal) = mean_salary std(sal) = std_salary
min(sal) = min_salary max(sal) = max_salary;
run;
* Print the summary;
proc print data = salary_results;
run;
* Delete the data and summary after using them;
proc delete data= salary_data salary_results; run;
%mend report;
Note that proc summary produces other useful information which you can read about here. You can drop them if you don't need them.
A macro is just a way to generate code. So first design the code you want it to generate. Then you can figure out which parts of it vary and replace those with macro variable references. Those macro variables then become the parameters of your macro.
proc import datafile="report-2011.xls" out=report_2011 ; run;
proc means data=report_2011 ;
class gender married;
run;
%macro reporting(year, gender, marital_status);
/* you should have separate datasets for different years */
proc means data=data&year mean std min max;
where gender = &gender and married = &marital_status;
class gender married ;
var sal;
run;
%mend reporting;
%reporting(2015, 1, 1)
Is that something you are looking for?