SAS Sum function inside macro trouble - macros

data Month1;
input Name $ sales;
cards;
Joyce 235
Marsha 352
Bill 491
Vernon 210
Sally 418
;
data Month2;
input Name $ sales;
cards;
Joyce 169
Marsha 281
Bill 315
Vernon 397
Sally 305
;
data Month3;
input Name $ sales;
cards;
Joyce 471
Marsha 314
Bill 394
Vernon 291
Sally 337
;
data Month4;
input Name $ sales;
cards;
Joyce 338
Marsha 259
Bill 310
Vernon 432
Sally 362
;
data Month5;
input Name $ sales;
cards;
Joyce 209
Marsha 355
Bill 302
Vernon 416
Sally 475
;
data Month6;
input Name $ sales;
cards;
Joyce 306
Marsha 472
Bill 351
Vernon 405
Sally 358
;
options sgen;
%let qtr=qtr1;
%Macro ProcSql;
Proc Sql;
%if &qtr=qtr1 %then %do;
%let month1=month1;
%let month2=month2;
%let month3=month3;
%end;
%else %if &qtr=qtr2 %then %do;
%let month1=month4;
%let month2=month5;
%let month3=month6;
%end;
%else %if &qtr=qtr3 %then %do;
%let month1=month7;
%let month2=month8;
%let month3=month9;
%end;
%else %if &qtr=qtr4 %then %do;
%let month1=month10;
%let month2=month11;
%let month3=month12;
%end;
create table &qtr as
select &month1..name, &month1..sales as m1sales, &month2..sales as m2sales,
&month3..sales as m3sales, sum(m1sales, m2sales, m3sales) as
qtrsales
from &month1, &month2, &month3
where &month1..name=&month2..name=&month3..name;
select sum(m1sales) as m1total, sum(m2sales) as m2total, sum(m3sales) as
m3total,
sum(qtrsales) as qtrtotal
from &qtr;
%mend ProcSql;
%ProcSql;
I am getting these errors:
ERROR: Function SUM requires a numeric expression as argument 1.
ERROR: Function SUM requires a numeric expression as argument 2.
ERROR: Function SUM requires a numeric expression as argument 3.
ERROR: The following columns were not found in the contributing tables: m1sales, m2sales, m3sales.
ERROR: File WORK.QTR1.DATA does not exist.

If you want to reference a value derived in the current SELECT statement then you need to add the CALCULATED keyword to your query.
create table &qtr as
select &month1..name
, &month1..sales as m1sales
, &month2..sales as m2sales
, &month3..sales as m3sales
, sum(calculated m1sales,calculated m2sales,calculated m3sales) as qtrsales
from &month1, &month2, &month3
where &month1..name=&month2..name
and &month1..name=&month3..name
;
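An alternative that avoids CALCULATED altogether is to sum the source columns directly. The sketch below assumes the Month1 to Month3 datasets from the question exist and sets the three macro variables by hand for illustration. Note the double period: the first period ends the macro variable name and the second is the table.column separator.

```sas
/* Assumes datasets month1, month2, month3 exist in WORK. */
%let qtr    = qtr1;
%let month1 = month1;
%let month2 = month2;
%let month3 = month3;

proc sql;
  create table &qtr as
  select &month1..name,
         &month1..sales as m1sales,
         &month2..sales as m2sales,
         &month3..sales as m3sales,
         /* sum the source columns, not the aliases, so CALCULATED is not needed */
         sum(&month1..sales, &month2..sales, &month3..sales) as qtrsales
  from &month1, &month2, &month3
  where &month1..name = &month2..name
    and &month1..name = &month3..name;
quit;
```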

Get rid of multiple datasets as early as possible.
I'd just concatenate the data into a single dataset. Having multiple identical datasets for multiple time periods (or other variables) is, in my experience, one of SAS's worst anti-patterns.
data sales;
set month1 (in=m1) month2 (in=m2) month3 (in=m3) month4 (in=m4) month5 (in=m5) month6 (in=m6);
if m1 then month=1;
if m2 then month=2;
if m3 then month=3;
if m4 then month=4;
if m5 then month=5;
if m6 then month=6;
qtr = ceil(month/3);
run;
With the data in one dataset it's much easier to manipulate. You can easily aggregate it in SQL:
proc sql;
create table monthly_sales as
select qtr,
month,
sum(sales) as monthly_sales
from sales
group by qtr, month;
create table quarterly_sales as
select month,
qtr,
monthly_sales,
sum(monthly_sales) as quarterly_sales
from monthly_sales
group by qtr;
quit;
Or tabulate it:
proc tabulate data=sales;
var sales;
class month qtr;
table qtr*(month all='total')*sales=''*sum='';
run;
Or transpose it:
proc sort data=sales; by name;
proc transpose data=sales out=sales_wide;
by name;
var sales;
id month;
run;
Use macros to generate code, not for control-flow
If you have to use macros, try using a macro to generate code inside a data step instead of looping over multiple datasets. Macros are supposed to be used to generate code; that's what they were designed for. They far too often get abused as a proxy for program control structures, which often leads to an unmaintainable mess.
Here I use a macro to generate the data step used to concatenate the months, where the number of months is a variable:
%macro myset(months);
set %do i=1 %to &months; month&i (in=m&i) %end; ;
%do i=1 %to &months;
if m&i then month=&i;
%end;
%mend;
data sales;
%myset(months=6);
qtr = ceil(month/3);
run;
If you use options mprint you can see that the generated code is the same as above.
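To tie this back to the original quarterly report, here is a minimal sketch that assumes the combined sales dataset (with its name, sales and qtr variables) has been built as shown. The macro variable qtr and the table name qtr_report are illustrative choices, not part of the original code.

```sas
%let qtr = 1;   /* hypothetical: the quarter number the user wants */

proc sql;
  /* one row per salesperson with that person's total for the quarter */
  create table qtr_report as
  select name,
         sum(sales) as qtrsales
  from sales
  where qtr = &qtr
  group by name;

  /* grand total for the quarter */
  select sum(qtrsales) as qtrtotal
  from qtr_report;
quit;
```

Because the quarter is now just a value in a column, no %if/%then branching over dataset names is needed.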

Related

Define a macro variable for the quarter number

Part A:
Define a macro variable for the quarter number. The idea is that this is the only thing the "user" should have to change when running the program for a new quarter.
Part B:
Define macro variables for each month in the quarter and set them equal to a month value that is generated from the quarter number. Hint: %if/%then
Given code:
data Month1;
input Name $ sales;
cards;
Joyce 235
Marsha 352
Bill 491
Vernon 210
Sally 418
;
data Month2;
input Name $ sales;
cards;
Joyce 169
Marsha 281
Bill 315
Vernon 397
Sally 305
;
data Month3;
input Name $ sales;
cards;
Joyce 471
Marsha 314
Bill 394
Vernon 291
Sally 337
;
data Month4;
input Name $ sales;
cards;
Joyce 338
Marsha 259
Bill 310
Vernon 432
Sally 362
;
data Month5;
input Name $ sales;
cards;
Joyce 209
Marsha 355
Bill 302
Vernon 416
Sally 475
;
data Month6;
input Name $ sales;
cards;
Joyce 306
Marsha 472
Bill 351
Vernon 405
Sally 358
;
proc sql;
create table qtr1 as
select Month1.name, month1.sales as m1sales, month2.sales as m2sales,
month3.sales as m3sales, sum(month1.sales, month2.sales, month3.sales) as qtr1sales
from month1, month2, month3
where month1.name=month2.name=month3.name;
select sum(m1sales) as m1total, sum(m2sales) as m2total, sum(m3sales) as m3total,
sum(qtr1sales) as qtr1total
from qtr1;
My solution:
/* question a */
%MACRO qtrn(qtr);
proc print data=&qtr ;
run;
%MEND qtrn;
/* question b */
%Macro Firstqtr(qtr);
%Let I = 1;
%If &qtr = qtr1 %then %do %until (&I > 3);
%Let var&I = Month&I;
%let I = %eval(&I + 1);
%end;
%Mend Firstqtr;
%Firstqtr(qtr);
Can anyone help me figure out the correct solution?
Since this looks like a homework problem, here's the main part of your answer. I'll leave the final select for you to add. It should be pretty simple given the following solution:
%macro qtrSales(qtr);
%do i = 1 %to 3;
%let month&i = month%sysevalf((&qtr-1) * 3 + &i);
%put &&month&i;
%end;
proc sql;
create table qtr&qtr as
select &month1..name,
&month1..sales as &month1.sales,
&month2..sales as &month2.sales,
&month3..sales as &month3.sales,
sum(&month1..sales, &month2..sales, &month3..sales) as qtr&qtr.sales
from &month1, &month2, &month3
where &month1..name=&month2..name=&month3..name;
select sum(&month1.sales) as &month1.total,
sum(&month2.sales) as &month2.total,
sum(&month3.sales) as &month3.total,
sum(qtr&qtr.sales) as qtr&qtr.total
from qtr&qtr;
quit;
%mend qtrSales;
%qtrSales(2);
"Define a macro variable" simply means using %let to define one. Macro variables are things that you define with %let, call symputx, or select into in SQL, and then reference using &.
%let qtrn = 3;
There you go. The question specified that the user will adjust this, right? So it isn't asking you to do any work on your end, just give the user a place to make this change.
As for the second, I don't entirely understand the hint. It doesn't seem necessary to use conditional logic here. Here's an example of what I'd do.
%let month1 = %eval(3*(&qtrn.-1)+1);
That simply calculates the month number of the first month based on the quarter. Quarter 3 is months 7/8/9, right? 3*(3-1)+1 = 7, 3*(3-1)+2 = 8, 3*(3-1)+3 = 9. (Or you could do it differently: 3*3-2 = 7, 3*3-1 = 8, 3*3 = 9.)
Of course, you could do this in a macro with a loop to define them. But it seems excessive to do so - it's not like quarters ever have 4 months in them, or 2, right? They always have 3, it's a defining characteristic of a quarter, so it seems fine to hardcode month1/month2/month3.
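Spelled out for all three months, that might look like the sketch below. It assumes the datasets are named month1 through month12 as in the given code, so each macro variable holds a dataset name rather than a bare month number; that naming is my assumption, not something the assignment states.

```sas
%let qtrn = 3;   /* the only value the user changes */

/* %eval does the integer arithmetic; the literal prefix "month" builds the dataset name */
%let month1 = month%eval(3*(&qtrn - 1) + 1);
%let month2 = month%eval(3*(&qtrn - 1) + 2);
%let month3 = month%eval(3*(&qtrn - 1) + 3);

/* write the resolved values to the log to check them */
%put month1=&month1 month2=&month2 month3=&month3;
```

With qtrn set to 3 the three variables resolve to month7, month8 and month9.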

SAS Given a start & end date I need to know the dates of each 30 day period AFTER the first 35 days

I am given 2 dates, a start date and an end date.
I would like to know the date of the first 35 day period, then each subsequent 30 day period.
I have;
start end
22-Jun-15 22-Oct-15
9-Jan-15 15-May-15
I want;
start end tik1 tik2 tik3 tik4
22-Jun-15 22-Oct-15 27-Jul-15 26-Aug-15 25-Sep-15
9-Jan-15 15-May-15 13-Feb-15 15-Mar-15 14-Apr-15 14-May-15
I am fine with the dates calculations but my real issue is creating a variable and incrementing its name. I decided to include my whole problem because I thought it might be easier to explain in its context.
You can solve the problem with the following logic:
1) Determine the number of columns to be added.
2) Calculate the values for the columns based on the requirement.
data test;
input start end;
informat start date9. end date9.;
format start date9. end date9.;
datalines;
22-Jun-15 22-Oct-15
09-Jan-15 15-May-15
;
run;
/*******Determining number of columns*******/
data noc_cal;
set test;
no_of_col = floor((end-start)/30);
run;
proc sql;
select max(no_of_col) into: number_of_columns from noc_cal;
quit;
/*******Making an array where 1st iteration(tik1) is increased by 35days whereas others are incremented by 30days*******/
data test1;
set test;
array tik tik1-tik%sysfunc(COMPRESS(&number_of_columns.));
format tik: date9.;
tik1 = intnx('DAYS',START,35);
do i= 2 to %sysfunc(COMPRESS(&number_of_columns.));
tik[i]= intnx('DAYS',tik[i-1],30);
if tik[i] > end then tik[i]=.;
end;
drop i;
run;
Alternative way (in case you don't want to use PROC SQL):
data test;
input start end;
informat start date9. end date9.;
format start date9. end date9.;
datalines;
22-Jun-15 22-Oct-15
09-Jan-15 15-May-15
;
run;
/*******Determining number of columns*******/
data noc_cal;
set test;
no_of_col = floor((end-start)/30);
run;
proc sort data=noc_cal;
by no_of_col;
run;
data _null_;
set noc_cal;
by no_of_col;
if last.no_of_col;
call symputx('number_of_columns',no_of_col);
run;
/*******Making an array where 1st iteration(tik1) is increased by 35days whereas others are incremented by 30days*******/
data test1;
set test;
array tik tik1-tik%sysfunc(COMPRESS(&number_of_columns.));
format tik: date9.;
tik1 = intnx('DAYS',START,35);
do i= 2 to %sysfunc(COMPRESS(&number_of_columns.));
tik[i]= intnx('DAYS',tik[i-1],30);
if tik[i] > end then tik[i]=.;
end;
drop i;
run;
My output:
start      end        tik1       tik2       tik3       tik4
22Jun2015  22Oct2015  27Jul2015  26Aug2015  25Sep2015
09Jan2015  15May2015  13Feb2015  15Mar2015  14Apr2015  14May2015
I tend to prefer long vertical structures. I would approach it like:
data want;
set have;
tik=start+35;
do while(tik<=end);
output;
tik=tik+30;
end;
format tik mmddyy10.;
run;
If you really need it wide, you could transpose that dataset in a second step.
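A possible second step, sketched under the assumption that the long dataset want from the step above keeps each start/end pair on consecutive rows. The output name want_wide is illustrative.

```sas
/* Turn the long TIK rows into tik1, tik2, ... columns per start/end pair. */
proc transpose data=want out=want_wide(drop=_name_) prefix=tik;
  by start end notsorted;   /* NOTSORTED: groups are contiguous but not sorted */
  var tik;
run;
```

PROC TRANSPOSE creates as many tik columns as the longest group needs, so the column count does not have to be computed beforehand. If two adjacent input rows share the same start and end, they would be treated as one BY group, so a row identifier would be needed in that case.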

How To store SAS dates in a macro

I am trying to create separate datasets, each named after a date, from data that covers different dates. When I run the code, SAS reads the date as a number and not as a date. Here is the code:
data dates;
input dates mmddyy8. name : $10. ;
format dates date9.;
cards;
01312015 swati
02282015 kangan
01232015 Gotam
04302015 Hushiyar
05172015 yash
09192015 Kuldeep
08302015 David
05172015 yash
11192015 Uninayal
11192015 Uninayal
12032015 sahil
;
data dates;
set dates;
format new date9.;
new=intnx('month', dates, 0, 'e');
run;
proc sql;
select distinct new into : new1 -: new8 from dates;
quit;
%put &new1.;
%macro swati;
%do i= 1 %to 1;
data data_&&new&i.;
set dates;
if new="&&new&i." then output data_&&new&i.;
run;
%end;
%mend;
%swati;
When I try running this code, it gives me an error saying that it is reading the dates stored in the macro variables as numbers. How do I make SAS read the dates as dates only?
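One likely cause is that the macro variables hold DATE9. text such as 31JAN2015, and the comparison new="&&new&i." compares a numeric variable with a character string. A minimal sketch of a fix, assuming the dates dataset and the macro variables new1 to new8 exist as created above, is to turn the text into a date literal with a trailing d. The macro name and its parameter are hypothetical.

```sas
%macro split_by_date(n);   /* hypothetical name; n = number of macro variables to use */
  %local i;
  %do i = 1 %to &n;
    data data_&&new&i;
      set dates;
      /* "31JAN2015"d is a SAS date literal, so this is a numeric comparison */
      if new = "&&new&i"d;
    run;
  %end;
%mend split_by_date;

%split_by_date(8)
```

Each iteration writes only the rows whose month-end date matches the value in the corresponding macro variable.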

set previous month/year as macro variable in SAS

I know something like this
%let start_date = %sysfunc(intnx(day,%sysfunc(date()),-1),DATE9.);
%put &start_date;
But
%let start_month = %sysfunc(month(intnx(month,%sysfunc(date()),-1),DATE9.));
%put &start_month;
or
%let start_date = %sysfunc(intnx(month,%sysfunc(date()),-1),DATE9.);
%put %sysfunc(month(&start_date));
doesn't work.
You need another %SYSFUNC before INTNX for your example to work.
%let start_month = %sysfunc(month(%sysfunc(intnx(month,%sysfunc(date()),-1))));
%put &start_month;
However, I prefer to use a DATA _NULL_ step where a lot of nested %SYSFUNC calls would be required with %LET. The following gives you the same result.
data _null_;
call symputx('start_month', month(intnx('month',date(),-1)));
run;
%put &start_month.;
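The same nesting pattern extends to the year of the previous month, and the optional format argument of %SYSFUNC can return year and month in one value. This is a sketch; the macro variable names start_year and start_yyyymm are my own.

```sas
/* Year of the previous month: each function call needs its own %SYSFUNC */
%let start_year = %sysfunc(year(%sysfunc(intnx(month,%sysfunc(date()),-1))));

/* Previous month as YYYYMM text, using the second (format) argument of %SYSFUNC */
%let start_yyyymm = %sysfunc(intnx(month,%sysfunc(date()),-1),yymmn6.);

/* If the date is already stored as DATE9. text, a date literal makes it usable again */
%let start_date = %sysfunc(intnx(month,%sysfunc(date()),-1),date9.);
%put &start_year &start_yyyymm %sysfunc(month("&start_date"d));
```

The last line shows why the second attempt in the question fails: &start_date holds text like 01JAN2015, which MONTH cannot read unless it is wrapped as a date literal.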

Merging by range of numbers

I'd like to assign a person's name to a number based on a range rather than an explicit number. It's possible to do this using formats, but as I have the names in a dataset I'd prefer to avoid the manual process of writing the proc format.
data names;
input low high name $;
datalines;
1 10 John
11 20 Paul
21 30 George
31 40 Ringo
;
data numbers;
input number;
datalines;
33
21
17
5
;
The desired output is:
data output;
input number name $;
datalines;
33 Ringo
21 George
17 Paul
5 John
;
Thanks for any help.
You can do it like this using PROC SQL:
proc sql;
create table output as
select numbers.number, names.name
from numbers left join names
on numbers.number ge names.low
and numbers.number le names.high
;
quit;
One handy feature of proc format is the ability to use a data set to create the format, instead of typing it in by hand. Your scenario seems like a perfect fit for this feature.
In the example you give, a few small changes to the "names" data set will put it in a form that can be read by proc format.
For example, if I modify the names data set like so..
data names;
retain fmtname "names" type "N";
input start end label $;
datalines;
1 10 John
11 20 Paul
21 30 George
31 40 Ringo
;
I can then issue this command to build the format based on it.
proc format cntlin=names;run;
Now I can use this format just like you would with any other format. For example, to create a new column that contains the desired "name" based on the number, you could do this:
data numbers;
input number;
number_formatted=put(number,names.);
datalines;
33
21
17
5
;
Here is what the output would look like:
number number_formatted
33 Ringo
21 George
17 Paul
5 John
Update to address question:
There isn't much difference in coding needed to read from a text file. We just need to set it up so that the output data set has the particular variable names that proc format expects (fmtname, type, start, end , and label).
For example, if I have an external comma-separated file called "names.csv" that looks like this:
1,10,John
11,20,Paul
21,30,George
31,40,Ringo
Then I simply can change the code that creates the "names" data set so that it looks like this:
data names;
retain fmtname "names" type "N";
infile "<path to file>/names.csv" dsd;
input start end label $;
run;
Now I can run proc format with the cntlin option like I did before:
proc format cntlin=names;run;
I think SQL is indeed more succinct, but if you aren't a big fan of it and the numbers come in known increments, you may try something like:
data ranges;
set names;
do number = low to high; /* by ... */
output;
end;
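run;
proc sort data=ranges;
by number;
run;
/* MERGE with BY needs both inputs sorted by number */
proc sort data=numbers;
by number;
run;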
data output;
merge ranges
numbers ( in = innum )
;
by number;
keep number name;
if innum;
run;
Again, it requires numbers to come in predetermined increments, e.g. integers.