%Do to iterate over a known set of character values in SAS - macros

I would like to create a list of IDs (bee_created) that combine a treatment group (i), replicate (j) and serial number (k). I have 9 replicates to work on, each with slight variations in the serial numbers across treatment groups, so it would be a lot more efficient to create all the IDs in macro do loops. I've browsed some articles and I suppose that %scan may be used, but I haven't been able to produce working code.
For each replicate, there are five treatment groups: 'aa', 'ab', 'ac', 'ea' and 'ec'. I need a macro that could replace the following 4 data steps. Note that the only differences are in the values of i, j and k; everything else is copy-and-paste. Thanks in advance.
/*dataset 1*/
data tag_num_replicate01_02;
length i $5. bee_created $9.;
exp_name="&exp_name";
do i= 'aa','ab', 'ac', 'ea','ec';
do j='01','02';
do k=101 to 210;
bee_created=compress (i||j||'-'||put(k, best.));
rename i=group_id;
output;
end;
end;
end;
run;
/*dataset 2*/
data tag_num_replicate04a;
length i $5. bee_created $9.;
exp_name="&exp_name";
do i= 'aa','ab', 'ac';
do j='04';
do k=501 to 615;
bee_created=compress (i||j||'-'||put(k, best.));
rename i=group_id;
output;
end;
end;
end;
run;
/*dataset 3*/
data tag_num_replicate04b;
length i $5. bee_created $9.;
exp_name="&exp_name";
do i= 'ea';
do j='04';
do k=501 to 623;
bee_created=compress (i||j||'-'||put(k, best.));
rename i=group_id;
output;
end;
end;
end;
run;
/*dataset 4*/
data tag_num_replicate04c;
length i $5. bee_created $9.;
exp_name="&exp_name";
do i= 'ec';
do j='04';
do k=501 to 620;
bee_created=compress (i||j||'-'||put(k, best.));
rename i=group_id;
output;
end;
end;
end;
run;
I tried to generate a list of 'aa', 'ab' and 'ac' but ended up creating four variables: i, aa, ab and ac. I would highly appreciate it if you could also show me what is wrong in the following code.
/*macro not working*/
%macro tag_generate(groups=);
data tag_num_test;
exp_name="&exp_name";
%do i= 1 %to 3;
j=%scan(&groups, &i);
output;
%end;
run;
%mend tag_generate;
%tag_generate(groups= aa ab ac);
Chang

I don't think you need a macro do loop or macro scan - you could just use a data step do loop and the regular scan function.
%macro tag_generate(i, j, krange, rep, exp);
data tag_num_replicate&rep;
length i $5. j $5.;
exp_name=&exp;
do inum = 1 to count(&i, ' ')+1;
i = scan(&i, inum);
do jnum = 1 to count(&j, ' ')+1;
j = scan(&j, jnum);
do k = &krange;
bee_created=compress (i||j||'-'||strip(k));
rename i=group_id;
drop inum jnum;
output;
end;
end;
end;
run;
%mend tag_generate;
%tag_generate("aa ab ac ea ec", "01 02", 101 to 202, 01_02, "expname1");
Edit: Tested and fixed. It should work now.
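As for what goes wrong in the macro from the question: the %do loop only generates text, so the data step ends up containing j=aa; output; j=ab; output; j=ac; output;. Because aa, ab and ac are not quoted, SAS reads them as variable names and creates them. A minimal sketch of a fix, assuming the list is space delimited, is to wrap the %scan result in double quotes so it becomes a character literal. The exp_name assignment is left out here because &exp_name is defined outside the code shown.

```sas
%macro tag_generate(groups=);
  %local i;
  data tag_num_test;
    length j $5;
    /* loop over however many words the list holds instead of a fixed 3 */
    %do i = 1 %to %sysfunc(countw(&groups, %str( )));
      /* double quotes let the macro value resolve into a string literal */
      j = "%scan(&groups, &i, %str( ))";
      output;
    %end;
  run;
%mend tag_generate;

%tag_generate(groups=aa ab ac)
```

This writes one observation per group, with j holding the group code as a character value.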

Related

Hash Merge Macro - using a file record indicator "HASH + point = Key"

Looking to update this macro to be HASH + point = key. We have started to exceed our memory limits with our current version of this macro for one of our data runs. The reason I'm asking for help is because I don't have a lot of time and have never really analyzed this code since it wasn't part of my process until recently.
What I don't really understand from https://www.lexjansen.com/nesug/nesug11/ld/ld01.pdf is how the RID gets set and how to incorporate it into our macro. I don't even know whether it is possible to do it this way with our current macro.
Any help would be greatly appreciated.
%macro hashmerge2(varnm,onto,from,byvars,obsqty);
%let data_vars = %trim (&varnm);
%let data_vars_a = %sysfunc(tranwrd(&data_vars.,%str( ),%str(" , ")));
%let data_vars_b = %sysfunc(tranwrd(&data_vars.,%str( ), %str(,)));
%let data_key = %trim (&byvars);
%let data_key = %sysfunc(tranwrd(&data_key.,%str( ), %str(" , ")));
%if %index(&varnm,' ') > 0 %then %let varnm3=%substr(%substr(&varnm,1,%index(&varnm,' ')),1,4);
%else %let varnm3=%substr(&varnm,1,4);
data &onto(drop=rc) miss&varnm3(drop=rc);
if 0 then set &onto &from(keep=&varnm. &byvars.);
declare hash h_merge (dataset: "&from.");
rc = h_merge.DefineKey ("&data_key.");
rc = h_merge.DefineData ("&data_vars_a.");
rc = h_merge.DefineDone ();
do until (eof);
set &onto end = eof;
call missing(&data_vars_b.);
rc = h_merge.find ();
if rc = 0 then do;
output &onto;
from = "&from.";
end;
else do;
output miss&varnm3 &onto;
from = "&onto.";
end;
end;
stop;
run;
%mend;
I think this is what you are looking for, but it still needs to load all of the key values from the "lookup" table into the hash object. It saves memory because, instead of also loading the non-key variables, it loads only the observation number that goes with each set of key values; the non-key variables are then read from the lookup table with POINT= when a match is found.
%macro hash_merge_point
/*-----------------------------------------------------------------------------
Merge variables ONTO large table FROM small table using POINT= dataset option.
-----------------------------------------------------------------------------*/
(varnm /* Space delimited list of variable to retrieve */
,onto /* Dataset to update */
,from /* Dataset to get values from */
,byvars /* Space delimited list of key variables to match on */
);
%local missds key_vars;
%let missds=%scan(&varnm,1,%str( ));
%let missds=miss%substr(&missds,1,%sysfunc(min(28,%length(&missds))));
%let key_vars="%sysfunc(tranwrd(%sysfunc(compbl(&byvars)),%str( )," "))";
data &onto(drop=rc) &missds(drop=rc);
if 0 then set &onto &from(keep=&varnm. &byvars.);
declare hash h_merge ();
rc = h_merge.DefineKey (&key_vars);
rc = h_merge.DefineData ('_point');
rc = h_merge.DefineDone ();
do _point=1 to _nobs;
set &from(keep=&byvars) point=_point nobs=_nobs;
rc = h_merge.add();
end;
do until (eof);
set &onto end = eof;
rc = h_merge.find ();
if rc = 0 then do;
set &from (keep=&varnm) point=_point;
from = "&from.";
output &onto;
end;
else do;
call missing(of &varnm);
from = "&onto.";
output ;
end;
end;
stop;
run;
%mend hash_merge_point;
So here is a trivial example:
data lookup;
input id age sex $1.;
cards;
1 10 F
2 20 .
4 30 M
;
data master ;
input id wt ;
cards;
1 100
2 150
3 180
4 200
;
%hash_merge_point
/*-----------------------------------------------------------------------------
Merge variables ONTO large table FROM small table using POINT= dataset option.
-----------------------------------------------------------------------------*/
(varnm=age sex /* Space delimited list of variable to retrieve */
,onto=master /* Dataset to update */
,from=lookup /* Dataset to get values from */
,byvars=id /* Space delimited list of key variables to match on */
);
If the target table already has the variables being created by the merge (so you just want to overwrite the current values), then you can use the MODIFY statement instead of the SET statement to update the dataset in place. You might want to make sure you have a backup of the table before trying this. Also note that if you want a flag for the source (the from variable), then that variable must already exist in the table as well.
So with this updated master table:
data master ;
input id wt ;
length age 8 sex $1 from $50;
cards;
1 100
2 150
3 180
4 200
;
And this version of the macro:
%macro hash_merge_point
/*-----------------------------------------------------------------------------
Merge variables ONTO large table FROM small table using POINT= dataset option.
-----------------------------------------------------------------------------*/
(varnm /* Space delimited list of variable to retrieve */
,onto /* Dataset to update */
,from /* Dataset to get values from */
,byvars /* Space delimited list of key variables to match on */
);
%local key_vars;
%let key_vars="%sysfunc(tranwrd(%sysfunc(compbl(&byvars)),%str( )," "))";
data &onto;
if 0 then set &onto (keep=&byvars.);
declare hash h_merge ();
rc = h_merge.DefineKey (&key_vars);
rc = h_merge.DefineData ('_point');
rc = h_merge.DefineDone ();
do _point=1 to _nobs;
set &from(keep=&byvars) point=_point nobs=_nobs;
rc = h_merge.add();
end;
do until (eof);
modify &onto end = eof;
rc = h_merge.find ();
if rc = 0 then do;
set &from (keep=&varnm) point=_point;
from = "&from.";
end;
else from = "&onto.";
replace;
end;
stop;
run;
%mend hash_merge_point;
If you run this code:
proc print data=master;
title 'BEFORE';
run;
%hash_merge_point
/*-----------------------------------------------------------------------------
Merge variables ONTO large table FROM small table using POINT= dataset option.
-----------------------------------------------------------------------------*/
(varnm=age sex /* Space delimited list of variable to retrieve */
,onto=master /* Dataset to update */
,from=lookup /* Dataset to get values from */
,byvars=id /* Space delimited list of key variables to match on */
);
proc print data=master;
title 'AFTER';
run;
You get this result:

SAS: formatting multiple proc freq using macros

I don't have another analyst on my team at work and have a question about the most efficient way to run several proc freq concurrently.
My goal is to run about 160 different frequencies, and include formatting for all of them. I assume a macro is the fastest way, but I only have experience with basic macros. Below is my thought process assuming the data was already formatted:
%macro survey(question, formatA, formatB);
proc freq;
table &question;
format &formatA &formatB;
%mend;
%survey (question, formatA, formatB);
"question", "formatA" and "formatB" will be strings of data for example:
- "question" would be KCI_1 KCI_2 through KCI_80
- "formatA" would be KCI_1fmt KCI_2fmt through KCI_80fmt
- "formatB" would be KCI_1fmt. KCI_2fmt. through KCI_80fmt.
Danielle:
You can use a macro to assign known formats to variables that are not already formatted. The rest of the PROC FREQ step does not have to be macro-ized.
* make some survey data with unformatted responses;
data have;
do respondent_id = 1 to 10000;
array responses KCI_1-KCI_80;
do _n_ = 1 to dim(responses);
responses(_n_) = ceil(4*ranuni(123));
end;
output;
end;
run;
* make some format data for each question;
data responseMeanings;
length questionID 8 responseValue 8 responseMeaning $50;
do questionID = 1 to 80;
fmtname = cats('Q',questionID,'_fmt');
peg = ranuni (1234); drop peg;
do responseValue = 1 to 4;
select;
when (peg < 0.4) responseMeaning = scan('Never,Seldom,Often,Always', responseValue, ',');
when (peg < 0.8) responseMeaning = scan('Yes,No,Don''t Ask,Don''t Tell', responseValue, ',');
otherwise responseMeaning = scan('Nasty,Sour,Sweet,Tasty', responseValue, ',');
end;
output;
end;
end;
run;
* create a custom format for the responses of each question;
proc format cntlin=responseMeanings(rename=(responseValue=start responseMeaning=label));
run;
* macro to associate variables with the corresponding custom format;
%macro format_each_response;
%local i;
format
%do i = 1 %to 80;
KCI_&i Q&i._fmt.
%end;
;
%mend;
* compute frequency counts;
proc freq data=have;
table KCI_1-KCI_80;
%format_each_response;
run;

Create dictionary from table using hash object

I'm sure it's an easy task, but I'm still neophyte at SAS, so I have some problems :(.
Consider, I have some data set Table with column Column $24 (length is important). And I want to create a data set HashTable with only one column Key $11, where Key consists of those unique values of Column, which length equals 11.
So, I'm trying to use a hash object, but I feel I'm doing something wrong :).
data _null_;
length Key $11;
set Table end = _end;
if _N_ = 1 then do;
declare hash h();
h.defineKey('Key');
h.defineDone();
end;
if length(Column) = 11 then
rc = h.add(Key: Column);
if _end then
rc = h.output(dataset: 'HashTable');
run;
When I submit program I get errors:
7904 rc = h.add(Key: Column);
ERROR: Incorrect number of data entries specified at line 7904 column 14.
ERROR: DATA STEP Component Object failure. Aborted during the EXECUTION phase.
This issue was identified by Paul Dorfman on SAS-L in 2007 (see "Paul Dorfman on Hash error").
Simply put, when you pass arguments to the ADD() method it expects both 'key' and 'data', or no arguments at all (in which case the current values of both are used). The following also works, even though 'data' hasn't been defined explicitly.
data _null_ ;
length key $ 5;
set sashelp.shoes end = _end ;
if _N_ = 1 then do ;
declare hash h() ;
h.defineKey('key') ;
h.defineDone() ;
end ;
if length(Subsidiary)=5 then rc=h.add(key:subsidiary, data:Subsidiary) ;
if _end then h.output(dataset: 'HashTable') ;
run ;
Your code looks pretty good to me. The one mistake I notice is that your statements:
if length(Column) = 11 then;
rc = h.add(Key: Column);
if _end then;
rc = h.output(dataset: 'HashTable');
have an extra semicolon. There should be no semicolon after the then. As written, the h.add and h.output are executed unconditionally.
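To make the effect of the stray semicolon concrete, here is a small sketch that is not taken from the question (the variable x and the messages are made up for illustration). A semicolon right after THEN is a null statement, so the IF controls nothing and the next statement always runs.

```sas
data _null_;
  x = 0;
  /* the semicolon after THEN ends the IF with a null statement */
  if x = 1 then;
    put 'written even though x is not 1';
  /* without the semicolon, PUT is controlled by the IF */
  if x = 1 then
    put 'not written because x is not 1';
run;
```

Only the first message should appear in the log.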
Here is an example using sashelp.shoes along the lines of your example:
data _null_ ;
set sashelp.shoes end = _end ;
if _N_ = 1 then do ;
declare hash h() ;
h.defineKey('Subsidiary') ;
h.defineDone() ;
end ;
if length(Subsidiary)=5 then rc=h.add() ;
if _end then h.output(dataset: 'HashTable') ;
run ;
Which returns:
115 data _null_ ;
116 set HashTable ;
117 put (_all_)(=) ;
118 run ;
Subsidiary=Tokyo
Subsidiary=Cairo
Subsidiary=Paris
Subsidiary=Seoul
Subsidiary=Dubai
NOTE: There were 5 observations read from the data set WORK.HASHTABLE.
The mistake was a silly one, as I expected :)
The right code (I hope):
data _null_;
length Key $11;
set Table end = _end;
if _N_ = 1 then do;
declare hash h();
h.defineKey('Key');
h.defineDone();
end;
if length(Column) = 11 then do;
Key = Column;
rc = h.add();
end;
if _end then
rc = h.output(dataset: 'HashTable');
run;

doing a do loop within macro sas

I have the following code:
%macro initial (first=, second=, third=, fourth=, final=);
data &first;
set wtnodup.&first;
DATE1 = INPUT(PUT(Date,8.),YYMMDD8.);
format DATE1 monyy7.;
RUN;
proc freq data=&first order= freq;
tables date1*jobboardid / list out=&second (drop = percent rename=
(Count=CountNew));
run;
data &third;
set &second (firstobs=2);
if countnew le 49 then delete;
run;
proc sort data = &third;
by jobboardid Date1;
run;
data &fourth (keep = countnew oldcountnew Date1 rate from till jobboardid
rate);
set &third;
by jobboardid Date1;
format From Till monyy7.;
from = lag12(Date1);
oldcountnew = lag12(countnew);
if lag12(jobboardid) EQ jobboardid and
INTCK('month', from, Date1) EQ 12 then do;
till = Date1;
rate = ((countnew/oldcountnew)-1)*100;
output;
end;
run;
proc sort data = &fourth;
by Date1 rate;
proc means data=&fourth noprint;
by Date1;
output out=Result.&final median(rate)=medianRate;
run;
%mend initial;
%initial (first = Alabama, second = AlabamaOne, third =AlabamaTwo,
fourth = AlabamaThree, final=AL_10);
%initial (first = Alaska, second = AlaskaOne, third =AlaskaTwo,
fourth = AlaskaThree, final=AK_10);
%initial (first = Arizona, second = ArizonaOne, third =ArizonaTwo,
fourth = ArizonaThree, final=AZ);
%initial (first = Arkansas, second = ArkansasOne, third =ArkansasTwo,
fourth= ArkansasThree, final=AR_10);
What I am trying to change is the part that applies the condition:
if countnew < 10 then delete;
I want to create a sort of do-loop that deletes the data when countnew is < 10, 20, 30, ... up to 70, and creates a separate data set for each iteration (countnew < 10, < 20, and so on).
So I would end up with a final data set for each of the different countnew thresholds.
What is the best way to go about doing this?
Why not loop in steps of ten and add the threshold as a suffix to the dataset name, like this?
** Sample dataset;
data try;
do i=1 to 1000;
value=1+ranuni(12345)*100;
output;
end;
drop i;
run;
** Macro iterator;
%macro iter(ds=);
%do i=10 %to 70 %by 10;
data &ds._&i;
set &ds;
if value le &i then delete;
run;
%end;
%mend;
%iter (ds=try)
You will end up with 7 datasets named try_10 through try_70, where try is replaced by whatever dataset name you pass in ds=.
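If the input is large, rereading it once per threshold may be slow. A possible variation, sketched here under the same assumptions as the sample code (a numeric variable named value and thresholds 10 to 70), is to generate a single data step that lists all seven outputs and routes each row in one pass. The macro name iter_onepass is made up for this sketch.

```sas
%macro iter_onepass(ds=);
  %local i;
  /* one DATA statement naming every output dataset */
  data
    %do i = 10 %to 70 %by 10;
      &ds._&i
    %end;
  ;
    set &ds;
    /* keep a row in each output whose threshold it exceeds */
    %do i = 10 %to 70 %by 10;
      if value gt &i then output &ds._&i;
    %end;
  run;
%mend iter_onepass;

%iter_onepass(ds=try)
```

Each output holds the same rows as the version with one data step per threshold, since keeping rows with value gt &i is the complement of deleting rows with value le &i.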

How to compare date values in a macro?

Here is the macro I'm running....
%macro ControlLoop(ds);
%global dset nvars nobs;
%let dset=&ds;
/* Open data set passed as the macro parameter */
%let dsid = %sysfunc(open(&dset));
/* If the data set exists, then check the number of obs ,,,then close the data set */
%if &dsid %then %do;
%If %sysfunc(attrn(&dsid,nobs))>0 %THEN %DO;;
%local dsid cols rctotal ;
%let dsid = %sysfunc(open(&DS));
%let cols=%sysfunc(attrn(&dsid, nvars));
%do %while (%sysfunc(fetch(&dsid)) = 0); /* outer loop across rows*/
/*0:Success,>0:NoSuccess,<0:RowLocked,-1:eof reach*/
%If fmt_start_dt<=&sysdate9 and fmt_end_dt>=sysdate9 %then %Do;
%do i = 1 %to &cols;
%local v t; /*To get var names and types using
varname and vartype functions in next step*/
%let v=%sysfunc(varname(&dsid,&i)); /*gets var names*/
%let t = %sysfunc(vartype(&dsid, &i)); /*gets variable type*/
%let &v = %sysfunc(getvar&t(&dsid, &i));/*To get Var values Using
GetvarC or GetvarN functions based on var data type*/
%end;
%CreateFormat(dsn=&dsn, Label=&Label, Start=&Start, fmtName=&fmtName, type=&type);
%END;
%Else %put ###*****Format Expired*****;
%END;
%END;
%else %put ###*****Data set &dset has 0 rows in it.*****;
%let rc = %sysfunc(close(&dsid));
%end;
%else %put ###*****open for data set &dset failed - %sysfunc(sysmsg()).*****;
%mend ControlLoop;
%ControlLoop(format_control);
Format_Control data:
DSN :$12. Label :$15. Start :$15. fmtName :$8. type :$3. fmt_Start_dt :mmddyy. fmt_End_dt :mmddyy.;
ssin.prd prd_nm prd_id mealnm 'n' 01/01/2013 12/31/9999
ssin.prd prd_id prd_nm mealid 'c' 01/01/2013 12/31/9999
ssin.fac fac_nm onesrc_fac_id fac1SRnm 'n' 01/01/2013 12/31/9999
ssin.fac fac_nm D3_fac_id facD3nm 'n' 01/01/2013 12/31/9999
ssin.fac onesrc_fac_id D3_fac_id facD31SR 'n' 01/01/2013 02/01/2012
oper.wrkgrp wrkgrp_nm wrkgrp_id grpnm 'n' 01/01/2013 12/31/9999
How can I compare fmt_Start_dt and fmt_end_dt with sysdate?
I tried something like %If fmt_start_dt<=&sysdate9 and fmt_end_dt>=sysdate9 %then %Do; in the code, but the values are not being picked up in the loop. Any ideas?
Thanks in advance.
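I'm not entirely sure what you want, but I think this might work:
%if &fmt_start_dt <= %sysfunc(today()) and &fmt_end_dt >= %sysfunc(today()) %then %do;
%if can only see macro variables, so the dataset values have to be copied into macro variables and referenced with an ampersand. In your macro the %let &v = ... loop does that copying, but it sits inside the %if block, so it needs to run before the date test (alternatively, %syscall set(dsid) before the FETCH loop links the dataset variables to macro variables of the same name). Also, you should use the TODAY() function rather than the SYSDATE9 macro variable: TODAY() returns a numeric SAS date that compares directly with the numeric date values, whereas &sysdate9 is text such as 01JAN2013.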
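Here is a minimal sketch of the date test on its own, assuming the control dataset has the variables fmtName, fmt_start_dt and fmt_end_dt shown in the question and that the two dates are stored as numeric SAS dates. The macro name check_dates and the local macro variable names are made up for illustration. It reads each value with GETVARN or GETVARC before the %if, so the comparison sees numbers.

```sas
%macro check_dates(ds);
  %local dsid rc today name start end;
  %let today = %sysfunc(today());            /* numeric SAS date */
  %let dsid  = %sysfunc(open(&ds));
  %if &dsid %then %do;
    %do %while (%sysfunc(fetch(&dsid)) = 0); /* 0 means a row was read */
      /* copy the current row's values into macro variables first */
      %let name  = %sysfunc(getvarc(&dsid, %sysfunc(varnum(&dsid, fmtName))));
      %let start = %sysfunc(getvarn(&dsid, %sysfunc(varnum(&dsid, fmt_start_dt))));
      %let end   = %sysfunc(getvarn(&dsid, %sysfunc(varnum(&dsid, fmt_end_dt))));
      %if &start <= &today and &today <= &end %then
        %put NOTE: Format &name is active.;
      %else
        %put NOTE: Format &name has expired.;
    %end;
    %let rc = %sysfunc(close(&dsid));
  %end;
  %else %put ERROR: Could not open &ds.. %sysfunc(sysmsg());
%mend check_dates;

%check_dates(format_control)
```

In the real macro, the %put for the active case would be replaced by the call to %CreateFormat.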