Is there a way to pass an input to a NetLogo procedure so that the value of the input can be modified from within the procedure? For example:
to test
let value 200
test2 value
print value
end
to test2 [v]
set v v + 1
end
If you run this, it will output 200. I would like to change it (without using global variables or reporter procedures) so that the output is 201.
You can use a mutable object such as an array or a table.
extensions [table array]
to test
let _a array:from-list n-values 10 [0]
increment-aval _a 0
print _a
let _t table:make
let _key "key1"
table:put _t _key 0
increment-tval _t _key
print _t
end
to increment-aval [#a #i]
let _old array:item #a #i
array:set #a #i (1 + _old)
end
to increment-tval [#t #key]
let _old table:get #t #key
table:put #t #key (1 + _old)
end
Of course it is better not to use mutability if you do not need to.
I'm testing out how to use hash objects in SAS 9.4 M6 to do fuzzy joins since PROC SQL just runs for hours on my larger dataset. I created some sample datasets (below) and what I want is for the merge to pull in exact matches on the "name" fields AND any matches that have a COMPLEV score less than 10. Right now, this code still only pulls in the exact matches.
I'm very new to hash objects so I'm sure it's a simple fix, but I've tried and am in need of help.
data A;
infile datalines missover;
length nameA $50;
input nameA $ ;
datalines;
MICKEYMOUSE2000-01-02
DAFDUCK1990-09-23
GOOFYMAN1993-05-11
;
run;
*second dataset with one exact match and two that differ slightly from those in dataset A;
data B;
infile datalines missover;
length nameB $50;
input nameB $ VDAY :ddmmyy10.;
format VDAY ddmmyy10.;
datalines;
MICKEYMOUSE2000-01-01 07/08/2021
DAFFYDUCK1990-09-23 05/11/2021
GOOFYMAN1993-05-11 08/11/2021
;
run;
*only pulling in exact matches, want it to pull in other fuzzy matches;
data simplemerge ;
if 0 then set work.B ; *load var properties into hash table;
if _n_ = 1 then do;
dcl hash B (dataset: 'work.B'); *declare the name B for hash using B dataset;
B.definekey('nameB');*identify var in B to use as key;
B.definedata('nameB','vday');*identify columns of data to bring in from B dataset;
B.definedone();*complete hash table definition;
end;
set work.A; *bring in A data;
if B.Find(KEY: nameA) ne 0 then do;
if complev(nameA, nameB) < 10 then do;
B.ref(key : nameB,data : nameB, data : vday);
end;
end;
RUN;
A fuzzy match in a hash is not necessarily faster than SQL - in fact, there's a good chance it performs identically, since SQL joins are often done with a hash table behind the scenes.
That said, here's how you'd do the naive hash fuzzy lookup - with a hash iterator (hiter).
data simplemerge ;
if 0 then set work.B ; *load var properties into hash table;
if _n_ = 1 then do;
dcl hash B (dataset: 'work.B'); *declare the name B for hash using B dataset;
B.definekey('nameB');*identify var in B to use as key;
B.definedata('nameB','vday');*identify columns of data to bring in from B dataset;
B.definedone();*complete hash table definition;
dcl hiter hi_b('B');
end;
set work.A; *bring in A data;
if B.Find(KEY: nameA) ne 0 then do;
rc = hi_b.first();
do while (rc eq 0 );
if complev(nameA, nameB) lt 10 then do;
put "Found one!" namea= nameb=;
leave;
end;
else call missing(of nameB vday);
rc = hi_b.next();
end;
end;
else put "Found one!" namea=;
RUN;
This will be ... not fast ... if work.B has a lot of rows. It goes over every row of B once for every row of A that doesn't have an exact match.
One thing you can do to make this more efficient is to avoid searching all of B. Instead, find some smaller subset of B with an exact match on another key, and then iterate over that smaller subset; instead of using hiter, just use find_next. This may not work for your exact requirements, but if it's feasible, this would be ideal.
Here's one example of doing that. It's not particularly efficient since sex has only two values (so I'm looking through half of the rows anyway), but it does work.
data have;
set sashelp.class;
if mod(_n_,3) eq 0 then do;
name = cats(name,'Z');
end;
if mod(_n_,5) eq 0 then do;
name = cats('Row_',_n_);
end;
run;
data want;
if 0 then set sashelp.classfit;
length name_fuzz $8;
*first define two hash tables - one for exact match, one for fuzzy. Only do this if exact matches are reasonably common;
if _n_ eq 1 then do;
declare hash h_exact(dataset:'sashelp.classfit');
h_exact.defineKey('name');
h_exact.defineData(all:'Y');
h_exact.defineDone();
declare hash h_fuzzy(dataset:'sashelp.classfit(rename=(name=name_fuzz) keep=name sex predict lower upper lowermean uppermean)',multidata:'y');
h_fuzzy.defineKey('sex');
h_fuzzy.defineData(all:'Y');
h_fuzzy.defineDone();
call missing(name_fuzz);
end;
set have;
*now check exact - if it matches then do not try further;
rc_exact = h_exact.find();
if rc_exact eq 0 then do;
output;
return;
end;
*now try fuzzy - first look up the first match by the chunk criteria;
rc_fuzzy = h_fuzzy.find();
*now iterate over all of the matches of the chunk, and if you find a close-enough match then output that row and stop trying;
do while (rc_fuzzy eq 0);
if complev(name,name_fuzz) lt 2 then do;
output;
return;
end;
rc_fuzzy = h_fuzzy.find_next();
end;
*if you are still here, you failed to find a fuzzy match - so clear the values from the variables you are merging on and output a blank row, assuming you are doing a "left join" [if it is inner join, then just skip these next two lines];
call missing(of predict lower upper lowermean uppermean);
output;
run;
A better version of this would have a more discriminating key for the fuzzy match - the more discriminating the better. The key might not be related to your fuzzy match at all - for example, maybe your fuzzy match is looking for names, but you also have their year of birth. Match exactly on year of birth, then iterate over that subset with complev(namea,nameb), since year of birth is quite discriminating.
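As a rough sketch of that idea (not a tuned implementation): the datasets people_a and people_b, the variables name, name_b and birth_year, and the complev cutoff of 3 are all made up for illustration. The hash is keyed on birth_year with multidata, so find/find_next only walks the rows of B that share the year.

```sas
/* Hypothetical sample data: names and birth years are invented */
data people_a;
  length name $20;
  input name $ birth_year;
  datalines;
MICKEYMOUSE 2000
DAFDUCK 1990
GOOFYMAN 1993
;
run;

data people_b;
  length name_b $20;
  input name_b $ birth_year vday :ddmmyy10.;
  format vday ddmmyy10.;
  datalines;
MICKEYMOUSE 2000 07/08/2021
DAFFYDUCK 1990 05/11/2021
GOOFYMAN 1993 08/11/2021
;
run;

data fuzzy_by_year;
  if 0 then set people_b;              /* put name_b, birth_year, vday in the PDV */
  if _n_ = 1 then do;
    declare hash h_b(dataset: 'people_b', multidata: 'y');
    h_b.defineKey('birth_year');       /* exact-match "chunk" key */
    h_b.defineData('name_b', 'vday');
    h_b.defineDone();
  end;
  set people_a;
  rc = h_b.find();                     /* first row of B with this birth_year */
  do while (rc = 0);
    if complev(name, name_b) < 3 then do;   /* cutoff of 3 is an assumption */
      output;
      return;
    end;
    rc = h_b.find_next();              /* next row of B with the same birth_year */
  end;
  call missing(name_b, vday);          /* no close match: keep the A row, left-join style */
  output;
  drop rc;
run;
```

Here DAFDUCK and DAFFYDUCK share a birth year and differ by two inserted letters, so they fall under the cutoff; rows of B from other years are never compared at all.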
Let's say I have a bunch of variables named the same way and I'd like to recode them and add a prefix to each (the variables are all numeric).
In Stata I would do something like (let's say the variables start with eq)
foreach var of varlist eq* {
recode `var' (1/4=1) (else=0), pre(r_)
}
How can I do this in SAS? I'd like to use the %DO macros, but I'm not familiar with them (I want to avoid SQL). I'd appreciate if you could include comments explaining each step!
SAS syntax for this would be easier if your variables were named with a numeric suffix. That is, if you had ten variables with names of eq1, eq2, ..., eq10, then you could just use variable lists to define both sets of variables.
There are a number of ways to translate your recode logic. If we assume you have clean variables then we can just use a boolean expression to generate a 0/1 result. So if 4 and 5 map to 1 and the rest map to 0 you could use x in (4,5) or x > 3 as the boolean expression.
data want;
set have;
array old eq1-eq10 ;
array new r_eq1-r_eq10 ;
do i=1 to dim(old);
new(i) = old(i) in (4,5);
end;
run;
If you have missing values or other complications, you might want to use IF/THEN logic or a SELECT statement, or you could define a format to convert the values.
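For instance, a minimal IF/THEN sketch, assuming the same eq1-eq10 variables and the rule from the question (1 through 4 become 1, everything else 0). Leaving missing values as missing is my own choice here, not something the question asked for; drop the first branch if missing should become 0 as well.

```sas
data want;
  set have;
  array old eq1-eq10;
  array new r_eq1-r_eq10;
  do i = 1 to dim(old);
    if missing(old(i)) then new(i) = .;        /* assumption: keep missing as missing */
    else if 1 <= old(i) <= 4 then new(i) = 1;  /* the (1/4=1) rule */
    else new(i) = 0;                           /* the (else=0) rule */
  end;
  drop i;
run;
```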
If your list of names is less regular then you might need to use some code generation, such as macro code, to generate the new variable names.
Here is one method that uses the eq: variable list syntax in SAS, which is similar to the variable selection syntax in your Stata code. Use PROC TRANSPOSE on an empty (obs=0) version of your source dataset to get a dataset with the variable names that match your name pattern.
proc transpose data=have(obs=0) out=names;
var eq: ;
run;
Then generate two macro variables with the list of old and new names.
proc sql noprint ;
select _name_
, cats('r_',_name_)
into :old_list separated by ' '
, :new_list separated by ' '
from names
;
quit;
You can then use the two macro variables in your ARRAY statements.
array old &old_list ;
array new &new_list ;
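Put together, the whole data step might look like the sketch below. It assumes the PROC TRANSPOSE and PROC SQL steps above have already run, so that &old_list and &new_list exist, and it uses the rule from the question (1 through 4 become 1, everything else 0).

```sas
data want;
  set have;
  array old &old_list;   /* e.g. the eq: variables, in dataset order */
  array new &new_list;   /* the same names with r_ in front */
  do i = 1 to dim(old);
    new(i) = (1 <= old(i) <= 4);   /* boolean result: 1 if in 1-4, else 0 */
  end;
  drop i;
run;
```

Because both lists come from the same query, the i-th element of new always lines up with the i-th element of old.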
You can do this with a macro that looks up the matching variable names in dictionary.columns and loops over them. Note the following only recodes the col variables, and not the other one:
data have;
col1=1;
col2=2;
col3=3;
col5=5;
other=99;
col12=12;
run;
%macro recoder(dsn = , varname = , prefix = );
/*select all variables that include the string "varname"*/
/*(you can change this if you want to be more specific on the conditions that need to be met to be renamed)*/
proc sql noprint;
select distinct name into: varnames
separated by " "
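/*assumes &dsn. is in the WORK library; without the libname filter, same-named tables in other libraries would also match*/
from dictionary.columns where libname = "WORK" and memname = upcase("&dsn.") and index(name, "&varname.") > 0;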
quit;
data want;
set &dsn.;
/*loop through that list of variables to recode*/
%do i = 1 %to %sysfunc(countw(&varnames.));
%let this_varname = %scan(&varnames., &i.);
/*create a new variable with desired prefix based on value of old variable*/
if &this_varname. in (1 2 3) then &prefix.&this_varname. = 0;
else if &this_varname. in (4 5) then &prefix.&this_varname. = 1;
%end;
run;
%mend recoder;
%recoder(dsn = have, varname = col, prefix = r_);
PROC TRANSPOSE will give you good flexibility with regard to the way your variables are named.
proc transpose data=have(obs=0) out=vars;
var col1-numeric-col12;
copy col1;
run;
proc transpose data=vars out=revars(drop=_:) prefix=RE_;
id _name_;
run;
data recode;
set have;
if 0 then set revars;
array c[*] col1-numeric-col12;
array r[*] re_:;
call missing(of r[*]);
do _n_ = 1 to dim(c);
if c[_n_] in(1 2 3) then r[_n_] = 0;
else if c[_n_] in(4 5) then r[_n_] = 1;
else r[_n_] = c[_n_];
end;
run;
proc print;
run;
It would be nearly trivial to write a macro to parse almost that exact syntax.
I wouldn't necessarily use this - I like both the transpose and the array methods better, both are more 'SASsy' (think 'pythonic' but for SAS) - but this is more or less exactly what you're doing above.
First set up a dataset:
data class;
set sashelp.class;
age_ly = age-1;
age_ny = age+1;
run;
Then the macro:
%macro do_count(data=, out=, prefix=, condition=, recode=, else=, var_start=);
%local dsid varcount varname rc; *declare local for safety;
%let dsid = %sysfunc(open(&data.,i)); *open the dataset;
%let varcount = %sysfunc(attrn(&dsid,nvars)); *get the count of variables to access;
data &out.; *now start the main data step;
set &data.; *set the original data set;
%do i = 1 %to &varcount; *iterate over the variables;
%let varname= %sysfunc(varname(&dsid.,&i.)); *determine the variable name;
%if %upcase(%substr(&varname.,1,%length(&var_start.))) = %upcase(&var_start.) %then %do; *if it matches your pattern then recode it;
&prefix.&varname. = ifn(&varname. &condition., &recode., &else.); *this uses IFN - only recodes numerics. More complicated code would work if this could be character.;
%end;
%end;
%let rc = %sysfunc(close(&dsid)); *clean up after yourself;
run;
%mend do_count;
%do_count(data=class, out=class_r, var_start=age, condition= > 14, recode=1, else=0, prefix=p_);
The expression (1/4=1) means values {1,2,3,4} should be recoded into 1.
Perhaps you do not need to make new variables at all? If you have variables with values 1,2,3,4,5 and you want to treat them as if they have only two groups, you could do it with a format.
First define your grouping using a format.
proc format ;
value newgrp 1-4='Group 1' 5='Group 2' ;
run;
Then you can just use a FORMAT statement in your analysis step to have SAS treat your five-level variable as if it had only two levels.
proc freq ;
tables eq: ;
format eq: NEWGRP. ;
run;
Why does my hash exceed the memory limits when I use the replace() method, when the same code without the replace method fits just fine? It seems like the hash would remain the same size either way. I am running the code on unix. In the code below, if I comment out ht.replace() the code runs fine. If I leave it in (don't have it commented out) then I receive a message saying "Hash object added 2490352 items when memory failure occurred." The "series" data set which is fed into the hash has 13 variables and 6912 rows. The "data1" dataset has 26970 rows and 4 columns. Is there any way to resolve this without messing with the memsize?
data _null_;
if 0 then set series;
if _n_ = 1 then do;
declare hash ht(dataset:"series", ordered:"a", multidata:"yes");
rc = ht.defineKey("one", "two", "three");
rc = ht.defineData(all:"yes");
declare hiter hi("ht");
rc = ht.defineDone();
end;
set data1 end=eof;
rc = hi.first();
do while (rc = 0);
if low <= code1 <= high then do;
sum = sum + value1;
ht.replace();
end;
rc = hi.next();
end;
if eof then ht.output(dataset:"sum1");
run;
The problem is probably that your hash is a multidata one, i.e. one key can correspond to many data items. For multidata hashes you have to use the REPLACEDUP method, which unambiguously selects not only a specific key but also a specific data item within that key.
So your iteration over hash ht should look like this:
rc = hi.first();
do while (rc = 0);
  rc = ht.find_next();
  do while (rc = 0);
    if low <= code1 <= high then do;
      sum = sum + value1;
      ht.replacedup();
    end;
    rc = ht.find_next();
  end;
  rc = hi.next();
end;
I'm using SAS 9.2, and I got the following piece of code:
data success error;
length vague 3 path $150;
set foplist;
call symputx('error_count', rownum);
%if &&error&error_count = 0 %then %do;
path= "&&path&error_count";
vague=1;
output success;
%end;
%else %do;
...
%end;
run;
For each record I'd like to get the rownum, and combine it with another macro variable.
The rownum displays the rownumber of a record in the foplist dataset. For some reason I always get the last number in the dataset (probably because of macro compilation?)
For example:
A --- rownum=1
B --- rownum=2
I only get rownum=2
Any idea how to fix it?
Thanks!
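You can't create a macro variable with call symputx and resolve it with an & reference within the same datastep. Macro references and %if statements are resolved once, before the datastep starts reading rows, so they cannot see a value that is only set while the step runs.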
Have you already defined the macro variables ERROR1-ERRORx and PATH1-PATHn and wish to retrieve those values into the datastep based on rownum? i.e. to resolve &&ERROR&ERROR_COUNT.
If so, just use symexist / symget...
data success error ;
length vague 3 path $150 ;
set foplist ;
if symexist(cats('ERROR',rownum)) and symexist(cats('PATH',rownum)) then do ;
error_count = input(symget(cats('ERROR',rownum)), best32.) ;
if error_count = 0 then do ;
path = symget(cats('PATH',rownum)) ;
vague = 1 ;
output success ;
end ;
else output error ;
end ;
else output error ;
run;
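To try that step out you need the macro variables and the input dataset to exist first. A hypothetical setup (the paths and the two-row foplist are invented just for the test) could be:

```sas
/* Hypothetical values: one row without errors, one with */
%let ERROR1 = 0;
%let PATH1  = /tmp/a;
%let ERROR2 = 1;
%let PATH2  = /tmp/b;

/* Made-up stand-in for the real foplist dataset */
data foplist;
  input name $ rownum;
  datalines;
A 1
B 2
;
run;
```

With this setup, row A should be routed to success (ERROR1 is 0) and row B to error, because symget looks up ERROR1, PATH1, ERROR2 and so on separately for each row while the step is running.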
I have a table called term_table containing the below columns
comp, term_type, term, score, rank
I go through every observation and, at each obs, I want to store the value of the variable rank in a macro variable called curr_r. The code I created below does not work.
Data Work.term_table;
input Comp $
Term_type $
Term $
Score
Rank
;
datalines;
comp1 term_type1 A 1 1
comp2 term_type2 A 2 10
comp3 term_type3 A 3 20
comp4 term_type4 B 4 20
comp5 term_type5 B 5 40
comp6 term_type6 B 6 100
;
Run;
%local j;
DATA tmp;
SET term_table;
LENGTH freq 8;
BY &by_var term_type term;
RETAIN freq;
CALL SYMPUT('curr_r', rank);
IF first.term THEN DO;
%do j = 1 %to &curr_r;
do some thing
%end;
END;
RUN;
Could you help me to solve the problem
Thanks a lot
Hung
The call symput statement does create the macro var &curr_r with the value of rank, but a reference to &curr_r is resolved before the data step starts running, so the value is not available until after the data step.
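A small sketch of the timing, using a made-up macro variable name (demo_rank) so it does not clash with anything in your session: symget reads the macro symbol table while the step is executing, so it can see the new value, whereas an & reference to the same variable is only usable once the step has finished.

```sas
data _null_;
  call symputx('demo_rank', 42);   /* stored while the step is running */
  now = symget('demo_rank');       /* run-time lookup, so the value is visible here */
  put now=;
run;

/* After the step boundary an & reference resolves as usual */
%put After the step: &demo_rank;
```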
However, I don't think you need to create the macro var &curr_r. I don't think a macro is needed at all.
I think the below should work: (Untested)
DATA tmp;
SET term_table;
LENGTH freq 8;
BY &by_var term_type term;
RETAIN freq;
IF first.term THEN DO;
do j = 1 to rank;
<do some thing>
end;
END;
RUN;
If you needed to use the rank from a prior obs, use the LAG function.
Start=Lag(rank);
To store each value of RANK in a macro variable, the below will do that:
Proc Sql noprint;
select count(rank)
into :cnt
from term_table;
%Let cnt=&cnt;
select rank
into :curr_r1 - :curr_r&cnt
from term_table;
quit;
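If you are on SAS 9.3 or later (an assumption, since the version was not stated), I believe the count query can be skipped: an open-ended range in the INTO clause creates as many macro variables as there are rows, and the automatic macro variable SQLOBS holds the row count afterwards. A sketch:

```sas
proc sql noprint;
  select rank
    into :curr_r1 -      /* open-ended range: curr_r1, curr_r2, ... */
    from term_table;
quit;

%let cnt = &sqlobs;      /* number of rows returned by the last select */
%put Stored &cnt values: first=&curr_r1 last=&&curr_r&cnt;
```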