I have a question about SAS macros. I do my analytics in R and Python, not SAS, so I am struggling with the SAS syntax needed to solve the following problem.
Write a macro that accepts a table name, a column name, a list of integers, a main axis label and an x axis label. This function should scan over each element in the list of integers and produce a histogram for each integer value, setting the bin count to the element in the input list, and labeling main and x-axis with the specified parameters. You should label the y-axis to read Frequency, bins = and the number of bins.
I also need to test the macro on a data set using bin counts of 12, 36, and 60, so that I can call the macro with something like
%plot_histograms(data, y, 12 36 60, main="Title", xlabel="x_label");
to plot three different histograms of the data set.
Hint: Assume 12 36 60 resolves to a single macro parameter and use %scan. The macro definition can look something like
%macro plot_histograms(table_name, column_name, number_of_bins, main="Main", xlabel="X Label")
Thanks in Advance.
I don't fully understand your question, and this is not a free code platform anyway, but this should point you in the right direction:
%macro plot_histograms(table_name, column_name, number_of_bins, main="Main", xlabel="X Label");
%do i=1 %to %sysfunc(countw(&number_of_bins.)); /* loop across elements in your input list */
proc gchart data=&table_name.; /*make a chart for the provided table */
...
/* whatever it is you actually need to do, fetch the current element of the input list like this */
%scan(&number_of_bins.,&i.)
...
run;
%end;
%mend;
First, you really should try this on your own and let us know where you get stuck.
That said, let's break down how to solve this problem.
Make some test data;
data test;
do i=1 to 10000;
r = rannor(1);
output;
end;
run;
How do I create a histogram with this? Use PROC SGPLOT
proc sgplot data=test;
histogram r / nbins=10;
xaxis label="X LABEL";
yaxis label="Y LABEL";
run;
Produces this:
So, if I make a macro to create this generally:
%macro histogram(data,column,bin,xlabel,ylabel);
proc sgplot data=&data;
histogram &column / nbins=&bin;
xaxis label="&xlabel";
yaxis label="&ylabel";
run;
%mend;
Now %histogram(test,r,10,X LABEL,Y LABEL) produces the same image.
Let's write something that loops over the values of bins and calls this macro:
%macro make_histograms(data,column,bins,xlabel,ylabel);
%local i n bin;
%let n=%sysfunc(countw(&bins)); /*Number of words in &bins*/
%do i=1 %to &n;
%let bin=%scan(&bins,&i); /*Get the i-th bin count*/
%histogram(&data,&column,&bin,&xlabel,&ylabel);
%end;
%mend;
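Putting the pieces together for the original question, a minimal sketch might look like the one below. It assumes the test data set created above, and that the title and x-axis label are passed without quotation marks (the macro adds them), which differs slightly from the call shown in the question. The scale=count option is there so that the y-axis shows frequencies rather than percentages.

```sas
%macro plot_histograms(table_name, column_name, number_of_bins,
                       main=Main, xlabel=X Label);
  %local i n bin;
  %let n = %sysfunc(countw(&number_of_bins)); /* how many bin counts were passed */
  %do i = 1 %to &n;
    %let bin = %scan(&number_of_bins, &i);    /* i-th bin count in the list */
    title "&main";
    proc sgplot data=&table_name;
      histogram &column_name / nbins=&bin scale=count;
      xaxis label="&xlabel";
      yaxis label="Frequency, bins = &bin";
    run;
  %end;
  title; /* clear the title so it does not leak into later output */
%mend plot_histograms;

/* One histogram each for 12, 36 and 60 bins */
%plot_histograms(test, r, 12 36 60, main=Title, xlabel=x_label);
```

Because the list 12 36 60 contains no commas, it arrives as a single positional parameter, which is what lets %scan walk through it.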
I have two plots that I would like to merge into one. Each plot shows the proportion of present / not-present observations by their corresponding cumulative test results for the year.
On the combined plot I would like to see side-by-side bars for each group of test scores, one counting the present observations and one counting the not-present observations.
To represent this problem, this is what I have currently:
data test_scores;
do i = 1 to 200;
score = ranuni(200);
output;
end;
drop i;
run;
data test_scores_2;
set test_scores;
if _n_ le 100 then flag = 0;
else flag = 1;
run;
data test_scores_2_0 test_scores_2_1;
set test_scores_2;
if flag = 0 then output test_scores_2_0;
else if flag = 1 then output test_scores_2_1;
run;
PROC GCHART DATA=test_scores_2_0;
VBAR score /
CLIPREF FRAME LEVELS=20 TYPE=PCT
COUTLINE=BLACK RAXIS=AXIS1 MAXIS=AXIS2;
RUN;
QUIT;
PROC GCHART DATA=test_scores_2_1;
VBAR score /
CLIPREF FRAME LEVELS=20 TYPE=PCT
COUTLINE=BLACK RAXIS=AXIS1 MAXIS=AXIS2;
RUN;
QUIT;
The bars should sum to 100% for present.
The bars should also sum to 100% for not-present.
TIA
proc sgplot to the rescue. Use the group= option to specify two separate groups. Set the transparency to 50% so one histogram does not cover the other.
proc sgplot data=test_scores_2;
histogram score / group=flag transparency=0.5 binwidth=.05;
run;
With Proc GCHART you can use VBAR options GROUP= and G100 to get bars that represent percent within group. This is useful when the groups have different counts.
The SUBGROUP= option splits the vertical bar according to the different values of the subgroup variable, and produces automatic coloration and legend corresponding to the subgroups.
When the SUBGROUP variable (or values) correspond 1:1 to the group the result is a chart with a different color for each group and a legend corresponding to the group.
For example, modify your data so that one group has 50 observations and the other has 150:
data test_scores;
do _n_ = 1 to 200;
score = ranuni(200);
flag = _n_ > 50;
output;
end;
run;
axis1 label=("score");
axis2 ;
axis3 label=none value=none;
PROC GCHART data=test_scores;
VBAR score
/ levels=10
GROUP=flag G100
SUBGROUP=flag
SPACE=0 TYPE=PERCENT freq gaxis=axis3 maxis=axis1 ;
run;
Similar chart showing the effect of a subgroup variable with values different than group values.
data test_scores;
do _n_ = 1 to 200;
subgroup = ceil(5 * ranuni(123)); * random 1 to 5;
score = ranuni(200);
flag = _n_ > 50;
output;
end;
run;
axis1 label=("score");
axis2 ;
axis3 label=none value=none;
PROC GCHART data=test_scores;
VBAR score
/ levels=10
GROUP=flag G100
SUBGROUP=subgroup /* has integer values in [1,5] */
SPACE=0 TYPE=PERCENT freq gaxis=axis3 maxis=axis1;
run;
I need to use loops in a SAS macro that writes a data step.
I have code that should work, but it doesn't. How can I fix it?
%macro ci;
data
%do i=1 %to 3;
_z%sysfunc(putn(%eval(&i),z2.)) ;
%end;
;
set _06;
%do i=1 %to 3;
if num="%sysfunc(putn(%eval(&i),z2.))" then output _z%sysfunc(putn(%eval(&i),z2.));
%end;
run;
%mend;
%ci;
I'd like to get the following output:
data
_z01
_z02
_z03;
set _06 ;
if num="01" then output _z01;
if num="02" then output _z02;
if num="03" then output _z03;
run;
You are very close. You simply had an extra ; in your first loop.
You need to change:
data
%do i=1 %to 3;
_z%sysfunc(putn(%eval(&i),z2.)) ;
%end;
;
to:
data
%do i=1 %to 3;
_z%sysfunc(putn(%eval(&i),z2.))
%end;
;
Adding options mprint; to the beginning of your code shows in the log the code that was generated by your macro, which helps you debug it.
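As a small illustration of that debugging tip, and assuming the %ci macro has already been compiled, MPRINT can be switched on just around the call. The generated DATA step then shows up in the log on lines prefixed with MPRINT(CI):.

```sas
options mprint;   /* write the SAS code generated by macros to the log */
%ci;
options nomprint; /* switch it off again once debugging is done */
```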
How about a simpler approach that avoids converting the number to a character string? Note that hard-coding the leading zero only works while the index stays below 10.
data _06;
num='01';
output;
num='02';
output;
num='03';
output;
run;
%macro ci;
data
%do i=1 %to 3;
_z0&i
%end;
;
set _06;
%do i=1 %to 3;
if num="0&i" then output _z0&i;
%end;
run;
%mend;
%ci;
I have written a macro which takes multiple datasets and the variables common to those datasets, and generates a frequency table for each using proc freq, as follows:
%macro f(input= , vars= );
%let n_d=%sysfunc(countw(&input));
%do i = 1 %to &n_d;
%let dataset = %scan(&input, &i);
%let n=%sysfunc(countw(&vars));
%do j = 1 %to &n;
%let values = %scan(&vars, &j);
title "Frequency of &dataset and &values";
proc freq data = &dataset;
tables &values/nocum;
run;
%end;
%end;
%mend;
I work with UNIX SAS and my version of SAS doesn't have access to HTML output because of some network issues.
I want to create PDF output for each of the above frequency tables and store it either in a single PDF or in multiple PDFs (I am not too particular about that). Please help!!
You can sandwich the code between ODS PDF file='' and ods pdf close. Where you place these statements determines whether you get a single file or multiple files.
For example, to generate a single file, put them outside the outermost loop:
%macro f(input= , vars= );
ods pdf file="myoutput.pdf" style=meadow;
%let n_d=%sysfunc(countw(&input));
%do i = 1 %to &n_d;
%let dataset = %scan(&input, &i);
%let n=%sysfunc(countw(&vars));
%do j = 1 %to &n;
%let values = %scan(&vars, &j);
title "Frequency of &dataset and &values";
proc freq data = &dataset;
tables &values/nocum;
run;
%end;
%end;
ods pdf close;
%mend;
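For the multiple-file case, a sketch along the same lines would open and close the PDF destination inside the inner loop. The macro name f_multi and the file naming pattern are only illustrative, and the code assumes one-level data set names such as the ones %scan returns here.

```sas
%macro f_multi(input= , vars= );
  %local i j n_d n dataset values;
  %let n_d = %sysfunc(countw(&input));
  %let n   = %sysfunc(countw(&vars));
  %do i = 1 %to &n_d;
    %let dataset = %scan(&input, &i);
    %do j = 1 %to &n;
      %let values = %scan(&vars, &j);
      /* one file per data set / variable pair; the double dot ends the
         macro variable name and leaves a literal dot before "pdf" */
      ods pdf file="freq_&dataset._&values..pdf";
      title "Frequency of &dataset and &values";
      proc freq data = &dataset;
        tables &values / nocum;
      run;
      ods pdf close;
    %end;
  %end;
%mend f_multi;
```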
I am generating random sequences of numbers using the same seed:
sprev = rng(2,'v5uniform');
for i=1:N
%do some operations
rndIDX = randperm(sampleSize) ;
newdata= data(rndIDX(1:newSampleSize), :) ;
if x>y
remove=x1; %line 7
end
for l=1:M
%do something else
if l>xy & ~empty(remove)
%do something related to remove
elseif l>xy
%do nothing
else
%do something not related to remove
end
%more code here
end
end
However, when I comment out line 7, rndIDX returns a different sequence of numbers, which is unexpected to me. There might be a bug somewhere in the code, but I am not sure what the relationship is between the sequence produced by randperm and the code that follows. Also, if I keep the code as presented, I always get the same newdata, which is the expected behaviour. I just want to comment out line 7 and still get the same newdata. I can confirm that sampleSize and newSampleSize are always the same in both cases.
I want to mark each value that comes out of my loop with a value.
Say I have a variable number of values that come out of each iteration. I want those values to be labeled by which iteration they came out of.
like
1-1,
2-1,
3-1,
1-2,
2-2,
3-2,
4-2,
etc.
where the first number is the value from the loop and the second is counting which iteration it came from.
I feel like there is a way; I just can't find it.
OK, so here is some code.
for c=1:1:npoints;
for i=1:1:NN;
if ((c-1)*spacepoints)<=PL(i+1) && ((c-1)*spacepoints)>=PL(i);
local(c)=((c)*spacepoints)-PL(i);
end
if ((c-1)*spacepoints)>=PL(NN);
local(c)=((c)*spacepoints)-PL(NN);
element(i)=NN;
end
end
end
I want to mark each local value with the iteration of the i = 1:NN loop it came from. PL is a vector and the output is a set of vectors for each iteration.
For this sort of quick problem I like to create a cell array:
for k = 1:12
results{k} = complicated_function(...);
end
If the output is really complicated, then I return a struct with fields relating to the outputs:
for k = 1:12
results{k}.file = get_filename(...);
results{k}.result = ...;
end
As the code stands, local(c) is overwritten on every pass through your inner 1:NN loop. You never use the previous value of local, so this is presumably not some iterative optimization algorithm.
Perhaps an easy solution is to change local from a vector to a matrix. Let's say that local is of size [npoints 1]. Instead, make it of size [npoints NN]. It is now a 2-D array (a matrix of npoints rows and NN columns). Use the second dimension to store each (assumed column) vector from the inner loop:
local = zeros([npoints NN]);
%# ... code in between ...
for c=1:1:npoints;
for i=1:1:NN;
if ((c-1)*spacepoints)<=PL(i+1) && ((c-1)*spacepoints)>=PL(i);
local(c, i)=((c)*spacepoints)-PL(i);
end
if ((c-1)*spacepoints)>=PL(NN);
local(c, i)=((c)*spacepoints)-PL(NN);
element(i)=NN;
end
end
end
The c'th row of your local matrix will then correspond to the NN values from the inner loop. Please note that I have assumed your vector to be a column vector; if not, just change the order of the sizes.