I'm trying to send the IDs from a SAS dataset in an email but can't get the format right. I just need plain text, as HTML is getting stuck and is slow. Thanks in advance for any help! Any one solution would be good.
I have one ID column. The first solution gives a complete list like this:
%include "/saswrk/go/scripts/envsetup.sas";
filename mymail email "&emaillist."
subject=" &env. Records Transferred on %sysfunc(date(),yymmdd10.)";
data _null_;
set WORKGO.recds_processed;
file mymail;
put (_all_)(=);
run; quit;
Output
ID=1
ID=2
ID=3
ID=4
ID=5
It would be nice if I could get the count and output like
Number of records processed=6 and the ID's are 1,2,3...
Try this:
%include '/saswrk/go/scripts/envsetup.sas';
filename mymail email "&emaillist"
subject = "&env Records Transferred on %sysfunc(date(), yymmdd10.)";
data _null_;
length id_list $ 3000;
retain id_list '';
set workgo.recds_processed nobs = nobs end = eof;
file mymail;
if _n_ = 1 then do;
put 'Number of records processed=' nobs;
put 'The IDs are:';
end;
/* Print the IDs in chunks */
if length(strip(id_list)) > 2000 then do;
put id_list;
call missing(id_list);
end;
call catx(', ', id_list, id);
if eof then put id_list;
run;
At the first iteration of the data step, when _n_ = 1, the number of observations in the dataset is written to the file.
Then, at each iteration, the current ID is appended to a comma-separated list of IDs. When the length of the list exceeds 2,000 characters, the contents of the list are written out and the list is reset to empty. This keeps the list below the 3,000 characters declared for id_list (assuming each individual ID is short), so no IDs are lost to truncation.
When the end of the input dataset is reached, the remaining contents of the list are written out.
This will give you multiple chunks of comma-delimited IDs where each chunk is separated by a newline.
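To see the chunking behaviour without sending an email, here is a minimal sketch that writes to the SAS log instead. The dataset work.ids, its 25 generated IDs, and the lowered threshold of 30 characters are assumptions made purely for illustration; the logic otherwise mirrors the step above.

```sas
/* Hypothetical test data: 25 numeric IDs */
data work.ids;
  do id = 1 to 25;
    output;
  end;
run;

/* No FILE statement, so PUT writes to the SAS log */
data _null_;
  length id_list $ 60;
  retain id_list '';
  set work.ids nobs = nobs end = eof;
  if _n_ = 1 then put 'Number of records processed=' nobs;
  /* Flush the list once it passes 30 characters (2000 in the answer) */
  if length(strip(id_list)) > 30 then do;
    put id_list;
    call missing(id_list);
  end;
  call catx(', ', id_list, id);
  if eof then put id_list;
run;
```

Each line in the log should hold one chunk of comma-separated IDs, with the final, shorter chunk written when the last observation is read.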
To append the IDs from a second dataset you can simply modify the existing file using the mod keyword in the file statement and write the observations in an otherwise identical manner.
data _null_;
length id_list $ 3000;
retain id_list '';
set workgo.recds_not_processed nobs = nobs end = eof;
file mymail mod;
if _n_ = 1 then do;
put 'Number of records not processed=' nobs;
put 'The IDs are:';
end;
/* Print the IDs in chunks */
if length(strip(id_list)) > 2000 then do;
put id_list;
call missing(id_list);
end;
call catx(', ', id_list, id);
if eof then put id_list;
run;
I am trying to calculate how long a child has been in foster care. However, I am having some issues. My data should look like something below:
For each individual (ID) I need to calculate the duration (end_date - start_date). However, I also need to apply a rule: if there are fewer than 5 days between the end date of one record and the start date of the next within the same type of foster care, the two should be considered one consecutive placement. If there are more than five days between the end date and the next start date within the same type of foster care for the same individual, it is a new placement. If it is a new type of foster care, it is a new placement. The variable "duration" shows how it is supposed to be calculated.
I have tried the following code, but it doesn't work properly, and I don't know how to apply my "five day" rule.
Proc sort data=have out=want;
by id type descending start_date;
run;
Data want;
set want;
by id type;
retain last_date;
if first.id or first.type then do;
last_date=end_date;
end;
if last.id or last.type then duration=(end_date-start_date);
run;
Any help is much appreciated!
This uses a few retained variables to carry values from one row to the next:
data want;
set have;
by id ;
retain true_sd prev_ed prev_type;
if first.id then call missing(prev_type);
if type ~= prev_type then do;
true_sd = sd;
call missing(prev_ed);
call missing(prev_type);
end;
if sd - prev_ed > 5 then true_sd = sd;
duration = ed - true_sd;
output;
prev_type = type;
prev_ed = ed;
format sd ed true_sd prev_ed date.;
run;
(This assumes that type and id are numeric and that the data is sorted by id and start date. ed is end_date and sd is start_date.)
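To try the step above, a small input dataset is needed. The rows below are hypothetical and were invented only to exercise the rule; the variable names id, type, sd and ed follow the answer's naming, not the original question's.

```sas
/* Hypothetical placements for one child, already sorted by id and sd */
data have;
  input id type sd :date9. ed :date9.;
  format sd ed date9.;
  datalines;
1 1 01JAN2020 31JAN2020
1 1 03FEB2020 29FEB2020
1 1 20MAR2020 31MAR2020
1 2 01APR2020 30APR2020
;
run;
```

With this input, the second row starts 3 days after the first ends, so it should continue the same placement. The third row starts 20 days after the second ends, so it should begin a new placement, and the fourth row begins a new placement because the type changes.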
I have a customer survey data like this:
data feedback;
length customer score comment $50.;
input customer $ score comment & $;
datalines;
A 3 The is no parking
A 5 The food is expensive
B . I like the food
C 5 It tastes good
C . blank
C 3 I like the drink
D 4 The dessert is tasty
D 2 I don't like the service
;
run;
There is a macro code like this:
%macro subset( cust=);
proc print data= feedback;
where customer = "&cust";
run;
%mend;
I am trying to write a program that calls %subset for each customer value in the feedback data. Note that we do not know how many unique values of customer there are in the data set. Also, we can't change the %subset code.
I tried to achieve that by using proc sql to create a unique list of customers to pass into the macro, but I think you cannot pass a list into a macro.
Is there a way to do that? P.S. I am a beginner with macros.
I like to keep things simple. Take a look at the following:
data feedback;
length customer score comment $50.;
input customer $ score comment & $;
datalines;
A 3 The is no parking
A 5 The food is expensive
B . I like the food
C 5 It tastes good
C . blank
C 3 I like the drink
D 4 The dessert is tasty
D 2 I don't like the service
;
run;
%macro subset( cust=);
proc print data= feedback;
where customer = "&cust";
run;
%mend subset;
%macro test;
/* first get the count of distinct customers */
proc sql noprint;
select count(distinct customer) into : cnt
from feedback;quit;
/* do this to remove leading spaces */
%let cnt = &cnt;
/* now get each of the customer names into macro variables */
proc sql noprint;
select distinct customer into :cust1 - :cust&cnt
from feedback;quit;
/* use a loop to call other macro program, notice the use of &&cust&i */
%do i = 1 %to &cnt;
%subset(cust=&&cust&i);
%end;
%mend test;
%test;
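The &&cust&i reference in the loop can look odd at first. As a small illustration, the sketch below sets two macro variables by hand (hypothetical values standing in for what PROC SQL creates) and shows how the reference resolves in two passes.

```sas
/* Hypothetical values, standing in for what PROC SQL would create */
%let cust1 = A;
%let cust2 = B;
%let i = 2;

/* First pass:  && becomes &, cust is kept, &i becomes 2, giving &cust2 */
/* Second pass: &cust2 resolves to B                                    */
%put Customer number &i is &&cust&i;
/* Log: Customer number 2 is B */
```

Writing &cust&i instead would make SAS look for a macro variable named cust, which does not exist here.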
Of course, if you want short and sweet, you can use the following (just make sure your data is sorted by customer):
data _null_;
set feedback;
by customer;
if first.customer then call execute(cats('%subset(cust=', customer, ')'));
run;
First fix the SAS code. To test if a value is in a list, use the IN operator, not the = operator.
where customer in ('A' 'B')
Then you can pass that list into your macro and use it in your code.
%macro subset(custlist);
proc print data= feedback;
where customer in (&custlist);
run;
%mend;
%subset(custlist='A' 'B')
Notice a few things:
Use quotes around the values since the variable is character.
Use spaces between the values. The IN operator in SAS accepts either spaces or commas (or both) as the delimiter in the list. It is a pain to pass comma-delimited lists in a macro call, since the comma is used to delimit the parameters.
You can define a macro parameter as positional and still call it by name in the macro call.
If the list is in a dataset you can easily generate the list of values into a macro variable using PROC SQL. Just make sure the resulting list is not too long for a macro variable (maximum of 64K bytes).
proc sql noprint;
select distinct quote(trim(customer))
into :custlist separated by ' '
from my_subset
;
quit;
%subset(&custlist)
I have a dataset of CASE_ID (x, y and z), a set of multiple dates (including duplicate dates) for each CASE_ID, and a variable VAR. I would like to create a dummy variable DUMMYVAR by group within a group, whereby if VAR="C" for CASE_ID x on some specific date, then DUMMYVAR=1 for all observations corresponding to CASE_ID x on that date.
I believe that a classic double DOW loop (2XDOW) would be the key here, but this is my third week using SAS and I am having difficulty getting this to work with two BY groups.
I have referenced and attempted to write a variation of Haikuo's code here:
PROC SORT data=have;
by CASE_ID DATE;
RUN;
data want;
do until (last.DATE);
set HAVE;
by date notsorted;
if var='c' then DUMMYVAR=1;
do until (last.DATE);
set HAVE;
by DATE notsorted;
if DATE=1 then ????????
end;
run;
Change your BY statements to match the grouping you are doing. And in the second loop add a simple OUTPUT; statement. Then your new dataset will have all the rows in your original dataset and the new variable DUMMYVAR.
data want;
do until (last.DATE);
set HAVE;
by case_id date;
if var='c' then DUMMYVAR=1;
end;
do until (last.DATE);
set HAVE;
by case_id date;
output;
end;
run;
This will create the variable DUMMYVAR with values of either 1 or missing. If you want the values to be 1 or 0, you could either set DUMMYVAR to 0 before the first DO loop or add an if first.date then dummyvar=0; statement before the existing IF statement.
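As a sketch of the first option (setting the flag to 0 before the first loop), the example below uses a tiny hypothetical dataset; the rows and their values are made up for illustration, and the data is assumed to be sorted by case_id and date already.

```sas
/* Hypothetical input, already sorted by case_id and date */
data have;
  input case_id $ date :date9. var $;
  format date date9.;
  datalines;
x 01JAN2020 a
x 01JAN2020 c
x 02JAN2020 b
y 01JAN2020 b
;
run;

data want;
  /* Runs once per case_id/date group, before the group is read */
  dummyvar = 0;
  /* First pass: look for var='c' anywhere in the group */
  do until (last.date);
    set have;
    by case_id date;
    if var = 'c' then dummyvar = 1;
  end;
  /* Second pass: re-read the same group and write every row */
  do until (last.date);
    set have;
    by case_id date;
    output;
  end;
run;
```

Here both rows for case x on 01JAN2020 should get dummyvar=1, and the remaining rows should get 0 instead of missing.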
I'm working on a very big data set (more than 100 variables and 11 million observations). In this data set, I have a variable named DTDSI (simulation date) in DATE9. format (for example: 01APR2015, 02MAR2015...). I have a macro program to analyse this data set by comparing the observations in 2 different months:
%macro analysis (data_input , m , m_1);
.....
%mend;
The two macro variables m and m_1 are the months that I want to compare. Their format is MONYY7. (APR2015, MAR2015...). Keep in mind that I cannot modify my data_input (it's my company's data). At the beginning of my macro program, I want to create a new data set with only the observations from the &m and &m_1 months. I can easily do that by creating a new variable from DTDSI (real_month, for example) in the MONYY7. format and then selecting the observations where real_month equals &m or &m_1:
Data new;
Set &data_input;
mois_real = put(DTDSI, monyy7.);
RUN;
PROC SQL;
CREATE TABLE NEW AS
SELECT *
FROM NEW
WHERE mois_real in ("&m" , "&m_1");
....
The problem is that my first DATA step duplicates data_input, which is bad because it takes 30 minutes. So how can I make my selection (DTDSI = m or DTDSI = m_1) directly in the first step?
You can use functions in your WHERE/IF condition, so apply the expression from step 1 directly in the condition of step 2, or vice versa.
Data new;
set &data_input;
WHERE put(DTDSI, monyy7.) in ("&m" , "&m_1");
run;
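To check the filter on a small scale before running it against 11 million rows, here is a minimal sketch. The dataset sim, its three dates, and the %let values are assumptions for illustration; in the real macro, &data_input, &m and &m_1 would come from the macro parameters.

```sas
/* Hypothetical stand-in for &data_input */
data sim;
  input DTDSI :date9.;
  format DTDSI date9.;
  datalines;
01APR2015
02MAR2015
15FEB2015
;
run;

%let m   = APR2015;
%let m_1 = MAR2015;

/* WHERE filters rows as they are read, so no intermediate copy is made */
data new;
  set sim;
  where put(DTDSI, monyy7.) in ("&m", "&m_1");
run;
```

The April and March rows should be kept and the February row dropped. Note that the MONYY7. format writes the month in upper case, so &m and &m_1 need to be upper case as well.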
I've come in late to a project and want to write a macro that normalises some data for export to a SQL Server.
There are two control tables...
- Table 1 (customers) has a list of customer unique identifiers
- Table 2 (hierarchy) has a list of table names
There are then n additional tables, one for each record in (hierarchy), named in the SourceTableName field. Each has the form...
- CustomerURN, Value1, Value2
I want to combine all of these tables into a single table (sample_results), with the form of...
- SourceTableName, CustomerURN, Value1, Value2
The only records that should be copied, however, should be for CustomerURNs that exist in the (customers) table.
I could do this in a hard coded format using proc sql, something like...
proc sql;
insert into
SAMPLE_RESULTS
select
'TABLE1',
data.*
from
Table1 data
INNER JOIN
customers
ON data.CustomerURN = customers.CustomerURN
<repeat for every table>
But every week new records are added to the hierarchy table.
Is there any way to write a loop that picks up the table name from the hierarchy table, then calls the proc sql to copy the data into sample_results?
You could concatenate all the hierarchy tables together and do a single SQL join:
proc sql ;
drop table all_hier_tables ;
quit ;
%MACRO FLAG_APPEND(DSN) ;
/* Create new var with tablename */
data &DSN._b ;
length SourceTableName $32. ;
SourceTableName = "&DSN" ;
set &DSN ;
run ;
/* Append to master */
proc append data=&DSN._b base=all_hier_tables force ;
run ;
%MEND ;
/* Append all hierarchy tables together */
data _null_ ;
set hierarchy ;
code = cats('%FLAG_APPEND(' , SourceTableName , ');') ;
call execute(code); /* run the macro */
run ;
/* Now merge in... */
proc sql;
insert into
SAMPLE_RESULTS
select
data.*
from
all_hier_tables data
INNER JOIN
customers
ON data.CustomerURN = customers.CustomerURN ;
quit;
Another way is to create a view, so that it always reflects the latest data in the metadata tables. The call execute routine is used to read the table names from the hierarchy dataset and build the view definition. Here is an example that you should be able to modify to suit your data; the last data step is the part relevant to you.
data class1 class2 class3;
set sashelp.class;
run;
data hierarchy;
input table_name $;
cards;
class1
class2
class3
;
run;
data ages;
input age;
cards;
11
13
15
;
run;
data _null_;
set hierarchy end=last;
if _n_=1 then call execute('proc sql; create view sample_results_view as ' );
if not last then call execute('select * from '||trim(table_name)||' where age in (select age from ages) union all ');
if last then call execute('select * from '||trim(table_name)||' where age in (select age from ages); quit;');
run;