SAS - Find Palindromes for all Datasets in a Directory - macros

Task:
I need to identify all palindromes within a directory. I use a proc contents and proc sort to identify the datasets within a directory, like so:
proc contents data = dPath._all_ out = dFiles (keep = memname);
run;
proc sort data = dFiles nodupkey; by memname;run;
I want to identify palindromes within this directory.
Issue:
I plan to use macros because I need to do this for all datasets within a directory. So, instead of the user inputting the string to check if there is a palindrome, I need that to be done dynamically, i.e. identify any palindromes within a dataset.
Updates:
As you can see in the above pictures, I am able to successfully flag the palindromes for case sensitive and case insensitive situations. I would like to output the specific element that is a palindrome to a separate dataset. Currently, I am only able to output the entire row with the palindrome in it.
Code:
data palindrome_set (drop = i) palindrome_case_sensitive palindrome_case_insensitive;
set reverse_rows;
array palindrome[*] _all_ ;
do i = 1 to dim(palindrome);
palindrome_cs = (trim(palindrome[i]) eq reverse(trim(palindrome[i])));
/* if palindrome_cs = 1 then output palindrome[i]; WANT TO OUTPUT SPECIFIC ELEMENT, NOT ENTIRE ROW*/
palindrome_cis = (lowcase(trim(palindrome[i])) eq reverse(lowcase(trim(palindrome[i]))));
end;
output palindrome_set;
if palindrome_cs = 1 then output palindrome_case_sensitive; *WANT TO OUTPUT SPECIFIC ELEMENT, NOT ENTIRE ROW;
if palindrome_cis = 1 then output palindrome_case_insensitive; *WANT TO OUTPUT SPECIFIC ELEMENT, NOT ENTIRE ROW;
run;

If memtype ="DATA" then the Memname in your code will hold the table names only.
To check palindromes in table names using your code above; try:
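%macro palindrome (parameter = );
%local string reverse;
/* strip blanks and punctuation, then reverse the cleaned string */
%let string = %sysfunc(compress(&parameter,,sp));
%let reverse = %sysfunc(reverse(&string));
%if %upcase(&string) = %upcase(&reverse) %then %do;
%put NOTE: &parameter is a palindrome.;
%end;
%mend palindrome;
/* macro code cannot read data step values directly, so queue one macro call per table name */
data _null_;
set work.dfiles;
call execute(cats('%nrstr(%palindrome)(parameter=', memname, ')'));
run;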

Not sure why your image showed a reversal of the variable names as well.
The underlying variable name corresponding to an array reference can be retrieved using the data step function VNAME(). Also, the formatted value of a variable can be obtained using the data step function VVALUE. Both these functions have a dynamic version -- VNAMEX and VVALUEX. An array based solution will not need to utilize the X versions of the functions.
Processing all variables via an array is a little tricky because you need additional variables to perform the processing, and you don't want those tested for palindromicity. In this example, worker variable names use the convention of starting with _pal in the hopes of avoiding variable name collision with the data sets being processed. The example processes a single data set, but it should be obvious how to macro-ize the code and have it work for a data set name that is passed.
data want(keep=_palds_ _palrow_ _palvar_ _palval_);
set sashelp.class;
array _pals_ _character_; * array elements are those character variables in the pdv at this point in the data step;
array _palx_ _numeric_; * array elements are those numeric variables in the pdv at this point in the data step;
attrib
_palds_ length = $42
_palrow_ length = 8
_palvar_ length = $32
_palval_ length = $500
;
* check raw character value;
do _palindex_ = 1 to dim(_pals_);
if length (_pals_(_palindex_)) > lengthm(_palval_) then do;
_palvar_ = vname (_pals_(_palindex_));
put "NOTE: sashelp.class " _n_= _palVar_ " had a value that is longer than _palval_ container";
continue;
end;
if _pals_(_palindex_) = reverse(trim(_pals_(_palindex_))) then do;
_palds_ = "sashelp.class";
_palrow_ = _n_;
_palvar_ = vname (_pals_(_palindex_));
_palval_ = _pals_(_palindex_);
output; * write one row per palindromic value, not one row per input row;
end;
end;
* check formatted numeric value;
do _palindex_ = 1 to dim(_palx_);
if left(vvalue(_palx_(_palindex_))) = reverse(trim(vvalue(_palx_(_palindex_)))) then do;
_palds_ = "sashelp.class";
_palrow_ = _n_;
_palvar_ = vname (_palx_(_palindex_));
_palval_ = left(vvalue(_palx_(_palindex_))); * store the formatted text that was actually tested;
output;
end;
end;
run;
A macro that wants to explicitly avoid name collision must perform some navel-gazing on the data set to be processed in order to generate worker variable names that do not collide.
Processing all members of the libref can be very resource intensive if the libref connects to a remote host -- so a robust solution may want to skip over those.
Some other approaches:
use CALL VNEXT routine to iterate through the pdv variables
use the dictionary table or proc contents output as a basis for generating a wall of variable tests in a data step that does not rely on arrays.
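As a rough sketch of how the single-data-set example might be macro-ized and driven from the dFiles list built in the question: the macro name find_pals, the work._palN output names, and work.all_pals are all hypothetical, and only character variables are checked here. It assumes the libref dPath from the question and that no other WORK data sets already start with _pal.

```sas
/* Hypothetical macro: list palindromic character values in one data set */
%macro find_pals(lib=, mem=, out=);
  data &out (keep=_palds_ _palrow_ _palvar_ _palval_);
    set &lib..&mem;
    /* declared before the worker variables, so the array excludes them */
    array _pals_ _character_;
    length _palds_ $42 _palrow_ 8 _palvar_ $32 _palval_ $500;
    do _palindex_ = 1 to dim(_pals_);
      /* skip blank values, which would trivially equal their own reverse */
      if missing(_pals_(_palindex_)) then continue;
      if _pals_(_palindex_) = reverse(trim(_pals_(_palindex_))) then do;
        _palds_  = "&lib..&mem";
        _palrow_ = _n_;
        _palvar_ = vname(_pals_(_palindex_));
        _palval_ = _pals_(_palindex_);   /* values longer than 500 are truncated */
        output;
      end;
    end;
  run;
%mend find_pals;

/* Queue one macro call per member name found by PROC CONTENTS */
data _null_;
  set dFiles;
  call execute(cats('%nrstr(%find_pals)(lib=dPath, mem=', memname,
                    ', out=work._pal', _n_, ')'));
run;

/* Stack the per-member results (they all share the _pal prefix) */
data work.all_pals;
  set work._pal: ;
run;
```

The %nrstr wrapper delays macro execution until the driving data step has finished, so each generated data step runs in turn afterwards.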

How Can I Do a Fuzzy Character Merge using Hash Objects in SAS?

I'm testing out how to use hash objects in SAS 9.4 M6 to do fuzzy joins since PROC SQL just runs for hours on my larger dataset. I created some sample datasets (below) and what I want is for the merge to pull in exact matches on the "name" fields AND any matches that have a COMPLEV score less than 10. Right now, this code still only pulls in the exact matches.
I'm very new to hash objects so I'm sure it's a simple fix, but I've tried and am in need of help.
data A;
infile datalines missover;
length nameA $50;
input nameA $ ;
datalines;
MICKEYMOUSE2000-01-02
DAFDUCK1990-09-23
GOOFYMAN1993-05-11
;
run;
*second dataset with one exact match and two that differ slightly from those in dataset A;
data B;
infile datalines missover;
length nameB $50;
input nameB $ VDAY :ddmmyy10.;
format VDAY ddmmyy10.;
datalines;
MICKEYMOUSE2000-01-01 07/08/2021
DAFFYDUCK1990-09-23 05/11/2021
GOOFYMAN1993-05-11 08/11/2021
;
run;
*only pulling in exact matches, want it to pull in other fuzzy matches;
data simplemerge ;
if 0 then set work.B ; *load var properties into hash table;
if _n_ = 1 then do;
dcl hash B (dataset: 'work.B'); *declare the name B for hash using B dataset;
B.definekey('nameB');*identify var in B to use as key;
B.definedata('nameB','vday');*identify columns of data to bring in from B dataset;
B.definedone();*complete hash table definition;
end;
set work.A; *bring in A data;
if B.Find(KEY: nameA) ne 0 then do;
if complev(nameA, nameB) < 10 then do;
B.ref(key : nameB,data : nameB, data : vday);
end;
end;
RUN;
Fuzzy match in hash is not necessarily better than SQL - in fact, good chance it's identical. SQL joins are often done with a hash table behind the scenes.
That said, here's how you'd do the naive hash fuzzy lookup - with a hash iterator (hiter).
data simplemerge ;
if 0 then set work.B ; *load var properties into hash table;
if _n_ = 1 then do;
dcl hash B (dataset: 'work.B'); *declare the name B for hash using B dataset;
B.definekey('nameB');*identify var in B to use as key;
B.definedata('nameB','vday');*identify columns of data to bring in from B dataset;
B.definedone();*complete hash table definition;
dcl hiter hi_b('B');
end;
set work.A; *bring in A data;
done = 0;
if B.Find(KEY: nameA) ne 0 then do;
rc = hi_b.first();
do while (rc eq 0 );
if complev(nameA, nameB) lt 10 then do;
put "Found one!" namea= nameb=;
leave;
end;
else call missing(of nameB vday);
rc = hi_b.next();
end;
end;
else put "Found one!" namea=;
RUN;
This will be ... not fast ... if work.B has a lot of rows. It goes over every row of B once for every row of A that doesn't have an exact match.
One thing you can do to make this more efficient is not search all of B. Instead, have some smaller subset of B that you find with an exact match, and then iterate over that smaller subset; instead of using hiter just use the find_next. This may not work for your exact requirements, but if it's feasible, this would be ideal.
Here's one example of doing that. It's not particularly efficient since sex has only two values (so I'm looking through half of the rows anyway), but it does work.
data have;
set sashelp.class;
if mod(_n_,3) eq 0 then do;
name = cats(name,'Z');
end;
if mod(_n_,5) eq 0 then do;
name = cats('Row_',_n_);
end;
run;
data want;
if 0 then set sashelp.classfit;
length name_fuzz $8;
*first define two hash tables - one for exact match, one for fuzzy. Only do this if exact matches are reasonably common;
if _n_ eq 1 then do;
declare hash h_exact(dataset:'sashelp.classfit');
h_exact.defineKey('name');
h_exact.defineData(all:'Y');
h_exact.defineDone();
declare hash h_fuzzy(dataset:'sashelp.classfit(rename=(name=name_fuzz) keep=name sex predict lower upper lowermean uppermean)',multidata:'y');
h_fuzzy.defineKey('sex');
h_fuzzy.defineData(all:'Y');
h_fuzzy.defineDone();
call missing(name_fuzz);
end;
set have;
*now check exact - if it matches then do not try further;
rc_exact = h_exact.find();
if rc_exact eq 0 then do;
output;
return;
end;
*now try fuzzy - first look up the first match by the chunk criteria;
rc_fuzzy = h_fuzzy.find();
*now iterate over all of the matches of the chunk, and if you find a close-enough match then output that row and stop trying;
do while (rc_fuzzy eq 0);
if complev(name,name_fuzz) lt 2 then do;
output;
return;
end;
rc_fuzzy = h_fuzzy.find_next();
end;
*if you are still here, you failed to find a fuzzy match - so clear the values from the variables you are merging on and output a blank row, assuming you are doing a "left join" [if it is inner join, then just skip these next two lines];
call missing(of predict lower upper lowermean uppermean);
output;
run;
A better version of this would have a more discriminating key for the fuzzy match - the more discriminating the better. The key might not be something related at all to your fuzzy match - for example, maybe your fuzzy match is looking for names, but you also have their year of birth. Match on year of birth, then iterate over complev(namea,nameb), since year of birth is quite discriminating.

recode and add prefix to sas variables

Let's say I have a bunch of variables named the same way and I'd like to recode them and add a prefix to each (the variables are all numeric).
In Stata I would do something like (let's say the variables start with eq)
foreach var of varlist eq* {
recode `var' (1/4=1) (else=0), pre(r_)
}
How can I do this in SAS? I'd like to use the %DO macros, but I'm not familiar with them (I want to avoid SQL). I'd appreciate if you could include comments explaining each step!
SAS syntax for this would be easier if your variables are named using numeric suffix. That is, if you had ten variables with names of eq1, eq2, .... , eq10, then you could just use variable lists to define both sets of variables.
There are a number of ways to translate your recode logic. If we assume you have clean variables then we can just use a boolean expression to generate a 0/1 result. So if 4 and 5 map to 1 and the rest map to 0 you could use x in (4,5) or x > 3 as the boolean expression.
data want;
set have;
array old eq1-eq10 ;
array new r_eq1-r_eq10 ;
do i=1 to dim(old);
new(i) = old(i) in (4,5);
end;
run;
If you have missing values or other complications you might want to use IF/THEN logic or a SELECT statement or you could define a format you could use to convert the values.
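A minimal sketch of the IF/THEN variant, assuming missing inputs should stay missing rather than be recoded to 0 (that choice is an assumption, not something stated in the question). It keeps the same eq1-eq10 naming and the same 4,5 mapping as the example above.

```sas
data want;
  set have;
  array old eq1-eq10;
  array new r_eq1-r_eq10;
  do i = 1 to dim(old);
    /* leave missing inputs missing instead of letting them fall into the 0 group */
    if missing(old(i)) then new(i) = .;
    else new(i) = old(i) in (4,5);
  end;
  drop i;
run;
```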
If your list of names is more random then you might need to use some code generation, such as macro code, to generate the new variable names.
Here is one method that uses the eq: variable list syntax in SAS, which is similar to the syntax of your variable selection before. Use PROC TRANSPOSE on an empty (obs=0) version of your source dataset to get a dataset with the variable names that match your name pattern.
proc transpose data=have(obs=0) out=names;
var eq: ;
run;
Then generate two macro variables with the list of old and new names.
proc sql noprint ;
select _name_
, cats('r_',_name_)
into :old_list separated by ' '
, :new_list separated by ' '
from names
;
quit;
You can then use the two macro variables in your ARRAY statements.
array old &old_list ;
array new &new_list ;
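Put together, the full step might look like the sketch below. It reuses the 4,5 mapping from the earlier example purely for illustration, and it assumes the PROC SQL step found at least one matching name (otherwise the two macro variables are never created and the ARRAY statements fail).

```sas
data want;
  set have;
  /* both lists come from the PROC SQL step, in the same order */
  array old &old_list;
  array new &new_list;
  do i = 1 to dim(old);
    new(i) = old(i) in (4,5);
  end;
  drop i;
run;
```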
You can do this with rename and a dash indicating which variables you want to rename. Note the following only renames the col variables, and not the other one:
data have;
col1=1;
col2=2;
col3=3;
col5=5;
other=99;
col12=12;
run;
%macro recoder(dsn = , varname = , prefix = );
/*select all variables that include the string "varname"*/
/*(you can change this if you want to be more specific on the conditions that need to be met to be renamed)*/
proc sql noprint;
select distinct name into: varnames
separated by " "
from dictionary.columns where memname = upcase("&dsn.") and index(name, "&varname.") > 0;
quit;
data want;
set have;
/*loop through that list of variables to recode*/
%do i = 1 %to %sysfunc(countw(&varnames.));
%let this_varname = %scan(&varnames., &i.);
/*create a new variable with desired prefix based on value of old variable*/
if &this_varname. in (1 2 3) then &prefix.&this_varname. = 0;
else if &this_varname. in (4 5) then &prefix.&this_varname. = 1;
%end;
run;
%mend recoder;
%recoder(dsn = have, varname = col, prefix = r_);
PROC TRANSPOSE will give you good flexibility with regards to the way your variables are named.
proc transpose data=have(obs=0) out=vars;
var col1-numeric-col12;
copy col1;
run;
proc transpose data=vars out=revars(drop=_:) prefix=RE_;
id _name_;
run;
data recode;
set have;
if 0 then set revars;
array c[*] col1-numeric-col12;
array r[*] re_:;
call missing(of r[*]);
do _n_ = 1 to dim(c);
if c[_n_] in(1 2 3) then r[_n_] = 0;
else if c[_n_] in(4 5) then r[_n_] = 1;
else r[_n_] = c[_n_];
end;
run;
proc print;
run;
It would be nearly trivial to write a macro to parse almost that exact syntax.
I wouldn't necessarily use this - I like both the transpose and the array methods better, both are more 'SASsy' (think 'pythonic' but for SAS) - but this is more or less exactly what you're doing above.
First set up a dataset:
data class;
set sashelp.class;
age_ly = age-1;
age_ny = age+1;
run;
Then the macro:
%macro do_count(data=, out=, prefix=, condition=, recode=, else=, var_start=);
%local dsid varcount varname rc; *declare local for safety;
%let dsid = %sysfunc(open(&data.,i)); *open the dataset;
%let varcount = %sysfunc(attrn(&dsid,nvars)); *get the count of variables to access;
data &out.; *now start the main data step;
set &data.; *set the original data set;
%do i = 1 %to &varcount; *iterate over the variables;
%let varname= %sysfunc(varname(&dsid.,&i.)); *determine the variable name;
%if %upcase(%substr(&varname.,1,%length(&var_start.))) = %upcase(&var_start.) %then %do; *if it matches your pattern then recode it;
&prefix.&varname. = ifn(&varname. &condition., &recode., &else.); *this uses IFN - only recodes numerics. More complicated code would work if this could be character.;
%end;
%end;
%let rc = %sysfunc(close(&dsid)); *clean up after yourself;
run;
%mend do_count;
%do_count(data=class, out=class_r, var_start=age, condition= > 14, recode=1, else=0, prefix=p_);
The expression (1/4=1) means values {1,2,3,4} should be recoded into 1.
Perhaps you do not need to make new variables at all? If you have variables with values 1,2,3,4,5 and you want to treat them as if they have only two groups you could do it with a format.
First define your grouping using a format.
proc format ;
value newgrp 1-4='Group 1' 5='Group 2' ;
run;
Then you can just use a FORMAT statement in your analysis step to have SAS treat your five level variable as if it had only two levels.
proc freq ;
tables eq: ;
format eq: NEWGRP. ;
run;

MATLAB: Loop through the values of a list from 'who' function

I have a long list of variables in my workspace.
First, I'm finding the potential variables I could be interested in using the who function. Next, I'd like to loop through this list to find the size of each variable, however who outputs only the name of the variables as a string.
How could I use this list to refer to the values of the variables, rather than just the name?
Thank you,
list = who('*time*')
list =
'time'
'time_1'
'time_2'
for i = 1:size(list,1);
len(i,1) = length(list(i))
end
len =
1
1
1
If you want details about the variables, you can use whos instead which will return a struct that contains (among other things) the dimensions (size) and storage size (bytes).
As far as getting the value, you could use eval but this is not recommended and you should instead consider using cell arrays or structs with dynamic field names rather than dynamic variable names.
S = whos('*time*');
for k = 1:numel(S)
disp(S(k).name)
disp(S(k).bytes)
disp(S(k).size)
% The number of elements
len(k) = prod(S(k).size);
% You CAN get the value this way (not recommended)
value = eval(S(k).name);
end
@Suever nicely explained the straightforward way to get this information. As I noted in a comment, I suggest that you take a step back, and don't generate those dynamically named variables to begin with.
You can access structs dynamically, without having to resort to the slow and unsafe eval:
timestruc.field = time;
timestruc.('field1') = time_1;
fname = 'field2';
timestruc.(fname) = time_2;
The above three assignments are all valid for a struct, and so you can address the fields of a single data struct by generating the field strings dynamically. The only constraint is that field names have to be valid variable names, so the first character of the field has to be a letter.
But here's a quick way out of the trap you got yourself into: save your workspace (well, the relevant part) in a .mat file, and read it back in. You can do this in a way that will give you a struct with fields that are exactly your variable names:
time = 1;
time_1 = 2;
time_2 = rand(4);
save('tmp.mat','time*'); % or just save('tmp.mat')
S = load('tmp.mat');
Afterwards, S will be a struct, and each field will correspond to a variable you saved into 'tmp.mat':
>> S
S =
time: 1
time_1: 2
time_2: [4x4 double]
An example writing variables from workspace to csv files:
clear;
% Writing variables of myfile.mat to csv files
load('myfile.mat');
allvars = who;
for i=1:length(allvars)
varname = strjoin(allvars(i));
evalstr = strcat('csvwrite(', char(39), varname, '.csv', char(39), ', ', varname, ')');
eval(evalstr);
end

SAS - Data Step equivalent of Proc SQL

What would be the data step equivalent of this proc sql?
proc sql;
create table issues2 as(
select request,
area,
sum(issue_count) as issue_count,
sum(resolved_count) as resolved_count
from
issues1
group by request, area
);
PROC MEANS/SUMMARY is better, but if it's relevant, the actual data step solution is as follows. Basically you just reset the counter to 0 on first.<var> and output on last.<var>, where <var> is the last variable in your by group.
Note: This assumes the data is sorted by request area. Sort it if it is not.
data issues2(rename=(issue_count_sum=issue_count resolved_count_sum=resolved_count) drop=issue_count resolved_count);
set issues1;
by request area;
if first.area then do;
issue_count_sum=0;
resolved_count_sum=0;
end;
issue_count_sum+issue_count;
resolved_count_sum+resolved_count;
if last.area then output;
run;
The functional equivalent of what you're trying to do is the following:
data _null_;
set issues1(rename=(issue_count=_issue_count
resolved_count=_resolved_count)) end=done;
if _n_=1 then do;
declare hash total_issues();
total_issues.defineKey("request", "area");
total_issues.defineData("request", "area", "issue_count", "resolved_count");
total_issues.defineDone();
end;
if total_issues.find() ne 0 then do;
issue_count = _issue_count;
resolved_count = _resolved_count;
end;
else do;
issue_count + _issue_count;
resolved_count + _resolved_count;
end;
total_issues.replace();
if done then total_issues.output(dataset: "issues2");
run;
This method does not require you to pre-sort the dataset. I wanted to see what kind of performance you'd get with using different methods so I did a few tests on a 74M row dataset. I got the following run-times (your results may vary):
Unsorted Dataset:
Proc SQL - 12.18 Seconds
Data Step With Hash Object Method (above) - 26.68 Seconds
Proc Means using a class statement (nway) - 5.13 Seconds
Sorted Dataset (36.94 Seconds to do a proc sort):
Proc SQL - 10.82 Seconds
Proc Means using a by statement - 9.31 Seconds
Proc Means using a class statement (nway) - 6.07 Seconds
Data Step using by statement (I used the code from Joe's answer) - 8.97 Seconds
As you can see, I wouldn't recommend using the data step with the hash object method shown above since it took twice as long as the proc sql.
I'm not sure why proc means with a by statement took longer than proc means with a class statement, but I tried this on a bunch of different datasets and saw similar differences in runtimes (I'm using SAS 9.3 on Linux 64).
Something to keep in mind is that these runtimes might be completely different for your situation, but I would recommend using the following code to do the summation:
proc means data=issues1 noprint nway;
class request area;
var issue_count resolved_count;
output out=issues2(drop=_:) sum=;
run;
Awkward, I think, to do it in a data step at all - summing and resetting variables at each level of the by variables would work. A hash object might also do the trick.
Perhaps the simplest non-Proc SQL method would be to use Proc Summary:-
proc summary data = issues1 nway missing;
class request area;
var issue_count resolved_count;
output out = issues2 sum(issue_count) = issue_count sum(resolved_count) = resolved_count ;
run;
Here's the temporary array method. This is the "simplest" of them, making some assumptions about the request and area values; if those assumptions are faulty, as they often are in real data, it may not be quite as easy as this. Note that while in the below the data does happen to be sorted, I don't rely on it being sorted and the process doesn't gain any advantage from it being sorted.
data issues1;
do request=1 to 1e5;
do area = 1 to 7;
do issueNum = 1 to 1e2;
issue_count = floor(rand('Uniform')*7);
resolved_count = floor(rand('Uniform')*issue_count);
output;
end;
end;
end;
run;
data issues2;
set issues1 end=done;
array ra_issue[1100000] _temporary_;
array ra_resolved[1100000] _temporary_;
*array index = request||area, so request 9549 area 6 = 95496.;
ra_issue[input(cats(request,area),best7.)] + issue_count;
ra_resolved[input(cats(request,area),best7.)] + resolved_count;
if done then do;
do _t = 1 to dim(ra_issue);
if not missing(ra_issue[_t]) then do;
request = floor(_t/10);
area = mod(_t,10);
issue_count=ra_issue[_t];
resolved_count=ra_resolved[_t];
output;
keep request area issue_count resolved_count;
end;
end;
end;
run;
That performed comparably to PROC MEANS with CLASS, given the simple data I started it with. If you can't trivially generate a key from a combination of area and request (if they're character variables, for example), you would have to store another array of name-to-key relationships which would make it quite a lot slower if there are a lot of combinations (though if there are relatively few combinations, it's not necessarily all that bad). If for some reason you were doing this in production, I would first create a table of unique request+area combinations, create a Format and an Informat to convert back and forth from a unique key (which should be very fast AND give you a reliable index), and then do this using that format/informat rather than the cats / division-modulus that I do here.

SAS: put format in macro

I am trying to create a new variable by assigning a format to an existing variable. I'm doing this from within a macro. I'm getting the following error: ": Expecting a format name." Any thoughts on how to resolve? Thanks!
/* macro to loop thru a list of vars and execute a code block on each. This is working fine. */
%macro iterlist
(
code =
,list =
)
;
%*** ASSIGN EACH ITEM IN THE LIST TO AN INDEXED MACRO VARIABLE &&ITEM&I ;
%let i = 1;
%do %while (%cmpres(%scan(&list., &i.)) ne );
%let item&i. = %cmpres(%scan(&list., &i.));
%let i = %eval(&i. + 1);
%end;
%*** STORE THE COUNT OF THE NUMBER OF ITEMS IN A MACRO VARIABLE: &CNTITEM;
%let cntitem = %eval(&i. - 1);
%*** EXPRESS CODE, REPLACING TOKENS WITH ELEMENTS OF THE LIST, IN SEQUENCE;
%do i = 1 %to &cntitem.;
%let codeprp = %qsysfunc(tranwrd(&code.,?,%nrstr(&&item&i..)));
%unquote(&codeprp.)
%end;
%mend iterlist;
/* set the list of variables to iterate thru */
%let mylist = v1 v2 v3 v4;
/* create a contents table to look up format info to assign in macro below*/
proc contents data=a.recode1 noprint out=contents;
run;
/* macro to create freq and chisq tables for each var */
%macro runfreqs (variabl = );
proc freq data=a.recode1 noprint ;
tables &variabl.*improved /out=&variabl._1 chisq;
output out=&variabl.chisq n pchi ;
run;
/* do some more stuff with the freq tables, then grab format for variable from contents */
data _null_;
set contents;
if name="&variabl." then CALL SYMPUT("classformat", format);
run;
data &variabl._3;
length classvalue $ 30 ;
set &variabl._2; ;
/* output a new var using the macro variable for format that we pulled from contents above. Here's where the error occurs. */
classvalue=put(class, %quote(&classformat.));
run;
%mend runfreqs;
* run the macro, iterating thru var list and creating freq tables;
%ITERLIST(list = &mylist., code = %nrstr(%runfreqs(variabl = ?);));
Just guessing, the line
classvalue=put(class, %quote(&classformat.));
should be
classvalue=put(class, &classformat..);
Two periods are needed because the first one is "eaten" by the macro processor to mark the end of the macro variable name, and the second one is needed to complete the format name.
I believe you won't need %quote() in your case - format name cannot contain strings quoted by %quote().
EDIT: Again not tried, just based on the code I see you also need to change CALL SYMPUT("classformat", format);
to CALL SYMPUTX("classformat", format);
CALL SYMPUTX() is a more advanced version of CALL SYMPUT(): it strips leading and trailing blanks from the macro variable value, while the original version keeps them. Effectively this will be the same as your solution, just simpler.
So the problem is indeed with extra blanks between format name and the period.
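A small standalone sketch of the difference, not taken from the original code: the variable fmtname and the macro variables fmt_padded and fmt_clean are made up for the illustration, and DOLLAR stands in for whatever format name PROC CONTENTS returns.

```sas
data _null_;
  length fmtname $32;
  fmtname = 'DOLLAR';                     /* stored padded with trailing blanks */
  call symput('fmt_padded', fmtname);     /* keeps the trailing blanks */
  call symputx('fmt_clean', fmtname);     /* strips leading and trailing blanks */
run;

/* The brackets make the blanks between the name and the period visible in the log */
%put NOTE: [&fmt_padded..] versus [&fmt_clean..];

data _null_;
  /* first period ends the macro variable name, second completes the format name */
  classvalue = put(1234.5, &fmt_clean..);
  put classvalue=;
run;
```

Using &fmt_padded.. in the PUT function instead would put blanks between the format name and its period, which is the situation the error message in the question points at.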
No idea why this works and vasja's idea wouldn't, but the problem was clearly with the period on the end of the format name (or perhaps some extra white space?). I changed the data step to add the period before the SYMPUT call:
data _null_;
set contents;
myformat=catt(format,'.');
if name="&variabl." then CALL SYMPUT("classformat", myformat);
run;