Proc Import: importing character values as missing ('.') from Excel files

I am trying to import multiple Excel files using the code below. There is a column in each Excel file that has both numeric and text values, but proc import is only importing the numeric values and setting the text values to missing ('.').
Can anyone help me with this issue? Thanks much.
%let subdir=S:\Temp\;
filename dir "&subdir.*.xls";
data new;
length filename fname $ 32767;
infile dir eof=last filename=fname;
input ;
last: filename=fname;
run;
proc sort data=new nodupkey;
by filename;
run;
data _null_;
set new end=last;
call symputx(cats('filename',_n_),filename);
call symputx(cats('dsn',_n_),scan(scan(filename,7,'\'),1,'.'));
if last then call symputx('nobs',_n_);
run;
%put &nobs;
%macro import;
%do i=1 %to &nobs;
proc import datafile="&&filename&i" out=&&dsn&i
dbms=excel replace;
sheet = "Sheet1";
getnames=yes;
mixed=yes;
run;
%end;
%mend import;
%import

The best way to control the data types in an imported Excel workbook is to use the DBSASTYPE data set option with a libname. This is especially useful when dealing with other data types (like datetime and time values).
For example, let's assume that the affected column is named MY_VAR and should always be read as character with a max length of 30. And let's also assume you have a spreadsheet column named START_TIME that contains an Excel coded date and time stamp. Your macro might be revised like this:
libname x "&&filename&i";
data &&dsn&i;
set x.'Sheet1$'n(dbsastype=(MY_VAR=char30 START_TIME=datetime));
run;
libname x clear;
As long as you know the name of the Excel column causing the problem, this should work well.

Mixed=Yes should fix things for you, but if it doesn't, there are a few solutions.
First off, you may want to check how many rows are scanned when guessing column types (the TypeGuessRows registry setting). You can see one possible location here:
http://support.sas.com/kb/35/563.html
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\12.0\Access Connectivity Engine\Engines
If you have an older version of Office (pre-2007), it's called the "Jet engine" and is located in a slightly different place (you can google for it). Your "12.0" may differ depending on what you have installed (12.0 is Office 2007).
Second, you can force columns to be particular types. The DBSASTYPE option is what you need for this; see http://www2.sas.com/proceedings/sugi31/020-31.pdf for an example (about the middle of the document; search for DBSASTYPE).

Related

Esttab: Append rtf files with page break?

I use a loop to append each regression table for various dependent variables into one file:
global all_var var1 var2 var3
foreach var of global all_var {
capture noisily : eststo mod0: reg `var' i.female
capture noisily : eststo mod1: reg `var' i.female
capture noisily : eststo mod2: reg `var' i.female
esttab mod0 mod1 mod2 using "file_name.rtf", append
}
However, in the final rtf file some tables are stretching over two pages which does not look good.
Is there any way to avoid that, e.g. introduce some sort of pagebreak?
The community-contributed package rtfutil provides a solution:
net describe rtfutil, from(http://fmwww.bc.edu/RePEc/bocode/r)
TITLE
'RTFUTIL': module to provide utilities for writing Rich Text Format (RTF) files
DESCRIPTION/AUTHOR(S)
The rtfutil package is a suite of file handling utilities for
producing Rich Text Format (RTF) files in Stata, possibly
containing plots and tables. These RTF files can then be opened
by Microsoft Word, and possibly by alternative free word
processors. The plots can be included by inserting, as linked
objects, graphics files that might be produced by the graph
export command in Stata. The tables can be included by using the
listtex command, downloadable from SSC, with the handle() option.
Exact syntax will depend on your specific use case, for which you do not provide any example data.
After installing rtfutil, you may use rtfappend. Suppose you want a page break between mod1 and mod2.
esttab mod0 mod1 using "file_name.rtf", replace
tempname handle
rtfappend `handle' using "file_name.rtf", replace
file write `handle' "\page" _n
rtfclose `handle'
esttab mod2 using "file_name.rtf", append
If you want a line break, just replace \page with \line.

(sas) concatenate multiple files from different folders

I'm a relatively new SAS user, so please bear with me!
I have 63 folders that each contain a uniquely named xls file, all containing the same variables in the same order. I need to concatenate them into a single file. I would post some of the code I've tried but, trust me, it's all gone horribly awry and is totally useless. Below is the basic library structure in a libname statement, though:
`libname JC "W:\JCs\JC Analyses 2016-2017\JC Data 2016-2017\2 - Received from JCs\&jcname.\2016_&jcname..xls";`
(there are 63 unique &jcname values)
Any ideas?
Thanks in advance!!!
This is a common requirement, but it requires a fairly uncommon knowledge of multiple SAS functions to execute well.
I like to approach this problem with a two step solution:
Get a list of filenames
Process each filename in a loop
While you can process each filename as you read it, it's a lot easier to debug and maintain code that separates these steps.
Step 1: Read filenames
I think the best way to get a list of filenames is to use dread() to read
directory entries into a dataset as follows:
filename myfiles 'c:\myfolder';
data filenames (keep=filename);
dir = dopen('myfiles');
do file = 1 to dnum(dir);
filename = dread(dir,file);
output;
end;
rc = dclose(dir);
run;
After this step you can verify that the correct filenames have been read by printing the dataset. You could also modify the code to only output certain types of files. I leave this as an exercise for the reader.
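As one sketch of that exercise, assuming you only want .xls files (the extension test here is illustrative, not from the original answer), you could filter before the output statement:

```sas
filename myfiles 'c:\myfolder';

data filenames (keep=filename);
  dir = dopen('myfiles');
  do file = 1 to dnum(dir);
    filename = dread(dir, file);
    /* keep only Excel files: compare the extension case-insensitively */
    if upcase(scan(filename, -1, '.')) = 'XLS' then output;
  end;
  rc = dclose(dir);
run;
```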
Step 2: use the files
Given a list of names in a dataset, I prefer to use call execute() inside a data step to process each file.
data _null_;
set filenames;
call execute('%import('||filename||')');
run;
I haven't included a macro to read in the Excel files and concatenate the dataset (partly because I don't have a suitable list of Excel files to test, but also because it's a situational problem). The stub macro below just outputs the filenames to the log, to verify that it's running:
%macro import(filename);
/* This is a dummy macro. Here is where you would do something with the file */
%put &filename;
%mend;
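For instance, a fleshed-out version of the macro might look like the sketch below. The folder path, sheet name, and target dataset (work.all) are all assumptions, not from the original question:

```sas
%macro import(filename);
  /* sketch: read one workbook, then append it to a combined dataset */
  proc import datafile="c:\myfolder\&filename"
              out=work._temp dbms=excel replace;
    sheet="Sheet1";
    getnames=yes;
    mixed=yes;
  run;

  /* force=yes tolerates minor differences in variable attributes */
  proc append base=work.all data=work._temp force;
  run;
%mend import;
```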
Notes:
Arguably there are many examples of how to do this in multiple places on the web, e.g.:
this SAS knowledge base article (http://support.sas.com/kb/41/880.html)
or this paper from SUGI.
However, most of them rely on the use of pipe to run a dir or ls command, which I feel is the wrong approach because it's platform dependent and in many modern environments the ability to pipe shell commands will be disabled.
I based this on an answer by Daniel Santos in communities.sas.com, but, given the superior functionality of stackoverflow I'd much rather see a good answer here.

SAS import excel determinated variables

I want to import only certain variables from an Excel file that has more than 40.
Would it be possible to import only the variables that I want in a data step with infile?
Thanks.
You can't use infile (without a lot of extra work, anyway) to read an excel file.
You can, however, embed keep statements in your proc import!
proc import file="path\to\myfile.xlsx"
out =mylib.mydata(keep=myvar)
dbms=excel replace;
run;
Of course, libname access allows the same keep statements.
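As a sketch of that libname equivalent (reusing the illustrative path, sheet name, and variable name from above):

```sas
/* assign an Excel engine libname, then keep only the wanted column */
libname xl excel "path\to\myfile.xlsx";

data mylib.mydata;
  set xl.'Sheet1$'n (keep=myvar);
run;

libname xl clear;
```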

Variables format in SAS proc import

When importing an Excel file into SAS, I find that the import is not done properly because the variables' formats are wrong.
The table trying to import look like this:
ID Barcode
1 56798274
2 56890263
3 60998217
4 SKU89731
...
The code I am using is the following:
PROC IMPORT OUT= WORK.test
DATAFILE= "D:\Barcode.xlsx"
DBMS=EXCEL REPLACE;
RANGE="Sheet1$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
What happens is that column "Barcode" gets the best12. format, and therefore cases like ID=4 get a missing value ('.') because they originally contained both characters and numbers.
Since it is not possible to change the format of a variable in a PROC step, how can I import the file correctly, using only the SAS editor?
EDIT:
Another option that does half the work, but might give some inspiration, is to dynamically change the format of the variable by importing through a data step:
libname excelxls Excel "D:\Barcode.xlsx";
data want;
set excelxls.'Sheet1$'n (DBSASTYPE=(Barcode='CHAR(255)'));
run;
With the above code I force SAS to import the variable in the format I want (char), but missing values are still generated for cases like ID=4.
I think your problem is that you have mixed=NO. Change this to mixed=YES and SAS will check a sample of the observations to see if there are any non-numeric characters in the variables - if it finds one then it will specify the variable as character.
Take a look at the link here for more information:
You could convert to a csv (or maybe xls?) file and either:
use the guessingrows= option to increase the number of rows SAS uses to determine type.
If you want complete control of the import: copy the data step code that proc import puts in the log and paste into your program. Now you can modify it to read the data exactly as you want.
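As a minimal sketch of that second approach, assuming the sheet has been saved as D:\Barcode.csv (the path and the $8. length are assumptions): the log-generated data step can be edited so Barcode is read with a character informat.

```sas
/* edited from the data step PROC IMPORT writes to the log */
data work.test;
  infile "D:\Barcode.csv" dsd firstobs=2;   /* skip the header row */
  input ID Barcode :$8.;                    /* read Barcode as character */
run;
```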

Importing Flat file dynamically using %MACRO and dataset values SAS

I have a folder with various flat files. There will be new files added every month and I need to import this raw data using an automated job. I have managed everything except for the final little piece.
Here's my logic:
1) I Scan the folder and get all the file names that fit a certain description
2) I store all these file names and Routes in a Dataset
3) A macro has been created to check whether the file has been imported already. If it has, nothing will happen. If it has not yet been imported, it will be imported.
The final part that I need to get right, is I need to loop through all the records in the dataset created in step 2 and execute the macro from step 3 against all file names.
What is the best way to do this?
Look into call execute for executing a macro from a data step.
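A minimal sketch, assuming the step-2 dataset is called files with a column filename, and the step-3 macro is %import_file (both names are hypothetical):

```sas
data _null_;
  set files;
  /* build each macro call as a string; it runs after the data step finishes */
  call execute(cats('%import_file(', filename, ')'));
run;
```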
The method I most often use, is to write the macro statements to a file and use %include to submit it. I guess call execute as Reeza suggested is better, but I feel more in control when I do it like this:
filename s temp;
data _null_;
set table;
file s;
put '%macrocall(' variable ');';
run;
%inc s;