I want to import only certain variables from an Excel file that has more than 40 of them.
Would it be possible to import only the variables that I want in a data step with infile?
Thanks.
You can't use infile (without a lot of extra work, anyway) to read an Excel file.
You can, however, put a keep= data set option on the out= data set in your proc import:
proc import file="path\to\myfile.xlsx"
out =mylib.mydata(keep=myvar)
dbms=excel replace;
run;
Of course, libname access allows the same keep= option.
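A minimal sketch of the libname route, assuming the worksheet is named Sheet1 (the sheet name and the libref xl are assumptions, not from the question); the path and variable name are carried over from the proc import example:

```sas
/* Assign a libref to the workbook through the Excel engine */
libname xl excel "path\to\myfile.xlsx";

data mylib.mydata;
  /* 'Sheet1$'n is an assumed sheet name; keep= limits the columns that are read */
  set xl.'Sheet1$'n (keep=myvar);
run;

/* Release the workbook so other programs can open it */
libname xl clear;
```

List more names in keep= (separated by spaces) to bring in several variables at once.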
Related
I have files with extension .fid
I want to read the data from these files, preferably with MATLAB. Is there any way to do this with a custom file format like this?
If not, is there any other way I can convert this custom file format into something else, such as .csv?
I'm not familiar with this file type, but if it's human-readable (that is, if you can open it in a text editor and see the data), then there shouldn't be a problem. You can use textscan to import the data if it's properly formatted, or fileread to import the entire file as a string. You can even use uiimport to import the data interactively. Check the MATLAB documentation on data import and export.
I am trying to import .csv files in SAS and they include dates and times.
In the csv files, they are stored as "m-d-yyyy hh:mm:ss" (I am not allowed to change the data in Excel; I have to work in SAS only).
The problem is that when SAS reads it, it thinks m is d and d is m. :(
For example, what is "9-1-2016 8:00:57" in Excel is converted to 09JAN16:08:00:57 in SAS.
I want formats like 01SEP16:08:00:57.
How can I accurately import dates from .csv files in SAS?
Thanks.
I see that you've solved this by using R instead. In case anyone wants an answer for this in future:
One thing you can try is to change your options so that SAS reads as MDY instead of DMY. The way to do this is to run:
options datestyle=mdy;
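As a rough sketch of how that option could be combined with a data step, the example below reads the sample value from the question with the anydtdtm. informat, which honors the datestyle= setting when a value is ambiguous. The data set name want and the variable name event_dt are hypothetical, and inline datalines stand in for the real csv file:

```sas
options datestyle=mdy;          /* resolve ambiguous dates as month-day-year */

data want;
  infile datalines dsd truncover;
  /* anydtdtm. guesses the layout; datestyle= settles the m-d versus d-m question */
  input event_dt :anydtdtm40.;
  /* datetime16. displays values in the ddMMMyy:hh:mm:ss style */
  format event_dt datetime16.;
  datalines;
9-1-2016 8:00:57
;
run;
```

For a real file, replace datalines in the infile statement with the quoted path to the csv and add firstobs=2 if the file has a header row.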
When importing an Excel file into SAS, I find that the import is not done properly because a variable gets the wrong format.
The table I am trying to import looks like this:
ID Barcode
1 56798274
2 56890263
3 60998217
4 SKU89731
...
The code I am using is the following:
PROC IMPORT OUT= WORK.test
DATAFILE= "D:\Barcode.xlsx"
DBMS=EXCEL REPLACE;
RANGE="Sheet1$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
What happens is that the column "Barcode" gets the best12. format, and therefore rows such as ID=4 get a missing value ('.') because they originally contained both characters and numbers.
Since it is not possible to change the format of a variable in a proc step, how can I import the file correctly using only the SAS editor?
EDIT:
Another option that does half the work, and that might give some inspiration, is to change the type of the variable dynamically by importing through a data step:
libname excelxls Excel "D:\Barcode.xlsx";
data want;
set excelxls.'Sheet1$'n (DBSASTYPE=(Barcode='CHAR(255)'));
run;
With the above code I force SAS to import the variable with the type I want (char), but missing values are still generated for rows such as ID=4.
I think your problem is that you have mixed=NO. Change this to mixed=YES and SAS will check a sample of the observations to see if there are any non-numeric characters in the variables - if it finds one then it will specify the variable as character.
You could convert to a csv (or maybe xls?) file and either:
1) Use the guessingrows= option to increase the number of rows SAS uses to determine the type.
2) If you want complete control of the import, copy the data step code that proc import writes to the log and paste it into your program. You can then modify it to read the data exactly as you want.
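Both options can be sketched roughly as follows, assuming the workbook has been saved as D:\Barcode.csv (a hypothetical file name) with the two columns shown in the question; the $20. length is also an assumption:

```sas
/* Option 1: let proc import scan more rows before deciding each column's type */
proc import datafile="D:\Barcode.csv"
    out=work.test
    dbms=csv replace;
  getnames=yes;
  guessingrows=32767;
run;

/* Option 2: a hand-written data step that declares Barcode as character */
data work.test;
  infile "D:\Barcode.csv" dsd firstobs=2 truncover;
  input ID Barcode :$20.;
run;
```

The second form never guesses, so a value such as SKU89731 is kept as text no matter how far down the file it appears.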
I have a folder with various flat files. There will be new files added every month and I need to import this raw data using an automated job. I have managed everything except for the final little piece.
Here's my logic:
1) I scan the folder and get all the file names that fit a certain description.
2) I store all these file names and paths in a dataset.
3) A macro has been created to check whether the file has been imported already. If it has, nothing will happen. If it has not yet been imported, it will be imported.
The final part that I need to get right is that I need to loop through all the records in the dataset created in step 2 and execute the macro from step 3 against all file names.
What is the best way to do this?
Look into call execute for executing a macro from a data step.
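A minimal sketch of that approach, assuming the dataset from step 2 is called work.file_list, holds the full path in a variable named filepath, and the macro from step 3 is named %import_if_new with one positional parameter (all three names are hypothetical):

```sas
data _null_;
  set work.file_list;
  /* %nrstr delays macro execution until after this data step has finished, */
  /* so each generated call runs as ordinary submitted code.                */
  call execute(cats('%nrstr(%import_if_new)(', filepath, ')'));
run;
```

One macro call is queued per observation, and the calls run in order once the data step ends.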
The method I most often use is to write the macro calls to a file and use %include to submit it. I guess call execute, as Reeza suggested, is better, but I feel more in control when I do it like this:
filename s temp;
data _null_;
set table;
file s;
put '%macrocall(' variable ');';
run;
%inc s;
I am trying to import multiple Excel files using the code below. There is a column in each Excel file that has both numeric and text values, but proc import only imports the numeric values and sets the text values to missing ('.').
Can anyone help me with this issue? Thanks much.
%let subdir=S:\Temp\;
filename dir "&subdir.*.xls";
data new;
length filename fname $ 32767;
infile dir eof=last filename=fname;
input ;
last: filename=fname;
run;
proc sort data=new nodupkey;
by filename;
run;
data _null_;
set new end=last;
call symputx(cats('filename',_n_),filename);
call symputx(cats('dsn',_n_),scan(scan(filename,7,'\'),1,'.'));
if last then call symputx('nobs',_n_);
run;
%put &nobs;
%macro import;
%do i=1 %to &nobs;
proc import datafile="&&filename&i" out=&&dsn&i
dbms=excel replace;
sheet = "Sheet1";
getnames=yes;
mixed=yes;
run;
%end;
%mend import;
%import
The best way to control the data types in an imported Excel workbook is to use the DBSASTYPE data set option with a libname. This is especially useful when dealing with other data types (like datetime and time values).
For example, let's assume that the affected column is named MY_VAR and should always be read as character with a max length of 30. And let's also assume you have a spreadsheet column named START_TIME that contains an Excel coded date and time stamp. Your macro might be revised like this:
libname x "&&filename&i";
data &&dsn&i;
set x.'Sheet1$'n(dbsastype=(MY_VAR='CHAR(30)' START_TIME='DATETIME'));
run;
libname x clear;
As long as you know the name of the Excel column causing the problem, this should work well.
Mixed=Yes should fix things for you, but if it doesn't, there are a few solutions.
First off, you may want to check the registry setting that controls how many rows are scanned to guess each column's type (TypeGuessRows). You can see one possible location here:
http://support.sas.com/kb/35/563.html
HKEY_LOCAL_MACHINE ► Software ► Microsoft ► Office ► 12.0 ► Access Connectivity Engine ► Engines
If you have an older version of Office (pre-2007), it's called the "JET Engine" and is located in a slightly different place (you can search for it). Your "12.0" may be different depending on what you have installed (12.0 is Office 2007).
Second, you can force columns to be particular types. The DBSASTYPE option is what you need for this; see http://www2.sas.com/proceedings/sugi31/020-31.pdf for an example (about in the middle of the document, search for DBSASTYPE).