Related
Can someone help me to understand how I can save in matlab a group of .csv files, select only the columns in which I am interested and get as output a final file in which I have the average value of the y columns and standard deviation of y axes? I am not so good in matlab and so I kindly ask if someone to help me to solve this question.
Here what I tried to do till now:
clear all;
clc;
which_column = 5;
dirstats = dir('*.csv');
col3Complete=0;
col4Complete=0;
for K = 1:length(dirstats)
[num,txt,raw] = xlsread(dirstats(K).name);
col3=num(:,3);
col4=num(:,4);
col3Complete=[col3Complete;col3];
col4Complete=[col4Complete;col4];
avgVal(K)=mean(col4(:));
end
col3Complete(1)=[];
col4Complete(1)=[];
%columnavg = mean(col4Complete);
%columnstd = std(col4Complete);
% xvals = 1 : size(columnavg,1);
% plot(xvals, columnavg, 'b-', xvals, columnavg-columnstd, 'r--', xvals, columnavg+columstd, 'r--');
B = reshape(col4Complete,[5000,K]);
m=mean(B,2);
C = reshape (col4Complete,[5000,K]);
S=std(C,0,2);
Now I know that I should compute mean and stdeviation inside for loop, using mean()function, but I am not sure how I can use it.
which_column = 5;
dirstats = dir('*.csv');
col3Complete=[]; % Initialise as empty matrix
col4Complete=[];
avgVal = zeros(length(dirstats),2); % initialise as columnvector
for K = 1:length(dirstats)
[num,txt,raw] = xlsread(dirstats(K).name);
col3=num(:,3);
col4=num(:,4);
col3Complete=[col3Complete;col3];
col4Complete=[col4Complete;col4];
avgVal(K,1)=mean(col4(:)); % 1st column contains mean
avgVal(K,2)=std(col4(:)); % 2nd column contains standard deviation
end
%columnavg = mean(col4Complete);
%columnstd = std(col4Complete);
% xvals = 1 : size(columnavg,1);
% plot(xvals, columnavg, 'b-', xvals, columnavg-columnstd, 'r--', xvals, columnavg+columstd, 'r--');
B = reshape(col4Complete,[5000,K]);
meanVals=mean(B,2);
I didn't change much, just initialised your arrays as empty arrays so you do not have to delete the first entry later on and made avgVal a column vector with the mean in column 1 and the standard deviation in column 1. You can of course add two columns if you want to collect those statistics for your 3rd column in the csv as well.
As a side note: xlsread is rather heavy for reading files, since Excel is horribly inefficient. If you want to read a structured file such as a csv, it's faster to use importdata.
Create some random matrix to store in a file with header:
A = rand(1e3,5);
out = fopen('output.csv','w');
fprintf(out,['ColumnA', '\t', 'ColumnB', '\t', 'ColumnC', '\t', 'ColumnD', '\t', 'ColumnE','\n']);
fclose(out);
dlmwrite('output.csv', A, 'delimiter','\t','-append');
Load it using csvread:
data = csvread('output.csv',1);
data now contains your five columns, without any headers.
My code is below. In the code, I am evaluating only the data in the 'fb2010' file. I want to add other files" 'fb2020', 'fb2030', and 'fb2040' and evaluate their data by the same code. My question is how to apply a for loop and include the other data files. I tried, but I got confused by the for loop.
load('fb2010'); % loading the data
x = fb2010(3:1:1502,:);
% y_filt = filter(b,a,x); % filtering the received signal
y_filt= filter(b,a,x,[],2);
%%%%%%% fourier transform
nfft = length(y_filt);
res = fft(y_filt,nfft,2)/nfft;
res2 = res(:,1:nfft/2+1); %%%% taking single sided spectrum
res3 = fft(res2,[],2);
for i = 3:1:1500 %%%% dividing each row by first row.
resd(i,:) = res3(i,:)./res3(1,:);
end
I'm assuming that your files are MAT-files, not ASCII. You can do this by having load return a struct and using dynamic field referencing:
n = 4;
for i = 1:n
vname = ['fb20' int2str(i) '0']; % Create file/variable name based on index
s = load(vname); % Load data as struct (overwriting previous s)
x = s.(vname)(3:1:1502,:); % Access struct with dynamic field reference
% Rest of your code
...
end
If you're using a plain ASCII file, load won't produce a struct. However, such files are much simpler (see documentation for load/save). The following code would probably work:
n = 4;
for i = 1:n
vname = ['fb20' int2str(i) '0']; % Create file/variable name based on index
s = load(vname); % Load data as matrix (overwriting previous s)
x = s(3:1:1502,:); % Directly index matrix
% Rest of your code
...
end
It would be a good idea to add the file extension to your load command to make your code more readable.
I have large .bin files (10GB-60GB) created by Labview software, the .bin files represent the output of two sensors used from experiments that I have done.
The problem I have is importing the data into Matlab, the only way I have achieved this so far is by converting the .bin files to .txt files in Labview software then Importing the data into MATLAB using the following code:
Nlines = 1e6; % set number of lines to sample per cycle
sample_rate = (1); %sample rate
DECE= 1000;% decimation factor
TIME = (0:sample_rate:sample_rate*((Nlines)-1));%first inctance of time vector
format = '%f\t%f';
fid = fopen('H:\PhD backup\Data/ONK_PP260_G_text.txt');
while(~feof(fid))
C = textscan(fid, format, Nlines, 'CollectOutput', true);
d = C{1}; % immediately clear C at this point you need the memory!
clearvars C ;
TIME = ((TIME(end)+sample_rate):sample_rate:(sample_rate*(size(d,1)))+(TIME(end)));%shift Time along
plot((TIME(1:DECE:end)),(d(1:DECE:end,:)))%plot and decimate
hold on;
clearvars d;
end
fclose(fid);
The basic idea behind my code is to conserve RAM by reading Nlines of data from .txt on disk to Matlab variable C in RAM, plotting C then clearing C. This process occurs in loop so the data is plotted in chunks until the end of the .txt file is reached.
I want to read the .bin file directly into MATLAB rather than converting it to .txt first because it takes hours for the conversion to complete and I have a lot of data. Here are some examples of my data but in manageable sizes:
https://www.dropbox.com/sh/rzut4zbrert9fm0/q9SiZYmrdG
Here is a description of the binary data:
http://forums.ni.com/t5/LabVIEW/Loading-Labview-Binary-Data-into-Matlab/td-p/1107587
Someone has all ready written a Matlab script to import Labveiw .bin files but their script will only work with very small files:
% LABVIEWLOAD Load Labview binary data into Matlab
%
% DATA = LABVIEWLOAD(FNAME,DIM); % Loads the Labview data in binary
% format from the file specified by FNAME, given the NUMBER of
% dimensions (not the actual dimensions of the data in the binary
% file) of dimensions specified by DIM.
%
% LABVIEWLOAD will repeatedly grab data until the end of the file.
% Labview arrays of the same number of dimensions can be repeatedly
% appended to the same binary file. Labview arrays of any dimensions
% can be read.
%
% DATA = LABVIEWLOAD(FNAME,DIM,PREC); % Loads the data with the specified
% precision, PREC.
%
% Note: This script assumes the default parameters were used in the
% Labview VI "Write to Binary File". Labview uses the Big Endian
% binary number format.
%
% Examples:
% D = labviewload('Data.bin',2); % Loads in Data.bin assuming it
% contains double precision data and two dimensions.
%
% D = labviewload('OthereData.bin',3,'int8'); % Loads in
% OtherData.bin assuming it contains 8 bit integer values or
% boolean values.
%
% Jeremiah Smith
% 4/8/10
% Last Edit: 5/6/10
function data = labviewload(fname,dim,varargin)
siz = [2^32 2^16 2^8 1]'; % Array dimension conversion table
% Precision Input
if nargin == 2
prec = 'double';
elseif nargin == 3
prec = varargin{1};
else
error('Too many inputs.')
end
%% Initialize Values
fid = fopen(fname,'r','ieee-be'); % Open for reading and set to big-endian binary format
fsize = dir(fname); % File information
fsize = fsize.bytes; % Files size in bytes
%% Preallocation
rows = [];
columns = [];
I = 0;
while fsize ~= ftell(fid)
dims = [];
for i=1:1:dim
temp = fread(fid,4);
temp = sum(siz.*temp);
dims = [dims,temp];
end
I = I + 1;
% fseek(fid,prod(dims)*8,'cof'); % Skip the actual data
temp = fread(fid,prod(dims),prec,0,'ieee-be'); % Skip the actual data (much faster for some reason)
end
fseek(fid,0,'bof'); % Reset the cursor
data = repmat({NaN*ones(dims)},I,1); % Preallocate space, assumes each section is the same
%% Load and parse data
for j=1:1:I
dims = []; % Actual array dimensions
for i=1:1:dim
temp = fread(fid,4);
temp = sum(siz.*temp);
dims = [dims,temp];
end
clear temp i
temp = fread(fid,prod(dims),prec,0,'ieee-be'); % 11 is the values per row,
% double is the data type, 0 is the bytes to skip, and
% ieee-be specified big endian binary data
%% Reshape the data into the correct array configuration
if dim == 1
temp = reshape(temp,1,dims);
else
evalfunc = 'temp = reshape(temp';
for i=1:1:dim
evalfunc = [evalfunc ',' num2str(dims(dim-i+1))];
end
if dim ~= 2
eval([evalfunc ');'])
else
eval([evalfunc ')'';'])
end
end
data{j} = temp; % Save the data
end
fclose(fid); % Close the file
The code has the following error message, when you try to process even relatively small .bin files:
Error using ones
Maximum variable size allowed by the program is exceeded.
Error in labviewload (line 65)
data = repmat({NaN*ones(dims)},I,1); % Preallocate space, assumes each section is the same
Can you help me modify the code so that I can open large .bin files? Any help will be much appreciated.
Cheers,
Jim
I'm new to Matlab.
I'm trying to apply PCA function(URL listed below)into my palm print recognition program to generate the eigenpalms. My palm print grey scale images dimension are 450*400.
Before using it, I was trying to study these codes and add some codes to save the eigenvector as .mat file. Some of the %comments added by me for my self understanding.
After a few days of studying, I still unable to get the answers.
I decided to ask for helps.I have a few questions to ask regarding this PCA.m.
PCA.m
What is the input of the "options" should be? of "PCA(data,details,options)"
(is it an integer for reduced dimension? I was trying to figure out where is the "options" value passing, but still unable to get the ans. The msgbox of "h & h2", is to check the codes run until where. I was trying to use integer of 10, but the PCA.m processed dimension are 400*400.)
The "eigvector" that I save as ".mat" file is ready to perform Euclidean distance classifier with other eigenvector? (I'm thinking that eigvector is equal to eigenpalm, like in face recognition, the eigen faces. I was trying to convert the eigenvector matrix back to image, but the image after PCA process is in Black and many dots on it)
mySVD.m
In this function, there are two values that can be changed, which are MAX_MATRIX_SIZE set by 1600 and EIGVECTOR_RATIO set by 0.1%. May I know these values will affect the results? ( I was trying to play around with the values, but I cant see the different. My palm print image dimension is set by 450*400, so the Max_matrix_size should set at 180,000?)
** I hope you guys able to understand what I'm asking, please help, Thanks guys (=
Original Version : http://www.cad.zju.edu.cn/home/dengcai/Data/code/PCA.m
mySVD: http://www.cad.zju.edu.cn/home/dengcai/Data/code/mySVD.m
% Edited Version by me
function [eigvector, eigvalue] = PCA(data,details,options)
%PCA Principal Component Analysis
%
% Usage:
% [eigvector, eigvalue] = PCA(data, options)
% [eigvector, eigvalue] = PCA(data)
%
% Input:
% data - Data matrix. Each row vector of fea is a data point.
% fea = finite element analysis ?????
% options.ReducedDim - The dimensionality of the reduced subspace. If 0,
% all the dimensions will be kept.
% Default is 0.
%
% Output:
% eigvector - Each column is an embedding function, for a new
% data point (row vector) x, y = x*eigvector
% will be the embedding result of x.
% eigvalue - The sorted eigvalue of PCA eigen-problem.
%
% Examples:
% fea = rand(7,10);
% options=[]; %store an empty matrix in options
% options.ReducedDim=4;
% [eigvector,eigvalue] = PCA(fea,4);
% Y = fea*eigvector;
%
% version 3.0 --Dec/2011
% version 2.2 --Feb/2009
% version 2.1 --June/2007
% version 2.0 --May/2007
% version 1.1 --Feb/2006
% version 1.0 --April/2004
%
% Written by Deng Cai (dengcai AT gmail.com)
%
if (~exist('options','var'))
%A = exist('name','kind')
% var = Checks only for variables.
%http://www.mathworks.com/help/matlab/matlab_prog/symbol-reference.html#bsv2dx9-1
%The tilde "~" character is used in comparing arrays for unequal values,
%finding the logical NOT of an array,
%and as a placeholder for an input or output argument you want to omit from a function call.
options = [];
end
h2 = msgbox('not yet');
ReducedDim = 0;
if isfield(options,'ReducedDim')
%tf = isfield(S, 'fieldname')
h2 = msgbox('checked');
ReducedDim = options.ReducedDim;
end
[nSmp,nFea] = size(data);
if (ReducedDim > nFea) || (ReducedDim <=0)
ReducedDim = nFea;
end
if issparse(data)
data = full(data);
end
sampleMean = mean(data,1);
data = (data - repmat(sampleMean,nSmp,1));
[eigvector, eigvalue] = mySVD(data',ReducedDim);
eigvalue = full(diag(eigvalue)).^2;
if isfield(options,'PCARatio')
sumEig = sum(eigvalue);
sumEig = sumEig*options.PCARatio;
sumNow = 0;
for idx = 1:length(eigvalue)
sumNow = sumNow + eigvalue(idx);
if sumNow >= sumEig
break;
end
end
eigvector = eigvector(:,1:idx);
end
%dt get from C# program, user ID and name
evFolder = 'ev\';
userIDName = details; %get ID and Name
userIDNameWE = strcat(userIDName,'\');%get ID and Name with extension
filePath = fullfile('C:\Users\***\Desktop\Data Collection\');
userIDNameFolder = strcat(filePath,userIDNameWE); %ID and Name folder
userIDNameEVFolder = strcat(userIDNameFolder,evFolder);%EV folder in ID and Name Folder
userIDNameEVFile = strcat(userIDNameEVFolder,userIDName); % EV file with ID and Name
if ~exist(userIDNameEVFolder, 'dir')
mkdir(userIDNameEVFolder);
end
newFile = strcat(userIDNameEVFile,'_1');
searchMat = strcat(newFile,'.mat');
if exist(searchMat, 'file')
filePattern = strcat(userIDNameEVFile,'_');
D = dir([userIDNameEVFolder, '*.mat']);
Num = length(D(not([D.isdir])))
Num=Num+1;
fileName = [filePattern,num2str(Num)];
save(fileName,'eigvector');
else
newFile = strcat(userIDNameEVFile,'_1');
save(newFile,'eigvector');
end
You pass options in a structure, for instance:
options.ReducedDim = 2;
or
options.PCARatio =0.4;
The option ReducedDim selects the number of dimensions you want to use to represent the final projection of the original matrix. For instance if you pick option.ReducedDim = 2 you use only the two eigenvectors with largest eigenvalues (the two principal components) to represent your data (in effect the PCA will return the two eigenvectors with largest eigenvalues).
PCARatio instead allows you to pick the number of dimensions as the first eigenvectors with largest eigenvalues that account for fraction PCARatio of the total sum of eigenvalues.
In mySVD.m, I would not increase the default values unless you expect more than 1600 eigenvectors to be necessary to describe your dataset. I think you can safely leave the default values.
I've googled and searched through Matlab Central, but cannot find any way to open DBF files directly in Matlab. There are some references to DBFREAD function in TMW File Exchange, but it's not available anymore. Is it really a problem?
I do have Database toolbox, but could not find dbf support there.
I don't want to use Excel or other tools for converting files outside of Matlab, since I have a lot of files to process. ODBC also is not good, I need the code to work under Mac and Unix as well.
Please help.
I contacted with Brian Madsen, the author of DBFREAD function, which was deleted from File Exchange, probably because The Mathworks is going to include this function into MATLAB in some future release. Brian kindly gave me permission to publish this function here. All copyright information left intact. I only modified lines 33-38 to allow DBFREAD to read file outside working directory.
function [dbfData, dbfFieldNames] = dbfread(filename, records2read, requestedFieldNames)
%DBFREAD Read the specified records and fields from a DBF file.
%
% [DATA, NAMES] = DBFREAD(FILE) reads numeric, float, character and date
% data and field names from a DBF file, FILE.
%
% [DATA, NAMES] = DBFREAD(FILE, RECORDS2READ) reads only the record
% numbers specified in RECORDS2READ, a scalar or vector.
%
% [DATA, NAMES] = DBFREAD(FILE, RECORDS2READ, REQUESTEDFIELDNAMES) reads
% the data from the fields, REQUESTEDFIELDNAMES, for the specified
% records. REQUESTEDFIELDNAMES must be a cell array. The fields in the
% output will follow the order given in REQUESTEDFIELDNAMES.
%
% Examples:
%
% % Get all records and a list of the field names from a DBF file.
% [DATA,NAMES] = dbfread('c:\matlab\work\mydbf')
%
% % Get data from records 3:5 and 10 from a DBF file.
% DATA = dbfread('c:\matlab\work\mydbf',[3:5,10])
%
% % Get data from records 1:10 for three of the fields in a DBF file.
% DATA = dbfread('c:\matlab\work\mydbf',1:10,{'FIELD1' 'FIELD3' 'FIELD5'})
%
% See also XLSREAD, DLMREAD, DLMWRITE, LOAD, FILEFORMATS, TEXTSCAN.
% Copyright 2008 The MathWorks, Inc.
% $Revision: 1.0 $ $Date: 2008/04/18 05:58:17 $
[pathstr,name,ext] = fileparts(filename);
dbfFileId = fopen(filename,'r','ieee-le');
if (dbfFileId == -1)
dbfFileId = fopen(fullfile(pathstr, [name '.dbf']),'r','ieee-le');
end
if (dbfFileId == -1)
dbfFileId = fopen(fullfile(pathstr, [name '.DBF']),'r','ieee-le');
end
if (dbfFileId == -1)
eid = sprintf('MATLAB:%s:missingDBF', mfilename);
msg = sprintf('Failed to open file %s.dbf or file %s.DBF.',...
name, name);
error(eid,'%s',msg)
end
info = dbfinfo(dbfFileId);
if ~exist('requestedFieldNames','var')
dbfFieldNames = {info.FieldInfo.Name};
requestedFieldNames = dbfFieldNames;
else
dbfFieldNames = (info.FieldInfo(matchFieldNames(info,requestedFieldNames)).Name);
end
fields2read = matchFieldNames(info,requestedFieldNames);
% The first byte in each record is a deletion indicator
lengthOfDeletionIndicator = 1;
if ~exist('records2read','var')
records2read = (1:info.NumRecords);
elseif max(records2read) > info.NumRecords
eid = sprintf('MATLAB:%s:invalidRecordNumber', mfilename);
msg = sprintf('Record number %d does not exist, please select from the range 1:%d.',...
max(records2read), info.NumRecords);
error(eid,'%s',msg)
end
% Loop over the requested fields, reading in the data
dbfData = cell(numel(records2read),numel(fields2read));
for k = 1:numel(fields2read),
n = fields2read(k);
fieldOffset = info.HeaderLength ...
+ sum([info.FieldInfo(1:(n-1)).Length]) ...
+ lengthOfDeletionIndicator;
fseek(dbfFileId,fieldOffset,'bof');
formatString = sprintf('%d*uint8=>char',info.FieldInfo(n).Length);
skip = info.RecordLength - info.FieldInfo(n).Length;
data = fread(dbfFileId,[info.FieldInfo(n).Length info.NumRecords],formatString,skip);
dbfData(:,k) = feval(info.FieldInfo(n).ConvFunc,(data(:,records2read)'));
% dbfData(:,k) = info.FieldInfo(n).ConvFunc(data(:,records2read)');
end
fclose(dbfFileId);
%--------------------------------------------------------------------------
function fields2read = matchFieldNames(info, requestedFieldNames)
% Determine which fields to read.
allFieldNames = {info.FieldInfo.Name};
if isempty(requestedFieldNames)
if ~iscell(requestedFieldNames)
% Default case: User omitted the parameter, return all fields.
fields2read = 1:info.NumFields;
else
% User supplied '{}', skip all fields.
fields2read = [];
end
else
% Match up field names to see which to return.
fields2read = [];
for k = 1:numel(requestedFieldNames)
index = strmatch(requestedFieldNames{k},allFieldNames,'exact');
if isempty(index)
wid = sprintf('MATLAB:%s:nonexistentDBFName',mfilename);
wrn = sprintf('DBF name ''%s'' %s\n%s',requestedFieldNames{k},...
'doesn''t match an existing DBF name.',...
' It will be ignored.');
warning(wid,wrn)
end
for l = 1:numel(index)
% Take them all in case of duplicate names.
fields2read(end+1) = index(l);
end
end
end
%--------------------------------------------------------------------------
function info = dbfinfo(fid)
%DBFINFO Read header information from DBF file.
% FID File identifier for an open DBF file.
% INFO is a structure with the following fields:
% Filename Char array containing the name of the file that was read
% DBFVersion Number specifying the file format version
% FileModDate A string containing the modification date of the file
% NumRecords A number specifying the number of records in the table
% NumFields A number specifying the number of fields in the table
% FieldInfo A 1-by-numFields structure array with fields:
% Name A string containing the field name
% Type A string containing the field type
% ConvFunc A function handle to convert from DBF to MATLAB type
% Length A number of bytes in the field
% HeaderLength A number specifying length of the file header in bytes
% RecordLength A number specifying length of each record in bytes
% Copyright 1996-2005 The MathWorks, Inc.
% $Revision: 1.1.10.4 $ $Date: 2005/11/15 01:07:13 $
[version, date, numRecords, headerLength, recordLength] = readFileInfo(fid);
fieldInfo = getFieldInfo(fid);
info.Filename = fopen(fid);
info.DBFVersion = version;
info.FileModDate = date;
info.NumRecords = numRecords;
info.NumFields = length(fieldInfo);
info.FieldInfo = fieldInfo;
info.HeaderLength = headerLength;
info.RecordLength = recordLength;
%----------------------------------------------------------------------------
function [version, date, numRecords, headerLength, recordLength] = readFileInfo(fid)
% Read from File Header.
fseek(fid,0,'bof');
version = fread(fid,1,'uint8');
year = fread(fid,1,'uint8') + 1900;
month = fread(fid,1,'uint8');
day = fread(fid,1,'uint8');
dateVector = datevec(sprintf('%d/%d/%d',month,day,year));
dateForm = 1;% dd-mmm-yyyy
date = datestr(dateVector,dateForm);
numRecords = fread(fid,1,'uint32');
headerLength = fread(fid,1,'uint16');
recordLength = fread(fid,1,'uint16');
%----------------------------------------------------------------------------
function fieldInfo = getFieldInfo(fid)
% Form FieldInfo by reading Field Descriptor Array.
%
% FieldInfo is a 1-by-numFields structure array with the following fields:
% Name A string containing the field name
% Type A string containing the field type
% ConvFunc A function handle to convert from DBF to MATLAB type
% Length A number equal to the length of the field in bytes
lengthOfLeadingBlock = 32;
lengthOfDescriptorBlock = 32;
lengthOfTerminator = 1;
fieldNameOffset = 16; % Within table field descriptor
fieldNameLength = 11;
% Get number of fields.
fseek(fid,8,'bof');
headerLength = fread(fid,1,'uint16');
numFields = (headerLength - lengthOfLeadingBlock - lengthOfTerminator)...
/ lengthOfDescriptorBlock;
% Read field lengths.
fseek(fid,lengthOfLeadingBlock + fieldNameOffset,'bof');
lengths = fread(fid,[1 numFields],'uint8',lengthOfDescriptorBlock - 1);
% Read the field names.
fseek(fid,lengthOfLeadingBlock,'bof');
data = fread(fid,[fieldNameLength numFields],...
sprintf('%d*uint8=>char',fieldNameLength),...
lengthOfDescriptorBlock - fieldNameLength);
data(data == 0) = ' '; % Replace nulls with blanks
names = cellstr(data')';
% Read field types.
fseek(fid,lengthOfLeadingBlock + fieldNameLength,'bof');
dbftypes = fread(fid,[numFields 1],'uint8=>char',lengthOfDescriptorBlock - 1);
% Convert DBF field types to MATLAB types.
typeConv = dbftype2matlab(upper(dbftypes));
% Return a struct array.
fieldInfo = cell2struct(...
[names; {typeConv.MATLABType}; {typeConv.ConvFunc}; num2cell(lengths)],...
{'Name', 'Type', 'ConvFunc', 'Length'},1)';
%----------------------------------------------------------------------------
function typeConv = dbftype2matlab(dbftypes)
% Construct struct array with MATLAB types & conversion function handles.
typeLUT = ...
{'N', 'double', #str2double2cell;... % DBF numeric
'F', 'double', #str2double2cell;... % DBF float
'C', 'char', #cellstr;... % DBF character
'D', 'char', #cellstr}; % DBF date
unsupported = struct('MATLABType', 'unsupported', ...
'ConvFunc', #cellstr);
% Unsupported types: Logical,Memo,N/ANameVariable,Binary,General,Picture
numFields = length(dbftypes);
if numFields ~= 0
typeConv(numFields) = struct('MATLABType',[],'ConvFunc',[]);
end
for k = 1:numFields
idx = strmatch(dbftypes(k),typeLUT(:,1));
if ~isempty(idx)
typeConv(k).MATLABType = typeLUT{idx,2};
typeConv(k).ConvFunc = typeLUT{idx,3};
else
typeConv(k) = unsupported;
end
end
%----------------------------------------------------------------------------
function out = str2double2cell(in)
% Translate IN, an M-by-N array of class char, to an M-by-1 column vector
% OUT, of class double. IN may be blank- or null-padded. If IN(k,:) does
% not represent a valid scalar value, then OUT(k) has value NaN.
if isempty(in)
out = {[NaN]};
return
end
% Use sprintf when possible, but fall back to str2double for unusual cases.
fmt = sprintf('%%%df',size(in,2));
[data count] = sscanf(reshape(in',[1 numel(in)]),fmt);
if count == size(in,1)
out = cell(count,1);
for k = 1:count
out{k} = data(k);
end
else
out = num2cell(str2double(cellstr(in)));
end
UPDATE
STR2DOUBLE2CELL subfunction sometimes works incorrectly if number of digits in the input parameter is different (see this discussion).
Here is my version of STR2DOUBLE2CELL:
function out = str2double2cell(in)
% Translate IN, an M-by-N array of class char, to an M-by-1 column vector
% OUT, of class double. IN may be blank- or null-padded. If IN(k,:) does
% not represent a valid scalar value, then OUT(k) has value NaN.
if isempty(in)
out = {[NaN]};
return
end
out = cellfun(#str2double,cellstr(in),'UniformOutput',false);
The way I see it, you have two options:
Method 1: use ODBC to read the dBASE files:
This requires the database toolbox
cd 'path/to/dbf/files/'
conn = database('dBASE Files', '', '');
cur = exec(conn, 'select * from table');
res = fetch(cur);
res.Data
close(conn)
'dBASE Files' is an ODBC Data Source Name (DSN) (I believe its installed by default with MS Office). It uses the current directory to look for .dbf files.
Or maybe you can use a DSN-less connection string with something like:
driver = 'sun.jdbc.odbc.JdbcOdbcDriver';
url = 'jdbc:odbc:DRIVER={Microsoft dBase Driver (*.dbf)};DBQ=x:\path;DefaultDir=x:\path';
conn = database('DB', '', '', driver, url);
...
if this gives you trouble, try using the FoxPro ODBC Driver instead..
For Linux/Unix, the same thing could be done. A quick search revealed these:
How do set up an ODBC DSN on Mac or Linux/Unix
how to create and use dBase-format files with OpenOffice
Method 2: read/write .dbf files directly with a library
There's a Java library available for free from SVConsulting, that allows you to read/write DBF files: JDBF. (UPDATE: link seems to be dead, use Wayback Machine to access it)
You can use Java classes directly from MATLAB. Refer to the documentation to see how.
If you are interested only in numerical values, try xlsread command.