Sequential import of datafiles according to rule in Matlab - matlab

I have a list of .txt data files to import. Suppose they are named
file100data.txt file101data.txt ... file109data.txt and I want to import them all using readtable.
I tried using a for loop over a vector a = [0:9] so that MATLAB would repeat the readtable command for each value, but I cannot make it work.
for a = [0:9]
T_a_ = readtable('file10_a_data.txt')
end
I know I cannot just put _a_ where I want the vector to loop through, so my question is how can I actually do it?
Thank you in advance!

Here is a solution that should work even if some files are missing from your folder (e.g. you have file100data.txt to file107data.txt, but file108data.txt and file109data.txt are missing):
files=dir('file10*data.txt'); %list all data files in your folder
nof=size(files,1); %number of files
for i=1:nof %loop over the number of files
table_index=files(i).name(7) %recover table index from data filename
eval(sprintf('T%s = readtable(files(i).name)', table_index)); %read table
end
Now, please note that it is generally regarded as poor practice to dynamically name variables in Matlab (see this post for example). You may want to resort to structures or cells to store your data.

You need to convert the value of a into a string and combine strings together, like this:
Tables = struct();
for a = 0:9
% note: using dynamic structure field names to store the imported tables
fname = ['file10' num2str(a) 'data']; % e.g. 'file100data', matching the real file names
Tables.(fname) = readtable([fname '.txt']);
end
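As a variation on the loop above, the file name can also be built with sprintf and the tables kept in a cell array instead of separately named variables. This is only a minimal sketch: the names fileNames and tables are made up for illustration, and the exist check is an added assumption so that missing files are skipped rather than raising an error.

```matlab
idx = 0:9;                               % trailing digits of the file names
fileNames = cell(1, numel(idx));         % hypothetical list of generated names
tables = cell(1, numel(idx));            % one slot per file; stays empty if the file is missing
for k = 1:numel(idx)
    fileNames{k} = sprintf('file10%ddata.txt', idx(k));
    if exist(fileNames{k}, 'file') == 2  % read only files that are actually present
        tables{k} = readtable(fileNames{k});
    end
end
```

The table for file103data.txt would then be tables{4}, so the index carries the information that the variable name was meant to carry.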

Related

How to perform processing on .csv files used by my code for each person in one go without defining functions for each person in Matlab?

I have a main script named process.m in which I define the paths to 4 different .csv files used to calculate values for one person. If I have a list of 30 persons and I don't want to define process.m as a separate function for each person, how can I do the processing for all the persons in one go? I would like process.m itself to pick the files for one person, generate the results, then move to the next person, pick that person's .csv files, generate the results, and so on.
A brief outline of my code is attached here to illustrate the problem.
file1='Joseph_front.csv';
file2='Joseph_side.csv';
file3='Joseph_knee.csv';
file4='Joseph_back.csv';
A1=initiate2(file1); %initiate2 function reads the csv and performs some filtering operations on the image in .csv format
A2=initiate2(file2);
A3=initiate2(file3);
A4=initiate2(file4);
%%omitted large part of code
cal(1) = p+q+r*s;
cal(2) = p+q+r+s;
cal(3) = p+q+r-s;
cal=cal'
%code to write the calculation in excel file
excelfile= 'test.xlsx';
xlswrite(excelfile,ValuesInInches,'Joseph_data',posColumn);
To describe it further: I want my code to process all 30 people at once by selecting and picking the files itself. I have already done this by turning the same code into a function for each person, but that is not very efficient, because any small change has to be made in every function, which means one change needs to be edited in 30 functions. Please suggest an efficient way to do it.
Note: All persons .csv files are named in the same manner and exist in the current folder.
I am assuming all of your files are in one directory and there are no other files in it.
This portion of code will get the available filenames.
listFiles = dir('path of the directory');
filenames = strings; % an empty string array to save the filenames
j = 1;
for i = 1:1:length(listFiles)
if ~listFiles(i).isdir % to avoid the directory names
filenames(j,1) = listFiles(i).name;
j = j+1;
end
end
Now, there are 4 files for each person, so the loop should take 4 files at a time.
for ii = 1:4:length(filenames)
file1=filenames(ii);
file2=filenames(ii+1);
file3=filenames(ii+2);
file4=filenames(ii+3);
%% continue with your code
end
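Stepping through the listing four files at a time relies on dir returning each person's files next to each other and in a fixed order. An alternative sketch, assuming the naming pattern shown in the question (name, underscore, view, .csv), builds the four names from a list of persons instead. The second name 'Maria' and the variable names persons, views and files are made up for illustration.

```matlab
persons = {'Joseph', 'Maria'};               % hypothetical list; 'Maria' is invented
views = {'front', 'side', 'knee', 'back'};   % suffixes taken from the question
files = cell(numel(persons), numel(views));  % one row of file names per person
for p = 1:numel(persons)
    for v = 1:numel(views)
        files{p, v} = sprintf('%s_%s.csv', persons{p}, views{v});
    end
    % Process this person's four files here, e.g. A1 = initiate2(files{p, 1});
end
```

With this layout a change to the processing code is made once inside the loop body, and the person's name is available for naming the output sheet.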

How to import many csv files to tables in MATLAB with names that correspond to their filenames [duplicate]

I have a folder full of csv data files that I want to import to MATLAB. The import function will be the same for each of them. Their names follow a pattern, e.g. 'a0-b0.csv', 'a0-b5.csv', 'a0-b10.csv', 'a0-b15.csv' ... 'a2-b0' etc.
I want to import all these files at once. I want to import them as tables with sensible names that correspond to the data they contain.
Doing something like
filepath = 'file/path/'
for (a = [0, 2])
for (b = [0:5:50])
filename = strcat(filepath, 'a', num2str(a), '-b', num2str(b), '.csv')
varname = strcat('a', num2str(a), '_b', num2str(b))
varname = importfile(filename, startrow, endrow)
end
end
made sense in concept to me.
As 'varname' itself is the variable, not the string it contains, this does not do what I want. I've seen a lot of answers to similar situations suggesting eval(), while simultaneously vehemently advocating against it.
eval() has side effects that make it annoying to implement. Everyone's intense aversion to it makes me wonder if it's worth putting in the effort to try to implement it.
I can think of no other way to do this, however. I don't see how using arrays (the recommended alternative) would result in appropriate/relevant names.
Does anyone have a suggestion as to how to automatically import many csv files to tables in MATLAB with names that correspond to their filenames?
(Regarding duplicates: I haven't yet found a question that addresses the automatic creation of corresponding variable names, which is the main issue here)
As @transversality condition wrote in their comment, you should rethink your approach.
I'd suggest two steps:
If you expect to change the files and import them again, create a .mat file with the already imported data and check whether to rescan the folder or load the .mat file.
Decide whether you want to use a struct or a cell array to contain your data.
The code for struct can be:
filepath = 'file/path/'; % define the data folder
FileNames=dir(fullfile(filepath,'*.csv')); % list all .csv files within filepath
N=numel(FileNames); % count the csv files
DATA(N)=struct('name','','data',[]); % preallocate DATA
for ii=1:N
Temp=regexp(FileNames(ii).name,'\.','split'); % separate file name and extension
DATA(ii).name = Temp{1}; % save the file name
DATA(ii).data = importfile(fullfile(filepath,FileNames(ii).name), startrow, endrow); % save the data
end
The code for cell can be:
filepath = 'file/path/'; % define the data folder
FileNames=dir(fullfile(filepath,'*.csv')); % list all .csv files within filepath
N=numel(FileNames); % count the csv files
DATA=cell(2,N); % preallocate DATA
for ii=1:N
Temp=regexp(FileNames(ii).name,'\.','split'); % separate file name and extension
DATA{1,ii} = Temp{1}; % save the file name
DATA{2,ii} = importfile(fullfile(filepath,FileNames(ii).name), startrow, endrow); % save the data
end
In both cases you have to find the appropriate entry by name, for example DATA(strcmpi({DATA.name},'<name>')).data for the struct version or DATA{2,strcmpi(DATA(1,:),'<name>')} for the cell version.
You can also use the cell approach to create struct. Suppose we've run the code for cell.
Then command
DataStruct=struct(DATA{:});
will allow you to access your data via DataStruct.<filename> directly, provided every file name is a valid field name (a name such as a0-b0 is not, because of the hyphen).
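Field names must be valid MATLAB identifiers, so file names containing a hyphen cannot be used as they are. One possible way around this, sketched here as an assumption rather than as part of the answer above, is to pass each name through matlab.lang.makeValidName before using it as a dynamic field name. The sample names and the placeholder values are for illustration only.

```matlab
names = {'a0-b0', 'a0-b5', 'a2-b10'};             % sample file names without extension
DataStruct = struct();
for k = 1:numel(names)
    field = matlab.lang.makeValidName(names{k});  % 'a0-b0' becomes 'a0_b0'
    DataStruct.(field) = k;                       % placeholder; store the imported table here
end
```

The data for a0-b5.csv is then reachable as DataStruct.a0_b5, which gives the name-based access the question asks for without eval.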

Reading structured variable from MAT file

I am performing an analysis which involves simulation of over 1000 cases. I am also extracting a lot of data for each case (about 70MB). Currently I am saving the results for each case as:
Vessel.TotalForce
Vessel.WindForce
Vessel.CurrentForce
Vessel.WaveForce
Vessel.ConnectionForce
...
Line1.EffectiveTension
Line1.X
Line1.Y
Line2.EfectiveTension
Line2.X
Line2.Y
...
save('CaseNo1.mat')
Now, I need to perform my analysis for CaseNo1.mat to CaseNo1000. Initially I planned to create a Database.mat file by loading all cases in it and then accessing any variable using h5read. This way Matlab doesn't need to load all the data at a time. However, I am concerned now that my database file will be too big.
Is there any way I can read the structured variables from individual case files for example CaseNo1.mat without loading the CaseNo1.mat file in memory.
The Matlab examples show loading individual variables directly from a MAT file without loading the whole file. But I am not sure how to read structure data the same way.
x=load('CaseNo1.mat','Line1.X')
says Line1.X not found. But it's there. The command is not correct to access the data. Also tried using h5read, but it says CaseNo1.mat is not an HDF5 file.
Can anyone help with this.
Apart from this, I would also appreciate if there is any suggestion about performing such data intensive analysis.
I was wrong! I'm leaving my old answer for context, though I've edited it to reference this one. I thought I had used matfile() in that way before, but I hadn't. I just did a thorough search and ran a few test cases. You've actually run into a limitation of the way Matlab handles and references structures stored in .mat files. There is, however, a solution. It does involve some refactoring of your original code, but it shouldn't be too egregious. Store each field as its own top-level variable instead of as a field of a structure:
Vessel_TotalForce
Vessel_WindForce
Vessel_CurrentForce
Vessel_WaveForce
Vessel_ConnectionForce
...
Line1_EffectiveTension
Line1_X
Line1_Y
Line2_EfectiveTension
Line2_X
Line2_Y
...
save('CaseNo1.mat')
Then to access, just use matfile (or load) as you were before. Like so:
S = load('CaseNo1.mat', 'Vessel_WaveForce'); % load returns a struct holding only the requested variable
Vessel_WaveForce = S.Vessel_WaveForce;
It's important to note that this restriction doesn't appear to be caused by anything you've chosen to do in your program, but rather is imposed by the way Matlab interacts with its native storage files when they contain structures.
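A small round trip illustrates the idea, under the assumption that each quantity is saved as its own top-level variable as listed above. The data values and the temporary file name are placeholders chosen only so the sketch can run on its own.

```matlab
Vessel_WaveForce = rand(1, 5);        % placeholder data
Line1_X = 1:10;                       % placeholder data
fname = [tempname '.mat'];            % temporary file so no real case file is overwritten
save(fname, 'Vessel_WaveForce', 'Line1_X');
S = load(fname, 'Line1_X');           % only the requested variable is read
x = S.Line1_X;
delete(fname);                        % clean up the temporary file
```

Here S contains a single field, Line1_X, so the other variables in the file are never brought into memory.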
EDIT: This answer works, but doesn't actually solve the problem posed in OP's question. I thought I had used matfile to generate a handle that I could access, but I was wrong. See my other answer for details.
You could use matfile, like so:
myMatFileHandle = matfile('CaseNo1.mat');
thisVessel = myMatFileHandle.Vessel; % variable names are case-sensitive
Also, from the little bit I can see, you seem to be on the right track for high-volume analysis. Just remember to use sparse when applicable, and generally avoid conditionals inside of loops if possible.
Good luck!
The objective of storing data in structured format is:
To be organized
To make it easy to script a post-processor that loops through the data under one data set.
I wanted to store a structured data set containing integer, floating-point and string variables in a MAT file and to read just the required variable using h5read. The Matlab load command cannot read a variable below the first level of the data stored in a MAT file, and h5write could not write string variables, so I needed a workaround.
To do this I have used following method:
filename = 'myMatFile';
Vessel.TotalForce = %store some data
Vessel.WindForce = %store some data
Vessel.CurrentForce = %store some data
Vessel.WaveForce = %store some data
Vessel.ConnectionForce = %store some data
...
Line1.LineType = 'Wire'
Line1.ArcLength_0.EffectiveTension = %store some data
Line1.ArcLength_50.EffectiveTension= %store some data
Line1.ArcLength_100.EffectiveTension= %store some data
Line2.LineType = 'Chain'
Line2.ArcLength_0.EffectiveTension= %store some data
Line2.ArcLength_50.EffectiveTension= %store some data
Line2.ArcLength_100.EffectiveTension= %store some data
save([filename '_temp.mat']);
PointToMat=matfile([filename '.mat'],'Writable',true);
PointToMat.(char(filename)) = load([filename '_temp.mat']);
delete([filename '_temp.mat']);
Now to read from the MAT file created, we can use h5read as usual. To extract the EffectiveTension for Line1, ArcLength_0:
EffectiveTension = h5read([filename '.mat'],['/' filename '/Line1/ArcLength_0/EffectiveTension']);
For string variables, h5read returns decimal values corresponding to each character. To obtain the actual string I used:
name = char(h5read([filename '.mat'],['/' filename '/Line1/LineType']));
I tried this method on my data set, which is about 200MB, and could process it quite fast. I hope this helps someone someday.
Short answer:
Having saved the data into a MAT file with the '-v7.3' option, use something like h5read(filename, '/Line2/X') to read just one structure field. You can even read an array partially, for example:
s.a = 1:100;
save('test.mat', '-v7.3', 's');
clear
h5read('test.mat', '/s/a', [1 10], [1 5], [1 3])
returns every third element of the 1:100 array, starting with the 10th element and returning 5 values:
10 13 16 19 22
Long answer:
See the answer by @Amitava for more elaborate code and coverage of the topic.

Matlab: dynamic name for structure

I want to create a structure with a variable name in a matlab script. The idea is to extract a part of an input string filled by the user and to create a structure with this name. For example:
CompleteCaseName = input('s');
USER WRITES '2013-06-12_test001_blabla';
CompleteCaseName = '2013-06-12_test001_blabla'
casename(12:18) = struct('x','y','z');
In this example, casename(12:18) gives me the result test001.
I would like to do this to allow me to compare easily two cases by importing the results of each case successively. So I could write, for instance :
plot(test001.x,test001.y,test002.x,test002.y);
The problem is that the line casename(12:18) = struct('x','y','z'); is invalid for Matlab because it makes me change a string to a struct. All the examples I find with struct are based on a definition like
S = struct('x','y','z');
And I can't find a way to make a dynamical name for S based on a string.
I hope someone understood what I write :) I checked on the FAQ and with Google but I wasn't able to find the same problem.
Use a structure with a dynamic field name.
For example,
mydata.(casename(12:18)) = struct;
will give you a struct mydata with a field test001.
You can then later add your x, y, z fields to this.
You can use the fields later either by mydata.test001.x, or by mydata.(casename(12:18)).x.
If at all possible, try to stay away from using eval, as another answer suggests. It makes things very difficult to debug, and the example given there, which directly evals user input:
eval('%s = struct(''x'',''y'',''z'');',casename(12:18));
is even a security risk - what happens if the user types in a string where the selected characters are system(''rm -r /''); a? Something bad, that's what.
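The dynamic field name approach can be sketched end to end as follows. The input string is the sample from the question; the isvarname check is an added assumption, included so that a substring which is not a valid name is rejected instead of being used as a field, and the x values are placeholder data.

```matlab
casename = '2013-06-12_test001_blabla';   % sample input from the question
key = casename(12:18);                    % extracts 'test001'
mydata = struct();
if isvarname(key)                         % accept only strings that are valid names
    mydata.(key) = struct('x', [], 'y', [], 'z', []);
    mydata.(key).x = 1:3;                 % placeholder data
end
```

Because the string is only ever used as a field name and never executed, a malicious input cannot run code the way it could with eval.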
As I already commented, the best case scenario is when all your x and y vectors have same length. In this case you can store all data from the different files into 2 matrices and call plot(x,y) to plot each column as a series.
Alternatively, you can use a cell array such that:
c = cell(2,numfiles);
for ii = 1:numfiles
c{1,ii} = ...; % import x data from file ii
c{2,ii} = ...; % import y data from file ii
end
plot(c{:})
A structure, on the other hand, lets you refer to each case by name:
s.('test001').x = ...
s.('test001').y = ...
Use eval:
eval(sprintf('%s = struct(''x'',''y'',''z'');',casename(12:18)));
Edit: apologies, forgot the sprintf.

MATLAB missorting structure array when using dir command

I have a bunch of Excel data, called "1.xls", "2.xls"... until "15.xls", each with 141x44 sets of data. I am using the dir function to import the data into MATLAB.
Here I am importing the first and second columns from each file into A and B matrix.
prob15 = dir(fullfile('C:\Users\Bo Sun\Documents\MATLAB\prob15','*.xls'));
global A B
A=zeros(141,length(prob15));
B=zeros(141,length(prob15));
for i=1:length(prob15)
A(:,i) = xlsread(prob15(i).name,'A:A');
B(:,i) = xlsread(prob15(i).name,'B:B');
end
My problem is, when I use the dir command, for some reason MATLAB missorts the data, in that the ascending order of the prob15 structure array will be "1.xls", "10.xls", "11.xls"... instead of normal ascending numerical order ("1.xls", "2.xls", ...). Anyone know how I could fix this? Thanks.
The order you are seeing is called ASCIIbetical order and is the normal sorting order for all kinds of utilities, and evidently for your OS directory listing program as well, since MATLAB just farms this command out to the OS.
If you want a numerical sort, you can convert the filename strings to numbers and sort those. Before I wrote it myself some light googling yielded this which you can easily adapt to your problem:
list = dir(fullfile(cd, '*.mat'));
name = {list.name};
str = sprintf('%s#', name{:});
num = sscanf(str, 'r_%d.mat#');
[dummy, index] = sort(num);
name = name(index);
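Adapted to the file names in the question, the same idea might look like the sketch below. It assumes every name consists of an integer followed by .xls; the hard-coded list stands in for the {prob15.name} list that dir would return.

```matlab
names = {'1.xls', '10.xls', '11.xls', '15.xls', '2.xls'};   % stand-in for {prob15.name}
nums = cellfun(@(s) sscanf(s, '%d.xls'), names);            % leading number of each name
[~, order] = sort(nums);                                    % numerical sort order
sortedNames = names(order);
```

Applying the same index to the structure array, as in prob15 = prob15(order), would put the files themselves in numerical order before the import loop runs.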