How can I run a MATLAB script on .csv files in two separate folders at the same time? - matlab

So I have an iterative loop that extracts data from .csv files in MATLAB's active folder and plots it. I would like to take it one step further and run the script on two folders, each with their own .csv files.
One folder is called stress and the other strain. As the name implies, they contain .csv files for stress and strain data for several samples, each of which is called E3-01, E3-02, E3-03, etc. In other words, both folders have the same number of files and the same names.
The way I see it, the process would have the following steps:
Look in the stress folder, look inside file E3-01, extract the data in the column labelled Stress
Look in the strain folder, look inside file E3-01, extract the data in the column labelled Strain
Combine the data together for sample E3-01 and plot it
Repeat steps 1-3 for all files in the folders
Like I said, I already have a script that can find the right column and extract the data. What I'm not sure about is how to tell MATLAB to alternate the folder that the script is being run on.
Instead of a script, would a function be better? Something that accepts 4 inputs: the names of the two folders and the columns to extract?
EDIT: Apologies, here's the code I have so far:
clearvars;
files = dir('*.csv');
prompt = {'Plot name:','x label:','y label:','x values:','y values:','Points to eliminate:'};
dlg_title = 'Input';
num_lines = 1;
defaultans = {'Title','x label','y label','Surface component 1.avg(epsY) [True strain]','Stress','0'};
answer = inputdlg(prompt,dlg_title,num_lines,defaultans);
name_plot = answer{1};
x_label = answer{2};
y_label = answer{3};
x_col = answer{4};
y_col = answer{5};
des_cols = {y_col,x_col};
smallest_n = 100000;
points_elim = answer{6};
avg_x_values = [];
avg_y_values = [];
for file = files'
M=xlsread(file.name);
[row,col]=size(M);
if smallest_n > row
smallest_n = row;
end
end
smallest_n=smallest_n-points_elim;
avg_x_values = zeros(smallest_n,size(files,1));
avg_y_values = zeros(smallest_n,size(files,1));
hold on;
set(groot, 'DefaultLegendInterpreter', 'none');
set(gca,'FontSize',20);
ii = 0;
for file = files'
ii = ii + 1;
[n,s,r] = xlsread(file.name);
colhdrs = s(1,:);
[row, col] = find(strcmpi(s,x_col));
x_values = n(1:end-points_elim,col);
[row, col] = find(strcmpi(s,y_col));
y_values = n(1:end-points_elim,col);
plot(x_values,y_values,'DisplayName',s{1,1});
avg_x_values(:,ii)=x_values(1:smallest_n);
avg_y_values(:,ii)=y_values(1:smallest_n);
end
ylabel({y_label});
xlabel({x_label});
title({name_plot});
colormap(gray);
hold off;
avg_x_values = mean(avg_x_values,2);
avg_y_values = mean(avg_y_values,2);
plot(avg_x_values,avg_y_values);
set(gca,'FontSize',20);
ylabel({y_label});
xlabel({x_label});
title({name_plot});
EDIT 2: #Adriaan I tried to write the following function to get a column from a file:
function [out_col] = getcolumn(col,file)
file = dir(file);
[n,s,r] = xlsread(file.name);
colhdrs = s(1,:);
[row, col] = find(strcmpi(s,col));
out_col = n(1:end,col);
end
but I get the error
Function 'subsindex' is not defined for values of class 'struct'.
Error in getcolumn (line 21)
y = x(:,n);
not sure why.

You can do both, of course, and it depends on preference mainly, provided you're the sole user of the script. If others are going to use it as well, use functions instead, as they can contain a proper help file and calling help functionname will then give you that help.
For instance:
folders1 = dir(../strain/*)
folders2 = dir(../stress/*)
for ii 1 = 1:numel(folders)
operand1 = folders1{ii};
operand2 = folders2{ii};
%... rest of script
%
% Or function:
data = YourFunction(folders1{ii},folders2{ii})
end
So all in all you can use both, although from experience I find functions easier to use in the end, as you just pass parameters and don't need to trawl through the complete code to change the parameters each run.
Additionally you can partition off small parts of your program which do a fix task. If you nest your functions, and finally call just a single function in your scripts, you don't have to look at hundreds of lines of code each time you run the script, but rather can just run a single function (which can also be inside a script or function, ad infinitum).
Finally, a function has its own scope; meaning that any variables that are in that function stay within that function unless you explicitly set them as output (apart from global variables, but those are problematic anyway). This can be a good thing, or a bad thing, depending on the rest of your code. If you function would output ~20 variables for further processing, the function probably should include more steps. It'd be a good thing if you create lots of intermediate variables (I always do), because when the function's finished running, the scope of that function will be removed from memory, saving you clear tmpVar1 tmpVar2 tmpVar3 etc every few lines in your script.
For the script the argument in favour would be that it is easier to debug; you don't need dbstop on error and can step a bit easier through the script, keeping check of all your variables. But, after the debugging has been completed, this argument becomes moot, and thus in general I'd start with writing a script, and once it performs as desired, I rework it to a function at minimal extra effort.

Related

Matlab: read multiple files

my matlab script read several wav files contained in a folder.
Each read signal is saved in cell "mat" and each signal is saved in array. For example,
I have 3 wav files, I read these files and these signals are saved in arrays "a,b and c".
I want apply another function that has as input each signal (a, b and c) and the name of corresponding
file.
dirMask = '\myfolder\*.wav';
fileRoot = fileparts(dirMask);
Files=dir(dirMask);
N = natsortfiles({Files.name});
C = cell(size(N));
D = cell(size(N));
for k = 1:numel(N)
str =fullfile(fileRoot, Files(k).name);
[C{k},D{k}] = audioread(str);
mat = [C(:)];
fs = [D(:)];
a=mat{1};
b=mat{2};
c=mat{3};
myfunction(a,Files(1).name);
myfunction(b,Files(2).name);
myfunction(c,Files(3).name);
end
My script doesn't work because myfunction considers only the last Wav file contained in the folder, although
arrays a, b and c cointain the three different signal.
If I read only one wav file, the script works well. What's wrong in the for loop?
Like Cris noticed, you have some issues with how you structured your for loop. You are trying to use 'b', and 'c' before they are even given any data (in the second and third times through the loop). Assuming that you have a reason for structuring your program the way you do (I would rewrite the loop so that you do not use 'a','b', or 'c'. And just send 'myfunction' the appropriate index of 'mat') The following should work:
dirMask = '\myfolder\*.wav';
fileRoot = fileparts(dirMask);
Files=dir(dirMask);
N = natsortfiles({Files.name});
C = cell(size(N));
D = cell(size(N));
a = {};
b = {};
c = {};
for k = 1:numel(N)
str =fullfile(fileRoot, Files(k).name);
[C{k},D{k}] = audioread(str);
mat = [C(:)];
fs = [D(:)];
a=mat{1};
b=mat{2};
c=mat{3};
end
myfunction(a,Files(1).name);
myfunction(b,Files(2).name);
myfunction(c,Files(3).name);
EDIT
I wanted to take a moment to clarify what I meant by saying that I would not use the a, b, or c variables. Please note that I could be missing something in what you were asking so I might be explaining things you already know.
In a certain scenarios like this it is possible to articulate exactly how many variables you will be using. In your case, you know that you have exactly 3 audio files that you are going to process. So, variables a, b, and c can come out. Great, but what if you have to throw another audio file in? Now you need to go back in and add a 'd' variable and another call to 'myfunction'. There is a better way, that not only reduces complexity but also extends functionality to the program. See the following code:
%same as your code
dirMask = '\myfolder\*.wav';
fileRoot = fileparts(dirMask);
Files = dir(dirMask);
%slight variable name change, k->idx, slightly more meaningful.
%also removed N, simplifying things a little.
for idx = 1:numel(Files)
%meaningful variable name change str -> filepath.
filepath = fullfile(fileRoot, Files(idx).name);
%It was unclear if you were actually using the Fs component returned
%from the 'audioread' call. I wanted to make sure that we kept access
%to that data. Note that we have removed 'mat' and 'fs'. We can hold
%all of that data inside one variable, 'audio', which simplifies the
%program.
[audio{idx}.('data'), audio{idx}.('rate')] = audioread(filepath);
%this function call sends exactly the same data that your version did
%but note that we have to unpack it a little by adding the .('data').
myfunction(audio{idx}.('data'), Files(idx).name);
end

Saving data to .mat file in MATLAB

I'm new to MATLAB, and I can't manage to make my function work in order to save my data into a .mat file.
The input:
A structure, with 5 fields:
data: 3D matrix of 19x1000x143
labels: 1x143 matrix with 1 or -1 in it
subject_number: an integer
sampling_rate: an integer, 500 Hz
channel_names: 1x19 matrix with text in it
name: a string for the name of the file
clean: a matrix 1x143 with 1 or 0 in it.
The idea is to save only the clean data, marked as 1 in the clean matrix.
If clean(i) is equal to 1:
save data(:,:,i) and labels(:,i)
This is the code I've tried to implement in the saving.m file:
function saving(EEG_struct, clean, name)
subject_number = EEG_struct.subject_number;
fs = EEG_struct.sampling_rate;
chan_names = EEG_struct.channel_names;
nb_epoch = size(EEG_struct.data, 3);
for j=1:nb_epoch
if clean(j) == 1
% Keep the epoch and label
data = cat(3, data, EEG_struct.data(:,:,j));
labels = cat(2, labels, EEG_struct.labels(:,j));
end
end
save(name, data, labels, subject_number, fs, chan_names)
As you can see, I would like to save the data as a structure with the same shape as the EEG_struct input.
Moreover, I would like to use a parfor instead of a for, but it raised me an error I didn't quite get:
An UndefinedFunction error was thrown on the workers for 'data'. This might be because the file containing 'data' is not accessible on the workers. Use addAttachedFiles(pool, files) to specify the required files to be attached. See the documentation for 'parallel.Pool/addAttachedFiles' for more details. Caused by: Undefined function or variable 'data'.
Thanks for the help !
You can use your clean variable as a logical index and parse out your data and labels at once. So there is no need for a loop.
Also the save command needs the "names" of the vars to save not the variables themselves. So I just put ' ' around each one.
function saving(EEG_struct, clean, name)
subject_number = EEG_struct.subject_number;
fs = EEG_struct.sampling_rate;
chan_names = EEG_struct.channel_names;
nb_epoch = size(EEG_struct.data, 3);
%No need for a loop at all
data = EEG_struct.data(:,:,logical(clean));
labels = EEG_struct.labels(logical(clean)); %This is a 1xN so I removed the extra colon operator
save(name, 'data', 'labels', 'subject_number', 'fs', 'chan_names');
EDIT:
Per you comment if you want to just leave everything in the structure. I gave you 2 options for how to save it.
function saving(EEG_struct, clean, name)
%Crop out ~clead data
EEG_struct.data = EEG_struct.data(:,:,logical(clean));
EEG_struct.labels = EEG_struct.labels(logical(clean)); %This is a 1xN so I removed the extra colon operator
% Option 1
save(name, 'EEG_struct');
% Option2
save(name, '-struct', 'EEG_struct');
Option 1 will directly save the struct to the MAT file. So if you were to load the data back like this:
test = load(name);
test =
EEG_struct: [1x1 struct]
You would get your structure placed inside another structure ... which might not be ideal or require an extra line to de-nest it. On the other hand just loading the MAT file with no outputs load(name) would put EEG_struct into your current workspace. But if in a function then it sort of springs into existence without every being declared which makes code a bit harder to follow.
Option 2 uses the '-struct' option which breaks out each field automatically into separate vars in the MAT file. So loading like this:
EEG_struct = load(name);
Will put all the fields back together again. To me at least this looks cleaner when done within a function but is probably just my preference
So comment out which ever you prefer. Also, not I did not include clean in the save. You could either append it to the MAT or add it to your structure.
To get a structure the same as EEG_struct but with only the data/labels corresponding with the clean variable, you can simply make a copy of the existing structure and remove the rows where clean=0
function saving(EEG_struct, clean, name)
newstruct = EEG_struct;
newstruct.data(:,:,logical(~clean)) = '';
newstruct.labels(logical(~clean)) = '';
save(name,'newstruct');

for loop+structure allocation in matlab

This is a problem I am working on in Matlab.
I am looping through a series of *.ALL files and stripping the field name by a delimiter '.'. The first field is the network, second station name, third location code and fourth component. I pre-allocate my structure based on the number of files (3) I run through which for this example is a 3x3x3 structure that I would like to define as struct(station,loc code,component). You can see these definitions in my code example below.
I would like to loop through the station, loc code, and component and fill their values in the structure. The only problem is for some reason the way I've defined the loop it's actually looping through the files more than once. I only want to loop through each file once and extract the station, comp, and loc code from it and put it inside the structure. Because it's looping through the files more than once it's taken like 10 minutes to fill the structure. This is not very efficient at all. Can someone point out the culprit line for me or tell me what I'm doing incorrectly?
Here's my code below:
close all;
clear;
[files]=dir('*.ALL');
for i = 1:length(files)
fields=textscan(files(i).name, '%s', 'Delimiter', '.');
net{i,1}=fields{1}{:};
sta{i,1}=fields{1}{2};
loc{i,1}=fields{1}{3};
comp{i,1}=fields{1}{4};
data = [];
halfhour(1:2) = struct('data',data);
hour(1:24) = struct('halfhour',halfhour);
day(1:366) = struct('hour',hour);
PSD_YE_DBT(1:length(files),1:length(files),1:length(files)) =
struct('sta','','loc','','comp','','allData',[],'day',day);
end
for s=1:1:length(sta)
for l=1:1:length(loc)
for c=1:1:length(comp)
tempFileName = strcat(net{s},'.',sta{s},'.',loc{l},'.',comp{c},'.','ALL');
fid = fopen(tempFileName);
PSD_YE_DBT(s,l,c).sta = sta{s};
PSD_YE_DBT(s,l,c).loc = loc{l};
PSD_YE_DBT(s,l,c).comp = comp{c};
end
end
end
Example file names for the three files I'm working with are:
XO.GRUT.--.HHE.ALL
XO.GRUT.--.HHN.ALL
XO.GRUT.--.HHZ.ALL
Thanks in advance!

How do I retrieve the names of function parameters in matlab?

Aside from parsing the function file, is there a way to get the names of the input and output arguments to a function in matlab?
For example, given the following function file:
divide.m
function [value, remain] = divide(left, right)
value = floor(left / right);
remain = left / right - value;
end
From outside the function, I want to get an array of output arguments, here: ['value', 'remain'], and similarly for the input arguments: ['left', 'right'].
Is there an easy way to do this in matlab? Matlab usually seems to support reflection pretty well.
EDIT Background:
The aim of this is to present the function parameters in a window for the user to enter. I'm writing a kind of signal processing program, and functions to perform operations on these signals are stored in a subfolder. I already have a list and the names of each function from which the user can select, but some functions require additional arguments (e.g. a smooth function might take window size as a parameter).
At the moment, I can add a new function to the subfolder which the program will find, and the user can select it to perform an operation. What I'm missing is for the user to specify the input and output parameters, and here I've hit the hurdle here in that I can't find the names of the functions.
MATLAB offers a way to get information about class metadata (using the meta package), however this is only available for OOP classes not regular functions.
One trick is to write a class definition on the fly, which contain the source of the function you would like to process, and let MATLAB deal with the parsing of the source code (which can be tricky as you'd imagine: function definition line spans multiple lines, comments before the actual definition, etc...)
So the temporary file created in your case would look like:
classdef SomeTempClassName
methods
function [value, remain] = divide(left, right)
%# ...
end
end
end
which can be then passed to meta.class.fromName to parse for metadata...
Here is a quick-and-dirty implementation of this hack:
function [inputNames,outputNames] = getArgNames(functionFile)
%# get some random file name
fname = tempname;
[~,fname] = fileparts(fname);
%# read input function content as string
str = fileread(which(functionFile));
%# build a class containing that function source, and write it to file
fid = fopen([fname '.m'], 'w');
fprintf(fid, 'classdef %s; methods;\n %s\n end; end', fname, str);
fclose(fid);
%# terminating function definition with an end statement is not
%# always required, but now becomes required with classdef
missingEndErrMsg = 'An END might be missing, possibly matching CLASSDEF.';
c = checkcode([fname '.m']); %# run mlint code analyzer on file
if ismember(missingEndErrMsg,{c.message})
% append "end" keyword to class file
str = fileread([fname '.m']);
fid = fopen([fname '.m'], 'w');
fprintf(fid, '%s \n end', str);
fclose(fid);
end
%# refresh path to force MATLAB to detect new class
rehash
%# introspection (deal with cases of nested/sub-function)
m = meta.class.fromName(fname);
idx = find(ismember({m.MethodList.Name},functionFile));
inputNames = m.MethodList(idx).InputNames;
outputNames = m.MethodList(idx).OutputNames;
%# delete temp file when done
delete([fname '.m'])
end
and simply run as:
>> [in,out] = getArgNames('divide')
in =
'left'
'right'
out =
'value'
'remain'
If your problem is limited to the simple case where you want to parse the function declaration line of a primary function in a file (i.e. you won't be dealing with local functions, nested functions, or anonymous functions), then you can extract the input and output argument names as they appear in the file using some standard string operations and regular expressions. The function declaration line has a standard format, but you have to account for a few variations due to:
Varying amounts of white space or blank lines,
The presence of single-line or block comments, and
Having the declaration broken up on more than one line.
(It turns out that accounting for a block comment was the trickiest part...)
I've put together a function get_arg_names that will handle all the above. If you give it a path to the function file, it will return two cell arrays containing your input and output parameter strings (or empty cell arrays if there are none). Note that functions with variable input or output lists will simply list 'varargin' or 'varargout', respectively, for the variable names. Here's the function:
function [inputNames, outputNames] = get_arg_names(filePath)
% Open the file:
fid = fopen(filePath);
% Skip leading comments and empty lines:
defLine = '';
while all(isspace(defLine))
defLine = strip_comments(fgets(fid));
end
% Collect all lines if the definition is on multiple lines:
index = strfind(defLine, '...');
while ~isempty(index)
defLine = [defLine(1:index-1) strip_comments(fgets(fid))];
index = strfind(defLine, '...');
end
% Close the file:
fclose(fid);
% Create the regular expression to match:
matchStr = '\s*function\s+';
if any(defLine == '=')
matchStr = strcat(matchStr, '\[?(?<outArgs>[\w, ]*)\]?\s*=\s*');
end
matchStr = strcat(matchStr, '\w+\s*\(?(?<inArgs>[\w, ]*)\)?');
% Parse the definition line (case insensitive):
argStruct = regexpi(defLine, matchStr, 'names');
% Format the input argument names:
if isfield(argStruct, 'inArgs') && ~isempty(argStruct.inArgs)
inputNames = strtrim(textscan(argStruct.inArgs, '%s', ...
'Delimiter', ','));
else
inputNames = {};
end
% Format the output argument names:
if isfield(argStruct, 'outArgs') && ~isempty(argStruct.outArgs)
outputNames = strtrim(textscan(argStruct.outArgs, '%s', ...
'Delimiter', ','));
else
outputNames = {};
end
% Nested functions:
function str = strip_comments(str)
if strcmp(strtrim(str), '%{')
strip_comment_block;
str = strip_comments(fgets(fid));
else
str = strtok([' ' str], '%');
end
end
function strip_comment_block
str = strtrim(fgets(fid));
while ~strcmp(str, '%}')
if strcmp(str, '%{')
strip_comment_block;
end
str = strtrim(fgets(fid));
end
end
end
This is going to be very hard (read: impossible) to do for general functions (think of things like varargin, etc). Also, in general, relying on variable names as a form of documentation might be... not what you want. I'm going to suggest a different approach.
Since you control the program, what about specifying each module not just with the m-file, but also with a table entry with extra information. You could document the extra parameters, the function itself, notate when options are booleans and present them as checkboxes, etc.
Now, where to put this? I would suggest to have the main m-file function return the structure, as sort of a module loading step, with a function handle that points to the subfunction (or nested function) that does the real work. This preserves the single-file setup that I'm sure you want to keep, and makes for a much more configurable setup for your modules.
function module = divide_load()
module.fn = #my_divide;
module.name = 'Divide';
module.description = 'Divide two signals';
module.param(1).name = 'left';
module.param(1).description = 'left signal';
module.param(1).required_shape = 'columnvector';
% Etc, etc.
function [value, remain] = my_divide(left, right)
value = floor(left / right);
remain = left / right - value;
end
end
When you can't get information from a programming langauge about its contents (e.g., "reflection"), you have to step outside the language.
Another poster suggested "regular expressions", which always fail when applied to parsing real programs because regexps cannot parse context free langauges.
To do this reliably, you need a real M language parser, that will give you access to the parse tree. Then this is fairly easy.
Our DMS Software Reengineering Toolkit has an M language parser available for it, and could do this.
Have you considered using map containers?
You can write your functions along these lines . . .
function [outMAP] = divide(inMAP)
outMAP = containers.Map();
outMAP('value') = floor(inMAP('left') / inMAP('right'));
outMAP('remain') = inMAP('left') / inMAP('right') - outMAP('value');
end
...and call them like this ...
inMAP = containers.Map({'left', 'right'}, {4, 5});
outMAP = divide(inMAP);
...and then simply examine tha variable names using the following syntax...
>> keys(inMAP)
ans =
'left' 'right'
inputname(argnum) http://www.mathworks.com/help/techdoc/ref/inputname.html .

matlab code source

how to write a program in matlab that reads a certain number of images let's say 20 for example which are saved in a given directory (C:) such that later i can use them. suppose that the images are saved by numbers. later, i am gonna use them.
I'd have the code look something like this. Assuming cell array im holds your images.
Write out:
IMG_DIR = 'C:\';
filename_root = 'image';
IMG_EXT = '.jpg';
NUM_IMAGES = 20;
for i = 1:NUM_IMAGES
imwrite(im{i}, [IMG_DIR filename_root num2str(i) IMG_EXT]);
end
Read in:
for i = 1:NUM_IMAGES
im{i} = imread([IMG_DIR filename_root num2str(i) IMG_EXT]);
end
If you don't know how many there are, you can also use ls command (works differently in Windows vs. Linux).
If you don't know, in advance, which files will be in there, but you know that they have the string in them, 'rawImage' (like 'rawImage001.jpg' etc.) you can do something like
a = dir('c:\temp');
requiredBaseFileName = 'rawImage'; % you want them to contain the substring 'rawImage'
for i = 1:length(a),
fileName = a(i).name;
if(isempty(strfind(fileName,'.jpg')) & isempty(strfind(fileName,'.png')))
continue;
end
if(isempty(strfind(fileName,requiredBaseFileName)))
continue;
end
% do your processing here
end