Extract information from path name - matlab

I want to make a script in MATLAB that saves my output data with a certain name. All information for this name is in the path from the input data, like it is shown here:
path = 'C:\projektions100\algorithm1\method_A\data1';
projection =
algorithm =
method =
data =
The script should then extract the text that follows each keyword (e.g. method) within the folder name between the adjacent backslashes, so that the script is more flexible in case I made a spelling mistake in some folder names.
I found a way to extract the text between a start and an end point, but I cannot simply use the backslashes as delimiters since there are several of them in the path.
How should I proceed?

You can simply use a regexp with named tokens:
>> path = 'C:\projektions100\algorithm1\method_A\data1';
>> all=regexp(path,'[^\\]+\\proje[ck]tion(?<projection>[^\\]+)\\algorithm(?<algorithm>[^\\]+)\\method(?<method>[^\\]+)\\data(?<data>.+$)','names')
all =
struct with fields:
projection: 's100'
algorithm: '1'
method: '_A'
data: '1'
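Since the stated goal is to save output under a name derived from the path, here is a minimal sketch of how the named tokens could feed a file name. The variable names p2fldr, info and outName and the name format are assumptions for illustration, and the pattern is loosened slightly (an optional plural "s" and an optional underscore) so that the tokens hold only the values.

```matlab
% Sketch: derive an output file name from the folder names in the path.
% p2fldr is used instead of "path" to avoid shadowing the built-in function.
p2fldr = 'C:\projektions100\algorithm1\method_A\data1';

% One named token per folder; "s?" and "_?" absorb the plural and the underscore.
expr = ['proje[ck]tions?(?<projection>[^\\]+)\\' ...
        'algorithm(?<algorithm>[^\\]+)\\' ...
        'method_?(?<method>[^\\]+)\\' ...
        'data(?<data>[^\\]+)$'];
info = regexp(p2fldr, expr, 'names');

% Hypothetical naming scheme for the saved output.
outName = sprintf('proj%s_alg%s_meth%s_data%s.mat', ...
    info.projection, info.algorithm, info.method, info.data);
disp(outName)   % proj100_alg1_methA_data1.mat
```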

The problem is how to find the end of each keyword. Here is a bit of code that loops through the keywords and looks for them in the path (stored in p2fldr, because path is a MATLAB function that returns the search path, and you shadow it if you define a variable with that name).
p2fldr = 'C:\projektions100\algorithm1\method_A\data1';
% keywords
kyWrd = {'projection','algorithm','method','data'};
Tag = cell(size(kyWrd));
for i = 1:length(kyWrd)
% get keyword
ky = kyWrd{i};
% look for it in the path
idx = strfind(p2fldr,ky);
if ~isempty(idx)
% remaining path
idx_offset = idx+strlength(ky);
prm = p2fldr(idx_offset:end);
% look for file separator '\'
idx_tmp = strfind(prm,filesep);
% if you don't find one, it is probably the last entry, so take the
% length
if isempty(idx_tmp)
idx_tmp = length(prm)+1;
end
% this is the index where it ends
idx2 = idx_tmp(1)-1;
% assign to tag-cell
Tag{i} = prm(1:idx2);
end
end
You can take a shortcut if you know that the keywords are always in the last 4 entries of your path: use strsplit right away and index the last returned cells:
str_splt = strsplit(p2fldr,filesep);
Tag = cell(size(kyWrd));
for i = 1:length(kyWrd)
% index cells
str = str_splt{end-length(kyWrd)+i};
% get keyword
ky = kyWrd{i};
Tag{i} = str(length(ky)+1:end);
end
Note that this does not care if it matches your keywords (e.g. your path says 'projektions' but I defined the keyword to be 'projection')
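If the folder position is trusted and the spelling is not, one possible variation (a sketch, not part of the answer above) is to strip whatever leading letters each folder name has instead of counting keyword lengths. It assumes every value starts with a non-letter, as in the example path.

```matlab
% Sketch: position-based extraction that ignores how the keyword is spelled.
p2fldr = 'C:\projektions100\algorithm1\method_A\data1';

% Split on either separator so the example also runs outside Windows.
parts = regexp(p2fldr, '[\\/]', 'split');
lastFour = parts(end-3:end);

% Remove the leading run of letters from each folder name.
Tag = regexprep(lastFour, '^[A-Za-z]+', '');
disp(Tag)   % '100'  '1'  '_A'  '1'
```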

Related

searching matlab search path with strcmp

I have written the code below. Basically it is checking if some path is already in the matlab search path or not. If it is not found then it adds the path.
The problem is the strcmp always returns a vector of zeros despite the path actually already existing in currPath. I actually copied a path from currPath to check I was getting the correct values. Not sure why this is?
% get current path
currPath = strsplit(path, ';')';
currPath = upper(currPath);
% check if required paths exist - if not add them
pathsToCheck = ['C:\SOMEFOLDER\MADEUP'];
pathsToCheck = upper(pathsToCheck);
for n = 1 : length(pathsToCheck(:, 1))
index = strcmp(currPath, pathsToCheck(n, 1));
if sum(index) > 0
addpath(pathsToCheck{t, 1}, '-end'); % add path to the end
end
end
% save changes
savepath;
The issue is that you have defined pathsToCheck as a character array and not a cell array (which I think is what you intended the way that you are looping through it).
Rather than using a for loop, you could use ismember to check which members of a cell array of strings exist in another cell array of strings.
% Note the use of pathsep to make this work across multiple operating systems
currentPath = strsplit(path, pathsep);
pathsToCheck = {'C:\SOMEFOLDER\MADEUP'};
exists = ismember(pathsToCheck, currentPath);
% If you want to ignore case: ismember(upper(pathsToCheck), upper(currentPath))
% Add the ones that didn't exist
if any(~exists)
addpath(pathsToCheck{~exists}, '-end');
end
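To see why the original comparison never matched, here is a small sketch with made-up search-path entries. Indexing a character array with (n, 1) returns a single character, whereas brace-indexing a cell array returns the whole path.

```matlab
% Sketch: char array vs. cell array indexing in the strcmp call.
currPath = {'C:\SOMEFOLDER\MADEUP'; 'C:\OTHER'};   % made-up search path entries

asChar = 'C:\SOMEFOLDER\MADEUP';     % 1-by-20 character array
asCell = {'C:\SOMEFOLDER\MADEUP'};   % 1-by-1 cell array

firstChar = asChar(1, 1);                    % just the character 'C'
hitsChar = strcmp(currPath, firstChar);      % compares every entry against 'C'
hitsCell = strcmp(currPath, asCell{1, 1});   % compares against the full path

disp(hitsChar')   % 0 0
disp(hitsCell')   % 1 0
```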

MATLAB select variables from the workspace with a specific name

I would like to select all the variables in my workspace whose names follow a specific pattern. For example, I would like to compute the mean of all the variables in my workspace whose names start with my_vars.
I tried the following code:
a = who('-regexp','my_vars*')
result = mean(eval(a))
However, the eval function doesn't work on cell arrays. Is there any workaround?
who returned a cell array of char arrays (i.e. strings), with each element containing one variable name. You need to convert that to a string containing a comma-separated list of the names. Here's one way to do that:
my_vars1 = 1; my_vars2 = 2; my_vars3 = 3;
names = who('-regexp', '^my_vars'); % anchor the pattern at the start of the name
namelist = sprintf('%s,', names{:}); % sprintf reuses the format string if
% there are more inputs than format specifiers
namelist(end)=[]; % strip last comma
eval(sprintf('mean([%s])', namelist))
ans =
2
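An alternative sketch that avoids building one long expression string: evaluate each name separately in a loop and collect the values. It assumes every matching variable is a numeric scalar; the names vals and result are illustrative.

```matlab
% Sketch: collect the matching variables one at a time, then average.
my_vars1 = 1; my_vars2 = 2; my_vars3 = 3;

names = who('-regexp', '^my_vars');   % cell array of variable names
vals = zeros(numel(names), 1);
for k = 1:numel(names)
    vals(k) = eval(names{k});         % look up one scalar variable by name
end
result = mean(vals);
disp(result)   % 2
```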

Reconstruct directories from file MATLAB

Thanks for your help.
The problem is:
I need the user to select a file based on an extension, let's say .tif. I used the standard method, i.e.
[flnm,locn]=uigetfile({'*.tif','Image files'}, 'Select an image');
ext = '.tif';
But I need to fetch other image files from other subdirectories. Say the selected file is /user/blade/checklist/exp1/trial_1/run_1/exp001.tif (locn holds its directory). The images go up to exp100.tif.
I want to access:
/user/blade/checklist/exp1/trial_1/run_2/exp001.tif.
Also access:
/user/blade/checklist/exp1/trial_2/run_2/exp001.tif.
Up to trial_n
But if I list directory in /user/blade/checklist/exp1/, I get all folders therein from where I can reconstruct the right path. The naming structure is orderly.
My current solution is
[flnm,locn]=uigetfile({'*.tif','Image files'}, 'Select an image');
ext = '.tif';
parts = strsplit(locn, '/');
f = fullfile(parts{end-5}, parts{end-4}, parts{end-3}, parts{end-2}, parts{end-1});
This is really ugly, and I also lose the leading /. Any help is appreciated.
Thanks!
First, get the file location as you did; note a small change I've made to make use of the variable ext.
ext = '.tif';
[flnm,locn]=uigetfile({['*',ext]}, 'Select an image');
parts = strsplit(locn,'/');
root = parts(1:end-4);
parts holds two pieces of information: (1) the path of the selected file, and (2) the path of your working folder, checklist, which is the part you need. root therefore holds the working folder.
Then, list out all the files you wanted, and put them in a cell array.
The file names should contain partial (subfolder) paths; it's not difficult to follow the pattern.
flist = {'trial_1/run_1/exp001.tif', ...
'trial_1/run_1/exp002.tif', ...
'trial_1/run_2/exp001.tif', ...
'trial_2/run_1/exp001.tif', ...
'trial_2/run_2/exp001.tif'};
I only enumerated a few; you can use a for loop to generate trial_n and expxxx.tif automatically. Example code that generates the complete file list (but not the full paths):
flist = cell(10*2*100,1);
for ii = 1:10
for jj = 1:2
for kk = 1:100
flist{sub2ind([10,2,100],ii,jj,kk)} = ...
sprintf('trial_%d/run_%d/exp%03d%s', ii,...
jj, kk, ext);
end
end
end
Finally, use strjoin to concatenate the first part (your working folder) and second part (needed files in subfolders). Use cellfun to call strjoin for each cell in the file list cell array, so for every file you want you get a full path.
full_flist = cellfun(@(x) strjoin([root, x],'/'), ...
flist, 'UniformOutput', false);
Example output -
>> locn
locn =
/home/user/Downloads/exp1/trial_1/run_1/
>> for ii = 1:5
full_flist{ii}
end
ans =
/home/user/Downloads/trial_1/run_1/exp001.tif
ans =
/home/user/Downloads/trial_1/run_1/exp002.tif
ans =
/home/user/Downloads/trial_1/run_2/exp001.tif
ans =
/home/user/Downloads/trial_2/run_1/exp001.tif
ans =
/home/user/Downloads/trial_2/run_2/exp001.tif
>>
Note: You can either use
strjoin({str1, str2}, '/')
or
sprintf('%s/%s', str1, str2)
They are equivalent when str1 and str2 are character vectors (strjoin expects a cell array).
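For the specific complaint about losing the leading /, here is a sketch that walks up the directory tree with fileparts instead of splitting and rejoining. The values of locn and flnm stand in for what uigetfile would return, and nTrials and nRuns are assumed counts.

```matlab
% Sketch: climb from the selected run folder to the experiment folder,
% then rebuild the sibling paths. The leading '/' is preserved.
locn = '/user/blade/checklist/exp1/trial_1/run_1/';   % example uigetfile folder
flnm = 'exp001.tif';                                  % example uigetfile file name

% The first fileparts call drops the trailing separator, the next two
% drop run_1 and trial_1.
expDir = fileparts(fileparts(fileparts(locn)));

nTrials = 2;   % assumed number of trial_n folders
nRuns = 2;     % assumed number of run_n folders
targets = cell(nTrials, nRuns);
for t = 1:nTrials
    for r = 1:nRuns
        targets{t, r} = sprintf('%s/trial_%d/run_%d/%s', expDir, t, r, flnm);
    end
end
disp(targets{2, 2})   % /user/blade/checklist/exp1/trial_2/run_2/exp001.tif
```

The folder counts could instead come from a dir listing of expDir, since the naming structure is orderly.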

Using a string to refer to a structure array - matlab

I am trying to take the averages of a pretty large set of data, so I have created a function to do exactly that.
The data is stored in some struct1.struct2.data(:,column)
There are 4 struct1 structures, and each of these has between 20 and 30 struct2 sub-structures.
The data that I want to average is always stored in column 7, and I want to output the average of each struct2.data(:,column) into an Nx2 array of doubles (column 1 of this output is a reference to each struct2 sub-structure, column 2 is the average).
The only problem is that I can't find a way (after lots and lots of reading) to point at each structure properly. I am using a string to refer to the structures, but I get the error "Attempt to reference field of non-structure array", so clearly it doesn't like this. Here is what I used (excuse the inelegance):
function [avrg] = Takemean(prefix,numslits)
% place holder arrays
avs = [];
slits = [];
% iterate over the sub-struct (struct2)
for currslit=1:numslits
dataname = sprintf('%s_slit_%02d',prefix,currslit);
% slap the average and slit ID on the end
avs(end+1) = mean(prefix.dataname.data(:,7));
slits(end+1) = currslit;
end
% transpose the arrays
avs = avs';
slits = slits';
avrg = cat(2,slits,avs); % slap them together
It falls over at this line, avs(end+1) = mean(prefix.dataname.data(:,7));, because, as you can see, prefix and dataname are strings, not structures. So, after hunting around, I tried turning these strings into variables with genvarname(), but still no luck!
I have spent hours on what should have been 5min of coding. :'(
Edit: Oh prefix is a string e.g. 'Hs' and the structure of the structures (lol) is e.g. Hs.Hs_slit_XX.data() where XX is e.g. 01,02,...27
Edit: If I just run mean(Hs.Hs_slit_01.data(:,7)) it works fine... but then I cant iterate over all of the _slit_XX
If you simply want to iterate over the fields with the name pattern <something>_slit_<something>, you need neither the prefix string nor numslits for this. Pass the actual structure to your function, extract the desired fields and then iterate over them:
function avrg = Takemean(s)
%// Extract only the "_slit_" fields
names = fieldnames(s);
names = names(~cellfun('isempty', strfind(names, '_slit_')));
%// Iterate over fields and calculate means
avrg = zeros(numel(names), 2);
for k = 1:numel(names)
avrg(k, :) = [k, mean(s.(names{k}).data(:, 7))];
end
This method uses dynamic field referencing to access fields in structs using strings.
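If the field names still have to be built from a prefix and a counter, dynamic field referencing works with the constructed string as well. A minimal sketch with made-up data, in which column 7 of each slit is constant so the means are easy to check:

```matlab
% Sketch: build each field name with sprintf and use it as s.(name).
Hs.Hs_slit_01.data = repmat(1:7, 3, 1);        % column 7 is all 7s
Hs.Hs_slit_02.data = repmat(2 * (1:7), 3, 1);  % column 7 is all 14s

prefix = 'Hs';
numslits = 2;
avrg = zeros(numslits, 2);
for currslit = 1:numslits
    dataname = sprintf('%s_slit_%02d', prefix, currslit);
    % The parentheses turn the string into a field reference.
    avrg(currslit, :) = [currslit, mean(Hs.(dataname).data(:, 7))];
end
disp(avrg)   % rows: [1 7] and [2 14]
```

Note that the structure itself (Hs) is still referenced directly; only the field name comes from a string.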
First of all, think twice before you use string construction to access variables.
If you really really need it, here is how it can be used:
a.b=123;
s1 = 'a';
s2 = 'b';
eval([s1 '.' s2])
In your case probably something like:
Hs.Hs_slit_01.data= rand(3,7);
avs = [];
dataname = 'Hs_slit_01';
prefix = 'Hs';
eval(['avs(end+1) = mean(' prefix '.' dataname '.data(:,7))'])

How do I retrieve the names of function parameters in matlab?

Aside from parsing the function file, is there a way to get the names of the input and output arguments to a function in matlab?
For example, given the following function file:
divide.m
function [value, remain] = divide(left, right)
value = floor(left / right);
remain = left / right - value;
end
From outside the function, I want to get an array of output arguments, here: ['value', 'remain'], and similarly for the input arguments: ['left', 'right'].
Is there an easy way to do this in matlab? Matlab usually seems to support reflection pretty well.
EDIT Background:
The aim of this is to present the function parameters in a window for the user to enter. I'm writing a kind of signal processing program, and functions to perform operations on these signals are stored in a subfolder. I already have a list and the names of each function from which the user can select, but some functions require additional arguments (e.g. a smooth function might take window size as a parameter).
At the moment, I can add a new function to the subfolder which the program will find, and the user can select it to perform an operation. What I'm missing is a way for the user to specify the input and output parameters, and here I've hit a hurdle in that I can't find the names of the function parameters.
MATLAB offers a way to get information about class metadata (using the meta package), however this is only available for OOP classes not regular functions.
One trick is to write a class definition on the fly that contains the source of the function you would like to process, and let MATLAB deal with parsing the source code (which can be tricky, as you'd imagine: a function definition line that spans multiple lines, comments before the actual definition, etc.).
So the temporary file created in your case would look like:
classdef SomeTempClassName
methods
function [value, remain] = divide(left, right)
%# ...
end
end
end
which can be then passed to meta.class.fromName to parse for metadata...
Here is a quick-and-dirty implementation of this hack:
function [inputNames,outputNames] = getArgNames(functionFile)
%# get some random file name
fname = tempname;
[~,fname] = fileparts(fname);
%# read input function content as string
str = fileread(which(functionFile));
%# build a class containing that function source, and write it to file
fid = fopen([fname '.m'], 'w');
fprintf(fid, 'classdef %s; methods;\n %s\n end; end', fname, str);
fclose(fid);
%# terminating function definition with an end statement is not
%# always required, but now becomes required with classdef
missingEndErrMsg = 'An END might be missing, possibly matching CLASSDEF.';
c = checkcode([fname '.m']); %# run mlint code analyzer on file
if ismember(missingEndErrMsg,{c.message})
% append "end" keyword to class file
str = fileread([fname '.m']);
fid = fopen([fname '.m'], 'w');
fprintf(fid, '%s \n end', str);
fclose(fid);
end
%# refresh path to force MATLAB to detect new class
rehash
%# introspection (deal with cases of nested/sub-function)
m = meta.class.fromName(fname);
idx = find(ismember({m.MethodList.Name},functionFile));
inputNames = m.MethodList(idx).InputNames;
outputNames = m.MethodList(idx).OutputNames;
%# delete temp file when done
delete([fname '.m'])
end
and simply run as:
>> [in,out] = getArgNames('divide')
in =
'left'
'right'
out =
'value'
'remain'
If your problem is limited to the simple case where you want to parse the function declaration line of a primary function in a file (i.e. you won't be dealing with local functions, nested functions, or anonymous functions), then you can extract the input and output argument names as they appear in the file using some standard string operations and regular expressions. The function declaration line has a standard format, but you have to account for a few variations due to:
Varying amounts of white space or blank lines,
The presence of single-line or block comments, and
Having the declaration broken up on more than one line.
(It turns out that accounting for a block comment was the trickiest part...)
I've put together a function get_arg_names that will handle all the above. If you give it a path to the function file, it will return two cell arrays containing your input and output parameter strings (or empty cell arrays if there are none). Note that functions with variable input or output lists will simply list 'varargin' or 'varargout', respectively, for the variable names. Here's the function:
function [inputNames, outputNames] = get_arg_names(filePath)
% Open the file:
fid = fopen(filePath);
% Skip leading comments and empty lines:
defLine = '';
while all(isspace(defLine))
defLine = strip_comments(fgets(fid));
end
% Collect all lines if the definition is on multiple lines:
index = strfind(defLine, '...');
while ~isempty(index)
defLine = [defLine(1:index-1) strip_comments(fgets(fid))];
index = strfind(defLine, '...');
end
% Close the file:
fclose(fid);
% Create the regular expression to match:
matchStr = '\s*function\s+';
if any(defLine == '=')
matchStr = strcat(matchStr, '\[?(?<outArgs>[\w, ]*)\]?\s*=\s*');
end
matchStr = strcat(matchStr, '\w+\s*\(?(?<inArgs>[\w, ]*)\)?');
% Parse the definition line (case insensitive):
argStruct = regexpi(defLine, matchStr, 'names');
% Format the input argument names:
if isfield(argStruct, 'inArgs') && ~isempty(argStruct.inArgs)
inputNames = strtrim(textscan(argStruct.inArgs, '%s', ...
'Delimiter', ','));
else
inputNames = {};
end
% Format the output argument names:
if isfield(argStruct, 'outArgs') && ~isempty(argStruct.outArgs)
outputNames = strtrim(textscan(argStruct.outArgs, '%s', ...
'Delimiter', ','));
else
outputNames = {};
end
% Nested functions:
function str = strip_comments(str)
if strcmp(strtrim(str), '%{')
strip_comment_block;
str = strip_comments(fgets(fid));
else
str = strtok([' ' str], '%');
end
end
function strip_comment_block
str = strtrim(fgets(fid));
while ~strcmp(str, '%}')
if strcmp(str, '%{')
strip_comment_block;
end
str = strtrim(fgets(fid));
end
end
end
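The core of the pattern matching can be tried on a single declaration line without any file handling. This is a stripped-down sketch that assumes a one-line declaration with both an output list and an input list; it is not a replacement for the comment and line-continuation handling in the function above.

```matlab
% Sketch: parse one well-formed declaration line with named tokens.
defLine = 'function [value, remain] = divide(left, right)';

expr = ['^\s*function\s+\[?(?<outArgs>[\w, ]*)\]?\s*=\s*' ...
        '\w+\s*\((?<inArgs>[\w, ]*)\)'];
argStruct = regexp(defLine, expr, 'names');

% Split the comma-separated lists and trim the surrounding spaces.
outputNames = strtrim(strsplit(argStruct.outArgs, ','));
inputNames = strtrim(strsplit(argStruct.inArgs, ','));

disp(outputNames)   % 'value'  'remain'
disp(inputNames)    % 'left'   'right'
```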
This is going to be very hard (read: impossible) to do for general functions (think of things like varargin, etc). Also, in general, relying on variable names as a form of documentation might be... not what you want. I'm going to suggest a different approach.
Since you control the program, what about specifying each module not just with the m-file, but also with a table entry holding extra information? You could document the extra parameters and the function itself, note when options are booleans and present them as checkboxes, etc.
Now, where to put this? I would suggest to have the main m-file function return the structure, as sort of a module loading step, with a function handle that points to the subfunction (or nested function) that does the real work. This preserves the single-file setup that I'm sure you want to keep, and makes for a much more configurable setup for your modules.
function module = divide_load()
module.fn = @my_divide;
module.name = 'Divide';
module.description = 'Divide two signals';
module.param(1).name = 'left';
module.param(1).description = 'left signal';
module.param(1).required_shape = 'columnvector';
% Etc, etc.
function [value, remain] = my_divide(left, right)
value = floor(left / right);
remain = left / right - value;
end
end
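A rough sketch of how the calling program might consume such a module structure. To keep the example self-contained, the structure is built inline with an anonymous function rather than by calling divide_load, and the second param entry is an assumed continuation of the "Etc, etc." above.

```matlab
% Sketch: a module struct as the main program might see it after loading.
% deal is used so the anonymous function can return two outputs.
module.fn = @(left, right) deal(floor(left / right), ...
                                left / right - floor(left / right));
module.name = 'Divide';
module.param(1).name = 'left';
module.param(2).name = 'right';   % assumed second entry

% Names to show in the parameter window.
paramNames = {module.param.name};

% Call the worker through the stored handle.
[value, remain] = module.fn(7, 2);
fprintf('%s: value = %d, remain = %.1f\n', module.name, value, remain);
% Divide: value = 3, remain = 0.5
```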
When you can't get information from a programming language about its contents (e.g., via "reflection"), you have to step outside the language.
Another poster suggested "regular expressions", which always fail when applied to parsing real programs because regexps cannot parse context-free languages.
To do this reliably, you need a real M language parser, that will give you access to the parse tree. Then this is fairly easy.
Our DMS Software Reengineering Toolkit has an M language parser available for it, and could do this.
Have you considered using map containers?
You can write your functions along these lines . . .
function [outMAP] = divide(inMAP)
outMAP = containers.Map();
outMAP('value') = floor(inMAP('left') / inMAP('right'));
outMAP('remain') = inMAP('left') / inMAP('right') - outMAP('value');
end
...and call them like this ...
inMAP = containers.Map({'left', 'right'}, {4, 5});
outMAP = divide(inMAP);
...and then simply examine the variable names using the following syntax...
>> keys(inMAP)
ans =
'left' 'right'
See inputname(argnum): http://www.mathworks.com/help/techdoc/ref/inputname.html