Aside from parsing the function file, is there a way to get the names of the input and output arguments to a function in matlab?
For example, given the following function file:
divide.m
function [value, remain] = divide(left, right)
value = floor(left / right);
remain = left / right - value;
end
From outside the function, I want to get an array of output arguments, here: ['value', 'remain'], and similarly for the input arguments: ['left', 'right'].
Is there an easy way to do this in matlab? Matlab usually seems to support reflection pretty well.
EDIT Background:
The aim of this is to present the function parameters in a window for the user to enter. I'm writing a kind of signal processing program, and functions to perform operations on these signals are stored in a subfolder. I already have a list and the names of each function from which the user can select, but some functions require additional arguments (e.g. a smooth function might take window size as a parameter).
At the moment, I can add a new function to the subfolder which the program will find, and the user can select it to perform an operation. What I'm missing is for the user to specify the input and output parameters, and here I've hit the hurdle here in that I can't find the names of the functions.
MATLAB offers a way to get information about class metadata (using the meta package), however this is only available for OOP classes not regular functions.
One trick is to write a class definition on the fly, which contain the source of the function you would like to process, and let MATLAB deal with the parsing of the source code (which can be tricky as you'd imagine: function definition line spans multiple lines, comments before the actual definition, etc...)
So the temporary file created in your case would look like:
classdef SomeTempClassName
methods
function [value, remain] = divide(left, right)
%# ...
end
end
end
which can be then passed to meta.class.fromName to parse for metadata...
Here is a quick-and-dirty implementation of this hack:
function [inputNames,outputNames] = getArgNames(functionFile)
%# get some random file name
fname = tempname;
[~,fname] = fileparts(fname);
%# read input function content as string
str = fileread(which(functionFile));
%# build a class containing that function source, and write it to file
fid = fopen([fname '.m'], 'w');
fprintf(fid, 'classdef %s; methods;\n %s\n end; end', fname, str);
fclose(fid);
%# terminating function definition with an end statement is not
%# always required, but now becomes required with classdef
missingEndErrMsg = 'An END might be missing, possibly matching CLASSDEF.';
c = checkcode([fname '.m']); %# run mlint code analyzer on file
if ismember(missingEndErrMsg,{c.message})
% append "end" keyword to class file
str = fileread([fname '.m']);
fid = fopen([fname '.m'], 'w');
fprintf(fid, '%s \n end', str);
fclose(fid);
end
%# refresh path to force MATLAB to detect new class
rehash
%# introspection (deal with cases of nested/sub-function)
m = meta.class.fromName(fname);
idx = find(ismember({m.MethodList.Name},functionFile));
inputNames = m.MethodList(idx).InputNames;
outputNames = m.MethodList(idx).OutputNames;
%# delete temp file when done
delete([fname '.m'])
end
and simply run as:
>> [in,out] = getArgNames('divide')
in =
'left'
'right'
out =
'value'
'remain'
If your problem is limited to the simple case where you want to parse the function declaration line of a primary function in a file (i.e. you won't be dealing with local functions, nested functions, or anonymous functions), then you can extract the input and output argument names as they appear in the file using some standard string operations and regular expressions. The function declaration line has a standard format, but you have to account for a few variations due to:
Varying amounts of white space or blank lines,
The presence of single-line or block comments, and
Having the declaration broken up on more than one line.
(It turns out that accounting for a block comment was the trickiest part...)
I've put together a function get_arg_names that will handle all the above. If you give it a path to the function file, it will return two cell arrays containing your input and output parameter strings (or empty cell arrays if there are none). Note that functions with variable input or output lists will simply list 'varargin' or 'varargout', respectively, for the variable names. Here's the function:
function [inputNames, outputNames] = get_arg_names(filePath)
% Open the file:
fid = fopen(filePath);
% Skip leading comments and empty lines:
defLine = '';
while all(isspace(defLine))
defLine = strip_comments(fgets(fid));
end
% Collect all lines if the definition is on multiple lines:
index = strfind(defLine, '...');
while ~isempty(index)
defLine = [defLine(1:index-1) strip_comments(fgets(fid))];
index = strfind(defLine, '...');
end
% Close the file:
fclose(fid);
% Create the regular expression to match:
matchStr = '\s*function\s+';
if any(defLine == '=')
matchStr = strcat(matchStr, '\[?(?<outArgs>[\w, ]*)\]?\s*=\s*');
end
matchStr = strcat(matchStr, '\w+\s*\(?(?<inArgs>[\w, ]*)\)?');
% Parse the definition line (case insensitive):
argStruct = regexpi(defLine, matchStr, 'names');
% Format the input argument names:
if isfield(argStruct, 'inArgs') && ~isempty(argStruct.inArgs)
inputNames = strtrim(textscan(argStruct.inArgs, '%s', ...
'Delimiter', ','));
else
inputNames = {};
end
% Format the output argument names:
if isfield(argStruct, 'outArgs') && ~isempty(argStruct.outArgs)
outputNames = strtrim(textscan(argStruct.outArgs, '%s', ...
'Delimiter', ','));
else
outputNames = {};
end
% Nested functions:
function str = strip_comments(str)
if strcmp(strtrim(str), '%{')
strip_comment_block;
str = strip_comments(fgets(fid));
else
str = strtok([' ' str], '%');
end
end
function strip_comment_block
str = strtrim(fgets(fid));
while ~strcmp(str, '%}')
if strcmp(str, '%{')
strip_comment_block;
end
str = strtrim(fgets(fid));
end
end
end
This is going to be very hard (read: impossible) to do for general functions (think of things like varargin, etc). Also, in general, relying on variable names as a form of documentation might be... not what you want. I'm going to suggest a different approach.
Since you control the program, what about specifying each module not just with the m-file, but also with a table entry with extra information. You could document the extra parameters, the function itself, notate when options are booleans and present them as checkboxes, etc.
Now, where to put this? I would suggest to have the main m-file function return the structure, as sort of a module loading step, with a function handle that points to the subfunction (or nested function) that does the real work. This preserves the single-file setup that I'm sure you want to keep, and makes for a much more configurable setup for your modules.
function module = divide_load()
module.fn = #my_divide;
module.name = 'Divide';
module.description = 'Divide two signals';
module.param(1).name = 'left';
module.param(1).description = 'left signal';
module.param(1).required_shape = 'columnvector';
% Etc, etc.
function [value, remain] = my_divide(left, right)
value = floor(left / right);
remain = left / right - value;
end
end
When you can't get information from a programming langauge about its contents (e.g., "reflection"), you have to step outside the language.
Another poster suggested "regular expressions", which always fail when applied to parsing real programs because regexps cannot parse context free langauges.
To do this reliably, you need a real M language parser, that will give you access to the parse tree. Then this is fairly easy.
Our DMS Software Reengineering Toolkit has an M language parser available for it, and could do this.
Have you considered using map containers?
You can write your functions along these lines . . .
function [outMAP] = divide(inMAP)
outMAP = containers.Map();
outMAP('value') = floor(inMAP('left') / inMAP('right'));
outMAP('remain') = inMAP('left') / inMAP('right') - outMAP('value');
end
...and call them like this ...
inMAP = containers.Map({'left', 'right'}, {4, 5});
outMAP = divide(inMAP);
...and then simply examine tha variable names using the following syntax...
>> keys(inMAP)
ans =
'left' 'right'
inputname(argnum) http://www.mathworks.com/help/techdoc/ref/inputname.html .
Related
So I have an iterative loop that extracts data from .csv files in MATLAB's active folder and plots it. I would like to take it one step further and run the script on two folders, each with their own .csv files.
One folder is called stress and the other strain. As the name implies, they contain .csv files for stress and strain data for several samples, each of which is called E3-01, E3-02, E3-03, etc. In other words, both folders have the same number of files and the same names.
The way I see it, the process would have the following steps:
Look in the stress folder, look inside file E3-01, extract the data in the column labelled Stress
Look in the strain folder, look inside file E3-01, extract the data in the column labelled Strain
Combine the data together for sample E3-01 and plot it
Repeat steps 1-3 for all files in the folders
Like I said, I already have a script that can find the right column and extract the data. What I'm not sure about is how to tell MATLAB to alternate the folder that the script is being run on.
Instead of a script, would a function be better? Something that accepts 4 inputs: the names of the two folders and the columns to extract?
EDIT: Apologies, here's the code I have so far:
clearvars;
files = dir('*.csv');
prompt = {'Plot name:','x label:','y label:','x values:','y values:','Points to eliminate:'};
dlg_title = 'Input';
num_lines = 1;
defaultans = {'Title','x label','y label','Surface component 1.avg(epsY) [True strain]','Stress','0'};
answer = inputdlg(prompt,dlg_title,num_lines,defaultans);
name_plot = answer{1};
x_label = answer{2};
y_label = answer{3};
x_col = answer{4};
y_col = answer{5};
des_cols = {y_col,x_col};
smallest_n = 100000;
points_elim = answer{6};
avg_x_values = [];
avg_y_values = [];
for file = files'
M=xlsread(file.name);
[row,col]=size(M);
if smallest_n > row
smallest_n = row;
end
end
smallest_n=smallest_n-points_elim;
avg_x_values = zeros(smallest_n,size(files,1));
avg_y_values = zeros(smallest_n,size(files,1));
hold on;
set(groot, 'DefaultLegendInterpreter', 'none');
set(gca,'FontSize',20);
ii = 0;
for file = files'
ii = ii + 1;
[n,s,r] = xlsread(file.name);
colhdrs = s(1,:);
[row, col] = find(strcmpi(s,x_col));
x_values = n(1:end-points_elim,col);
[row, col] = find(strcmpi(s,y_col));
y_values = n(1:end-points_elim,col);
plot(x_values,y_values,'DisplayName',s{1,1});
avg_x_values(:,ii)=x_values(1:smallest_n);
avg_y_values(:,ii)=y_values(1:smallest_n);
end
ylabel({y_label});
xlabel({x_label});
title({name_plot});
colormap(gray);
hold off;
avg_x_values = mean(avg_x_values,2);
avg_y_values = mean(avg_y_values,2);
plot(avg_x_values,avg_y_values);
set(gca,'FontSize',20);
ylabel({y_label});
xlabel({x_label});
title({name_plot});
EDIT 2: #Adriaan I tried to write the following function to get a column from a file:
function [out_col] = getcolumn(col,file)
file = dir(file);
[n,s,r] = xlsread(file.name);
colhdrs = s(1,:);
[row, col] = find(strcmpi(s,col));
out_col = n(1:end,col);
end
but I get the error
Function 'subsindex' is not defined for values of class 'struct'.
Error in getcolumn (line 21)
y = x(:,n);
not sure why.
You can do both, of course, and it depends on preference mainly, provided you're the sole user of the script. If others are going to use it as well, use functions instead, as they can contain a proper help file and calling help functionname will then give you that help.
For instance:
folders1 = dir(../strain/*)
folders2 = dir(../stress/*)
for ii 1 = 1:numel(folders)
operand1 = folders1{ii};
operand2 = folders2{ii};
%... rest of script
%
% Or function:
data = YourFunction(folders1{ii},folders2{ii})
end
So all in all you can use both, although from experience I find functions easier to use in the end, as you just pass parameters and don't need to trawl through the complete code to change the parameters each run.
Additionally you can partition off small parts of your program which do a fix task. If you nest your functions, and finally call just a single function in your scripts, you don't have to look at hundreds of lines of code each time you run the script, but rather can just run a single function (which can also be inside a script or function, ad infinitum).
Finally, a function has its own scope; meaning that any variables that are in that function stay within that function unless you explicitly set them as output (apart from global variables, but those are problematic anyway). This can be a good thing, or a bad thing, depending on the rest of your code. If you function would output ~20 variables for further processing, the function probably should include more steps. It'd be a good thing if you create lots of intermediate variables (I always do), because when the function's finished running, the scope of that function will be removed from memory, saving you clear tmpVar1 tmpVar2 tmpVar3 etc every few lines in your script.
For the script the argument in favour would be that it is easier to debug; you don't need dbstop on error and can step a bit easier through the script, keeping check of all your variables. But, after the debugging has been completed, this argument becomes moot, and thus in general I'd start with writing a script, and once it performs as desired, I rework it to a function at minimal extra effort.
I wanted to ask how values in MATLAB are returned? Are they copied or passed by reference?
take a look at this example with matrix A:
function main
A = foo(10);
return;
end
function [resultMatrix] = foo(count)
resultMatrix = zeros(count, count);
return;
end
Does the copy operation take place when function returns matrix and assigns it to variable A ?
MATLAB uses a system known as copy-on-write in which a copy of the data is only made when it is necessary (i.e. when the data is modified). When returning a variable from a function, it is not modified between when it was created inside of the function and when it was stored in a different variable by the calling function. So in your case, you can think of the variable as being passed by reference. Once the data is modified, however, a copy will be made
You can check this behavior using format debug which will actually tell us the memory location of the data (detailed more in this post)
So if we modify your code slightly so that we print the memory location of each variable we can track when a copy is made
function main()
A = foo(10);
% Print the address of the variable A
fprintf('Address of A in calling function: %s\n', address(A));
% Modify A
B = A + 1;
% Print the address of the variable B
fprintf('Address of B in calling function: %s\n', address(B));
end
function result = foo(count)
result = zeros(count);
% Print the address of the variable inside of the function
fprintf('Address of result in foo: %s\n', address(result));
end
function loc = address(x)
% Store the current display format
fmt = get(0, 'format');
% Turn on debugging display and parse it
format debug
loc = regexp(evalc('disp(x)'), '(?<=pr\s*=\s*)[a-z0-9]*', 'match', 'once');
% Revert the display format to what it was
format(fmt);
end
And this yields the following (or similar) output
Address of result in foo: 7f96d9d591c0
Address of A in calling function: 7f96d9d591c0
Address of B in calling function: 7f96d9c74400
As a side-note, you don't need to explicitly use return in your case since the function will naturally return when it encounters the end. return is only necessary when you need to use it to alter the flow of your program and exit a function pre-maturely.
I have a long list of variables in my workspace.
First, I'm finding the potential variables I could be interested in using the who function. Next, I'd like to loop through this list to find the size of each variable, however who outputs only the name of the variables as a string.
How could I use this list to refer to the values of the variables, rather than just the name?
Thank you,
list = who('*time*')
list =
'time'
'time_1'
'time_2'
for i = 1:size(list,1);
len(i,1) = length(list(i))
end
len =
1
1
1
If you want details about the variables, you can use whos instead which will return a struct that contains (among other things) the dimensions (size) and storage size (bytes).
As far as getting the value, you could use eval but this is not recommended and you should instead consider using cell arrays or structs with dynamic field names rather than dynamic variable names.
S = whos('*time*');
for k = 1:numel(S)
disp(S(k).name)
disp(S(k).bytes)
disp(S(k).size)
% The number of elements
len(k) = prod(S(k).size);
% You CAN get the value this way (not recommended)
value = eval(S(k).name);
end
#Suever nicely explained the straightforward way to get this information. As I noted in a comment, I suggest that you take a step back, and don't generate those dynamically named variables to begin with.
You can access structs dynamically, without having to resort to the slow and unsafe eval:
timestruc.field = time;
timestruc.('field1') = time_1;
fname = 'field2';
timestruc.(fname) = time_2;
The above three assignments are all valid for a struct, and so you can address the fields of a single data struct by generating the field strings dynamically. The only constraint is that field names have to be valid variable names, so the first character of the field has to be a letter.
But here's a quick way out of the trap you got yourself into: save your workspace (well, the relevant part) in a .mat file, and read it back in. You can do this in a way that will give you a struct with fields that are exactly your variable names:
time = 1;
time_1 = 2;
time_2 = rand(4);
save('tmp.mat','time*'); % or just save('tmp.mat')
S = load('tmp.mat');
afterwards S will be a struct, each field will correspond to a variable you saved into 'tmp.mat':
>> S
S =
time: 1
time_1: 2
time_2: [4x4 double]
An example writing variables from workspace to csv files:
clear;
% Writing variables of myfile.mat to csv files
load('myfile.mat');
allvars = who;
for i=1:length(allvars)
varname = strjoin(allvars(i));
evalstr = strcat('csvwrite(', char(39), varname, '.csv', char(39), ', ', varname, ')');
eval(evalstr);
end
I know that, inside a MATLAB function, inputname(k) will return the k-th argument iff the argument is a variable name. Is there any way to write some parsing code that can retrieve the full input argument when that argument is a structure, e.g. foo.bar ? The reason I want to be able to do this is that I'm writing some tools for generic use where the input could be either a named variable or a named structure element.
My primary intent is to be able to store and return the input argment(s) as part of a structure or other variable that the function returns. This is a 'chain of custody' feature which makes it easier for me or others to verify the source data sets used to generate the output data sets.
I don't want the user to have to self-parse externally, or to have to deal with some kludge like
function doit(name,fieldname)
if(exist('fieldname','var'))
name = name.(fieldname);
myinput = [inputname(1),inputname(2)];
else
myinput = inputname(1);
end
% do the function stuff
(I call this a kludge because it both requires the user to enter strange arguments and because it fouls up the argument sequence for functions with multiple inputs)
There is no support from the language to get the input names when passing structs. The reason is probably x.a is internally a call to subsref which returns a new variable, all context is lost. The only possibility you have is using the debug tools and parse the code. There is no other option.
function x=f(varargin)
[ST, I] = dbstack('-completenames', 1);
if numel(ST)>0
fid=fopen(ST(1).file,'r');
for ix=2:ST(1).line;fgetl(fid);end
codeline=fgetl(fid);
fclose(fid);
fprintf('function was called with line %s\n',codeline);
else
fprintf('function was called from base workspace\n');
end
end
From there you may try to parse the code line to get the individual argument names.
Far uglier than Daniel's approach, and probably will crash on the wrong OS, but here's a hack that works to retrieve the first argument; easily adjusted to retrieve all arguments.
[~,myname] = system('whoami');
myname = strtrim(myname(4:end)); % removes domain tag in my Windows envir
% sorry about " \' " fouling up SO's color parsing
myloc = ['C:\Users\' , myname , '\AppData\Roaming\MathWorks\MATLAB\R2015a\History.xml'] ;
f = fopen(myloc,'r');
foo = fscanf(f,'%s');
fclose(f);
pfoo = findpat(foo,'myFunctionName');
% just look for the last instance
namstart = find(foo(pfoo(end):(pfoo(end)+30)) =='(',1) +pfoo(end);
% catch either ')' or ','
namend(1) = find(foo((namstart):end)== ')',1) -2 +namstart;
if numel(find(foo((namstart):end)== ',',1)),
namend(2) = find(foo((namstart):end)== ',',1) -2 +namstart;
end
thearg = foo(namstart:(min(namend)) );
As a follow-up to my previous question about how to assign fields to a structure variable with a dynamic hierarchy, I would now like to be able to query those fields with isfield. However, isfield will only take one argument, not a list as with setfield.
To summarize my problem:
I have a function that organizes data into a structure variable. Depending on certain flags, the data is saved into the substructures with a different number of levels.
For instance, the accepted answer to my previous question has me doing this to build my structure:
foo = struct();
% Pick one...
true_false_statement = true;
% true_false_statement = false;
if true_false_statement
extra_level = {};
else
extra_level = {'baz'};
end
foo = setfield(foo, extra_level{:}, 'bar1', 1);
which gives me foo.bar1 = 1 if true_false_statement is true, and foo.baz.bar1 = 1 otherwise.
Now I want to test for the existence of the field (for instance to pre-allocate an array). If I do this:
if ~isfield(foo, extra_levels{:}, 'bar1')
foo = setfield(foo, extra_level{:}, 'bar1', zeros(1,100));
end
I get an error because isfield will only accept two arguments.
The best I've been able to come up with is to write a separate function with a try...catch block.
function tf = isfield_dyn(structure_variable, intervening_levels, field)
try
getfield(structure_variable, intervening_levels{:}, field);
tf = true;
catch err
if strcmpi(err.identifier, 'MATLAB:nonExistentField')
tf = false;
else
rethrow(err);
end
end
As mentioned below in the comments, this is a hacky hack way to do this, and it doesn't even work all that well.
Is there a more elegant built-in way to do this, or some other more robust way to write a custom function to do this?
You might find the private utility functions getsubfield, setsubfield, rmsubfield, and issubfield from the FieldTrip toolbox very handy. From the documentation of getsubfield:
% GETSUBFIELD returns a field from a structure just like the standard
% GETFIELD function, except that you can also specify nested fields
% using a '.' in the fieldname. The nesting can be arbitrary deep.
%
% Use as
% f = getsubfield(s, 'fieldname')
% or as
% f = getsubfield(s, 'fieldname.subfieldname')
%
% See also GETFIELD, ISSUBFIELD, SETSUBFIELD
I am somewhat confused because
isfield(foo, 'bar1')
isfield(foo, 'baz')
seem to work just fine on your example struct.
Of course, if you want to test more fields, just write a loop over those fieldnames and test them one by one. That may not look vectorized, but is definitely better than abusing a try-catch block to guide your flow.