Dynamically check for existence of structure field name with hierarchy - matlab

As a follow-up to my previous question about how to assign fields to a structure variable with a dynamic hierarchy, I would now like to be able to query those fields with isfield. However, isfield will only take one argument, not a list as with setfield.
To summarize my problem:
I have a function that organizes data into a structure variable. Depending on certain flags, the data is saved into the substructures with a different number of levels.
For instance, the accepted answer to my previous question has me doing this to build my structure:
foo = struct();
% Pick one...
true_false_statement = true;
% true_false_statement = false;
if true_false_statement
extra_level = {};
else
extra_level = {'baz'};
end
foo = setfield(foo, extra_level{:}, 'bar1', 1);
which gives me foo.bar1 = 1 if true_false_statement is true, and foo.baz.bar1 = 1 otherwise.
Now I want to test for the existence of the field (for instance to pre-allocate an array). If I do this:
if ~isfield(foo, extra_level{:}, 'bar1')
foo = setfield(foo, extra_level{:}, 'bar1', zeros(1,100));
end
I get an error because isfield will only accept two arguments.
The best I've been able to come up with is to write a separate function with a try...catch block.
function tf = isfield_dyn(structure_variable, intervening_levels, field)
try
getfield(structure_variable, intervening_levels{:}, field);
tf = true;
catch err
if strcmpi(err.identifier, 'MATLAB:nonExistentField')
tf = false;
else
rethrow(err);
end
end
As mentioned in the comments below, this is a hacky way to do it, and it doesn't even work all that well.
Is there a more elegant built-in way to do this, or some other more robust way to write a custom function to do this?

You might find the private utility functions getsubfield, setsubfield, rmsubfield, and issubfield from the FieldTrip toolbox very handy. From the documentation of getsubfield:
% GETSUBFIELD returns a field from a structure just like the standard
% GETFIELD function, except that you can also specify nested fields
% using a '.' in the fieldname. The nesting can be arbitrary deep.
%
% Use as
% f = getsubfield(s, 'fieldname')
% or as
% f = getsubfield(s, 'fieldname.subfieldname')
%
% See also GETFIELD, ISSUBFIELD, SETSUBFIELD

I am somewhat confused because
isfield(foo, 'bar1')
isfield(foo, 'baz')
seem to work just fine on your example struct.
Of course, if you want to test more fields, just write a loop over those fieldnames and test them one by one. That may not look vectorized, but is definitely better than abusing a try-catch block to guide your flow.
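The loop over field names suggested above can be sketched as follows. This is a minimal sketch rather than a built-in facility: the names levels, current, and tf are chosen here for illustration, and the sketch assumes every intermediate level is a scalar struct.

```matlab
% Build the example struct from the question (the 'baz' variant).
foo = struct();
extra_level = {'baz'};
foo = setfield(foo, extra_level{:}, 'bar1', 1);

% Full path to test, one field name per level (illustrative name).
levels = [extra_level, {'bar1'}];

% Walk down one level at a time and stop at the first missing field.
tf = true;
current = foo;
for k = 1:numel(levels)
    if isstruct(current) && isscalar(current) && isfield(current, levels{k})
        current = current.(levels{k});
    else
        tf = false;
        break
    end
end

disp(tf)
```

Wrapping the loop in a function that takes the struct and the cell array of levels would give a drop-in replacement for the try...catch version, without relying on an error identifier.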

Related

Workaround equivalent of "inputname" to return structure name?

I know that, inside a MATLAB function, inputname(k) will return the name of the k-th argument iff the argument is a plain variable. Is there any way to write some parsing code that can retrieve the full input argument when that argument is a structure element, e.g. foo.bar? The reason I want to be able to do this is that I'm writing some tools for generic use where the input could be either a named variable or a named structure element.
My primary intent is to be able to store and return the input argument(s) as part of a structure or other variable that the function returns. This is a 'chain of custody' feature which makes it easier for me or others to verify the source data sets used to generate the output data sets.
I don't want the user to have to self-parse externally, or to have to deal with some kludge like
function doit(name,fieldname)
if(exist('fieldname','var'))
name = name.(fieldname);
myinput = [inputname(1),inputname(2)];
else
myinput = inputname(1);
end
% do the function stuff
(I call this a kludge because it both requires the user to enter strange arguments and because it fouls up the argument sequence for functions with multiple inputs)
There is no support from the language to get the input names when passing structs. The reason is probably that x.a is internally a call to subsref, which returns a new variable, so all context is lost. The only possibility is to use the debugging tools and parse the code; there is no other option.
function x=f(varargin)
[ST, I] = dbstack('-completenames', 1);
if numel(ST)>0
fid=fopen(ST(1).file,'r');
for ix=2:ST(1).line;fgetl(fid);end
codeline=fgetl(fid);
fclose(fid);
fprintf('function was called with line %s\n',codeline);
else
fprintf('function was called from base workspace\n');
end
end
From there you may try to parse the code line to get the individual argument names.
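As a rough illustration of that parsing step, the sketch below pulls the argument text out of a captured code line with a regular expression. The code line and the function name f are hypothetical, and the approach is an assumption of this sketch: it splits on commas, so it breaks as soon as an argument itself contains a comma (nested calls, indexing, strings).

```matlab
% Hypothetical line as it might be read from the caller's file.
codeline = 'result = f(foo.bar, other.field.name);';
fname = 'f';   % name of the called function (assumed known)

% Capture everything between the parentheses that follow the function name.
tok = regexp(codeline, ['\<' fname '\s*\((.*)\)'], 'tokens', 'once');

% Split on commas and trim whitespace around each argument.
args = strtrim(strsplit(tok{1}, ','));

disp(args)
```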
Far uglier than Daniel's approach, and probably will crash on the wrong OS, but here's a hack that works to retrieve the first argument; easily adjusted to retrieve all arguments.
[~,myname] = system('whoami');
myname = strtrim(myname(4:end)); % removes domain tag in my Windows envir
myloc = ['C:\Users\' , myname , '\AppData\Roaming\MathWorks\MATLAB\R2015a\History.xml'] ;
f = fopen(myloc,'r');
foo = fscanf(f,'%s');
fclose(f);
pfoo = strfind(foo,'myFunctionName');
% just look for the last instance
namstart = find(foo(pfoo(end):(pfoo(end)+30)) =='(',1) +pfoo(end);
% catch either ')' or ','
namend(1) = find(foo((namstart):end)== ')',1) -2 +namstart;
if numel(find(foo((namstart):end)== ',',1)),
namend(2) = find(foo((namstart):end)== ',',1) -2 +namstart;
end
thearg = foo(namstart:(min(namend)) );

Matlab coder & dynamic field references

I'm trying to conjure up a little parser that reads a .txt file containing parameters for an algorithm so I don't have to recompile it every time I change a parameter. The application is C code generated from .m via Coder, which unfortunately prohibits me from using a lot of handy MATLAB gimmicks.
Here's my code so far:
% read textfile
string = readfile(filepath);
% do fancy rearranging
linebreaks = zeros(size(string));
equals = zeros(size(string));
% find delimiters
for n=1:size(string,2)
if strcmp(string(n),char(10))
linebreaks(n) = 1;
elseif strcmp(string(n), '=')
equals(n) = 1;
end
end
% write first key-value pair
idx_s = find(linebreaks);idx_s = [idx_s length(string)];
idx_e = find(equals);
key = string(1:idx_e(1)-1);
value = str2double(string(idx_e(1)+1:idx_s(1)-1));
parameters.(key) = value;
% find number of parameters
count = length(idx_s);
% write remaining key-value pairs
for n=2:count
key = string(idx_s(n-1)+1:idx_e(n)-1);
value = str2double(string(idx_e(n)+1:idx_s(n)-1));
parameters.(key) = value;
end
The problem is that Coder apparently does not support dynamic fieldnames for structures, like parameters.(key) = value.
I'm a bit at a loss as to how else I am supposed to come up with a parameter struct that holds all my key-value pairs without hardcoding it. It would somewhat (though not completely) defeat the purpose if the names of keys were not dynamically linked to the parameter file (more manual work if parameters get added/deleted, etc.). If anybody has an idea how to work around this, I'd be very grateful.
As you say, dynamic fieldnames for structures aren't allowed in MATLAB code to be used by Coder. I've faced situations much like yours before, and here's how I handled it.
First, we can list some nice tools that are allowed in Coder. We're allowed to have classes (value or handle), which can be quite handy. Also, we're allowed to have variable sized data if we use coder.varsize to specifically designate it. We also can use string values in switch statements if we like. However, we cannot use coder.varsize for properties in a class, but you can have varsized persistent variables if you like.
What I'd do in your case is create a handle class for storing and retrieving the values. The following example is pretty basic, but will work and could be expanded. If a persistent variable were used in a method, you could even create a varsized allocated storage for the data, but in my example, it's a property and has been limited in the number of values it can store.
classdef keyval < handle %#codegen
%KEYVAL A key and value class designed for Coder
% Stores an arbitrary number of keys and values.
properties (SetAccess = private)
numvals = 0
end
properties (Access = private)
intdata
end
properties (Constant)
maxvals = 100;
maxkeylength = 30;
end
methods
function obj = keyval
%KEYVAL Constructor for keyval class
obj.intdata = repmat(struct('key', char(zeros(1, obj.maxkeylength)), 'value', 0), 1, obj.maxvals);
end
function result = put(obj, key, value)
%PUT Adds a key and value pair into storage
% Result is 0 if successful, 1 on error
result = 0;
if obj.numvals >= obj.maxvals
result = 1;
return;
end
obj.numvals = obj.numvals + 1;
tempstr = char(zeros(1,obj.maxkeylength));
tempstr(1,1:min(end,numel(key))) = key(1:min(end, obj.maxkeylength));
obj.intdata(obj.numvals).key = tempstr;
obj.intdata(obj.numvals).value = value;
end
function keystring = getkeyatindex(obj, index)
%GETKEYATINDEX Get a key name at an index
keystring = deblank(obj.intdata(index).key);
end
function value = getvalueforkey(obj, keyname)
%GETVALUEFORKEY Gets a value associated with a key.
% Returns NaN if not found
value = NaN;
for i=1:obj.numvals
if strcmpi(keyname, deblank(obj.intdata(i).key))
value = obj.intdata(i).value;
end
end
end
end
end
This class implements a simple key/value addition as well as lookup. There are a few things to note about it. First, it's very careful in the assignments to make sure we don't overrun the overall storage. Second, it uses deblank to clear out the trailing zeros that are necessary in the string storage. In this situation, it's not permitted for the strings in the structure to be of different length, so when we put a key string in there, it needs to be exactly the same length with trailing nulls. Deblank cleans this up for the calling function.
The constant properties allocate the total amount of space we're allowed in the storage array. These can be increased, obviously, but not at runtime.
At the MATLAB command prompt, using this class looks like:
>> obj = keyval
obj =
keyval with properties:
numvals: 0
>> obj.put('SomeKeyName', 1.23456)
ans =
0
>> obj
obj =
keyval with properties:
numvals: 1
>> obj.put('AnotherKeyName', 34567)
ans =
0
>> obj
obj =
keyval with properties:
numvals: 2
>> obj.getvalueforkey('SomeKeyName')
ans =
1.2346
>> obj.getkeyatindex(2)
ans =
AnotherKeyName
>> obj.getvalueforkey(obj.getkeyatindex(2))
ans =
34567
If a totally variable storage area is desired, the use of persistent variables with coder.varsize would work, but that will limit the use of this class to a single instance. Persistent variables are nice, but you only get one of them ever. As written, you can use this class in many different places in your program for different storage. If you use a persistent variable, you may only use it once.
If you know some of the key names and are later using them to determine functionality, remember that you can switch on strings in MATLAB, and this works in Coder.
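A minimal sketch of that idea, assuming the parameter names are known in advance: the struct is created once with fixed fields, and a switch on the key string decides which field receives the parsed value. The names params, windowSize, and threshold are hypothetical, and this has not been checked against a specific Coder release.

```matlab
% Fixed-layout parameter struct with hypothetical field names and defaults.
params = struct('windowSize', 0, 'threshold', 0);

% A key/value pair as it might come out of the text-file parser.
keyname = 'threshold';
value = 0.25;

% Switch on the key string instead of using params.(keyname).
switch keyname
    case 'windowSize'
        params.windowSize = value;
    case 'threshold'
        params.threshold = value;
    otherwise
        % Unknown keys are ignored in this sketch.
end

disp(params)
```

The cost is that each known key appears once in the switch, so adding a parameter means adding a case; keys that are not known in advance still need something like the keyval class above.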

How do I retrieve the names of function parameters in matlab?

Aside from parsing the function file, is there a way to get the names of the input and output arguments to a function in matlab?
For example, given the following function file:
divide.m
function [value, remain] = divide(left, right)
value = floor(left / right);
remain = left / right - value;
end
From outside the function, I want to get the names of the output arguments as a cell array, here {'value', 'remain'}, and similarly for the input arguments: {'left', 'right'}.
Is there an easy way to do this in matlab? Matlab usually seems to support reflection pretty well.
EDIT Background:
The aim of this is to present the function parameters in a window for the user to enter. I'm writing a kind of signal processing program, and functions to perform operations on these signals are stored in a subfolder. I already have a list and the names of each function from which the user can select, but some functions require additional arguments (e.g. a smooth function might take window size as a parameter).
At the moment, I can add a new function to the subfolder which the program will find, and the user can select it to perform an operation. What I'm missing is a way for the user to specify the input and output parameters, and I've hit a hurdle in that I can't find the names of the function's parameters.
MATLAB offers a way to get information about class metadata (using the meta package), however this is only available for OOP classes not regular functions.
One trick is to write a class definition on the fly that contains the source of the function you would like to process, and let MATLAB deal with the parsing of the source code (which can be tricky, as you'd imagine: a function definition line that spans multiple lines, comments before the actual definition, etc.).
So the temporary file created in your case would look like:
classdef SomeTempClassName
methods
function [value, remain] = divide(left, right)
%# ...
end
end
end
which can be then passed to meta.class.fromName to parse for metadata...
Here is a quick-and-dirty implementation of this hack:
function [inputNames,outputNames] = getArgNames(functionFile)
%# get some random file name
fname = tempname;
[~,fname] = fileparts(fname);
%# read input function content as string
str = fileread(which(functionFile));
%# build a class containing that function source, and write it to file
fid = fopen([fname '.m'], 'w');
fprintf(fid, 'classdef %s; methods;\n %s\n end; end', fname, str);
fclose(fid);
%# terminating function definition with an end statement is not
%# always required, but now becomes required with classdef
missingEndErrMsg = 'An END might be missing, possibly matching CLASSDEF.';
c = checkcode([fname '.m']); %# run mlint code analyzer on file
if ismember(missingEndErrMsg,{c.message})
% append "end" keyword to class file
str = fileread([fname '.m']);
fid = fopen([fname '.m'], 'w');
fprintf(fid, '%s \n end', str);
fclose(fid);
end
%# refresh path to force MATLAB to detect new class
rehash
%# introspection (deal with cases of nested/sub-function)
m = meta.class.fromName(fname);
idx = find(ismember({m.MethodList.Name},functionFile));
inputNames = m.MethodList(idx).InputNames;
outputNames = m.MethodList(idx).OutputNames;
%# delete temp file when done
delete([fname '.m'])
end
and simply run as:
>> [in,out] = getArgNames('divide')
in =
'left'
'right'
out =
'value'
'remain'
If your problem is limited to the simple case where you want to parse the function declaration line of a primary function in a file (i.e. you won't be dealing with local functions, nested functions, or anonymous functions), then you can extract the input and output argument names as they appear in the file using some standard string operations and regular expressions. The function declaration line has a standard format, but you have to account for a few variations due to:
Varying amounts of white space or blank lines,
The presence of single-line or block comments, and
Having the declaration broken up on more than one line.
(It turns out that accounting for a block comment was the trickiest part...)
I've put together a function get_arg_names that will handle all the above. If you give it a path to the function file, it will return two cell arrays containing your input and output parameter strings (or empty cell arrays if there are none). Note that functions with variable input or output lists will simply list 'varargin' or 'varargout', respectively, for the variable names. Here's the function:
function [inputNames, outputNames] = get_arg_names(filePath)
% Open the file:
fid = fopen(filePath);
% Skip leading comments and empty lines:
defLine = '';
while all(isspace(defLine))
defLine = strip_comments(fgets(fid));
end
% Collect all lines if the definition is on multiple lines:
index = strfind(defLine, '...');
while ~isempty(index)
defLine = [defLine(1:index-1) strip_comments(fgets(fid))];
index = strfind(defLine, '...');
end
% Close the file:
fclose(fid);
% Create the regular expression to match:
matchStr = '\s*function\s+';
if any(defLine == '=')
matchStr = strcat(matchStr, '\[?(?<outArgs>[\w, ]*)\]?\s*=\s*');
end
matchStr = strcat(matchStr, '\w+\s*\(?(?<inArgs>[\w, ]*)\)?');
% Parse the definition line (case insensitive):
argStruct = regexpi(defLine, matchStr, 'names');
% Format the input argument names:
if isfield(argStruct, 'inArgs') && ~isempty(argStruct.inArgs)
inputNames = strtrim(textscan(argStruct.inArgs, '%s', ...
'Delimiter', ','));
else
inputNames = {};
end
% Format the output argument names:
if isfield(argStruct, 'outArgs') && ~isempty(argStruct.outArgs)
outputNames = strtrim(textscan(argStruct.outArgs, '%s', ...
'Delimiter', ','));
else
outputNames = {};
end
% Nested functions:
function str = strip_comments(str)
if strcmp(strtrim(str), '%{')
strip_comment_block;
str = strip_comments(fgets(fid));
else
str = strtok([' ' str], '%');
end
end
function strip_comment_block
str = strtrim(fgets(fid));
while ~strcmp(str, '%}')
if strcmp(str, '%{')
strip_comment_block;
end
str = strtrim(fgets(fid));
end
end
end
This is going to be very hard (read: impossible) to do for general functions (think of things like varargin, etc). Also, in general, relying on variable names as a form of documentation might be... not what you want. I'm going to suggest a different approach.
Since you control the program, what about specifying each module not just with the m-file, but also with a table entry with extra information. You could document the extra parameters, the function itself, notate when options are booleans and present them as checkboxes, etc.
Now, where to put this? I would suggest to have the main m-file function return the structure, as sort of a module loading step, with a function handle that points to the subfunction (or nested function) that does the real work. This preserves the single-file setup that I'm sure you want to keep, and makes for a much more configurable setup for your modules.
function module = divide_load()
module.fn = @my_divide;
module.name = 'Divide';
module.description = 'Divide two signals';
module.param(1).name = 'left';
module.param(1).description = 'left signal';
module.param(1).required_shape = 'columnvector';
% Etc, etc.
function [value, remain] = my_divide(left, right)
value = floor(left / right);
remain = left / right - value;
end
end
When you can't get information from a programming language about its contents (e.g., "reflection"), you have to step outside the language.
Another poster suggested "regular expressions", which always fail when applied to parsing real programs because regexps cannot parse context-free languages.
To do this reliably, you need a real M language parser, that will give you access to the parse tree. Then this is fairly easy.
Our DMS Software Reengineering Toolkit has an M language parser available for it, and could do this.
Have you considered using map containers?
You can write your functions along these lines . . .
function [outMAP] = divide(inMAP)
outMAP = containers.Map();
outMAP('value') = floor(inMAP('left') / inMAP('right'));
outMAP('remain') = inMAP('left') / inMAP('right') - outMAP('value');
end
...and call them like this ...
inMAP = containers.Map({'left', 'right'}, {4, 5});
outMAP = divide(inMAP);
...and then simply examine the variable names using the following syntax...
>> keys(inMAP)
ans =
'left' 'right'
inputname(argnum) http://www.mathworks.com/help/techdoc/ref/inputname.html .

creating variables from structures in matlab

I have the following example which expresses the type of problem that I'm trying to solve:
clear all
textdata = {'DateTime','St','uSt','Ln','W'};
data = rand(365,4);
Final = struct('data',data,'textdata',{textdata})
clear textdata data
From this, Final.data contains values which correspond to the headings in Final.textdata excluding the first ('DateTime') thus Final.data(:,1) corresponds to the heading 'St'... and so on. What I'm trying to do is to create a variable in the workspace for each of these vectors. So, I would have a variable for St, uSt, Ln, and W in the workspace with the corresponding values given in Final.data.
How could this be done?
Will this solve your problem:
for ii=2:length( Final.textdata )
assignin('base',Final.textdata{ii},Final.data(:,ii-1));
end
Let me know if I misunderstood.
The direct answer to your question is to use the assignin function, like so (edit: just like macduff suggested 10 minutes ago):
%Starting with a Final structure containing the data, like this
Final.textdata = {'DateTime','St','uSt','Ln','W'};
Final.data = rand(365,4);
for ix = 1:4
assignin('base',Final.textdata{ix+1}, Final.data(:,ix));
end
However, I strongly discourage using dynamic variable names to encode data like this. Code that starts this way usually ends up as spaghetti code full of long string concatenations and eval statements. Better is to use a structure, like this
for ix = 1:4
dataValues.(Final.textdata{ix+1}) = Final.data(:,ix);
end
Or, to get the same result in a single line:
dataValues = cell2struct(num2cell(Final.data,1), Final.textdata(2:end),2)

Clever way to assign multiple fields at once?

Due to legacy function calls I'm sometimes forced to write ugly wrappers like this
function result = someWrapper(someField)
a = someField.a;
b = someField.b;
% and so on, realistically it's more like ten variables that
% could actually be grouped in a struct
save('params.mat', 'a', 'b'); %etc.
% then, on another machine, a function loads params.mat, does the calculations
% and saves the result in result.mat containing the variables c,d,...
load('result.mat', 'c', 'd');
result.c = c;
result.d = d;
% again, it's more than just two return values
So the basic idea is to create variables with the same names as someField's fieldnames, run a function and create a return structure using someFunction's return variable's names as fieldnames.
Is there some way simplify this using some loop e.g. over fieldnames(someField)?
Or should I actually use some different approach? Since some further processing is done with someField and result I'd like to keep using structs, but maybe a second question would be
Can save and load redirect variable names? I.e. could e.g. the variable a in params.mat be stored using someField.a as value instead of having to assign a = someField.a first?
Why not something like this?
if this is s:
s.a=1
s.b=2
s.c=3
Then this command creates a matfile named "arguments" with variables a, b, c:
save arguments.mat -struct s
And this command loads a MAT-file's variables into a structure:
r = load('arguments.mat')
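Applied to the wrapper from the question, the two commands could be combined roughly as below. This is a sketch under assumptions: the file name comes from tempname instead of params.mat, and the file is loaded straight back to show the round trip, whereas the real wrapper would load the result.mat written by the other machine.

```matlab
% Input struct standing in for someField from the question.
someField = struct('a', 1, 'b', 2);

% Temporary MAT-file used only for this illustration.
paramsFile = [tempname '.mat'];

% Each field of someField is stored as its own variable (a, b).
save(paramsFile, '-struct', 'someField');

% Calling load with an output argument returns the variables as struct fields.
result = load(paramsFile);

delete(paramsFile);
disp(result)
```

No intermediate variables a, b, c, d are created in the wrapper's workspace, so the list of fields never has to be spelled out.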
How about using ASSIGNIN and dynamic fieldnames to loop over the structure fields and create the appropriate variables in the workspace:
function struct2base(s)
for f = fieldnames(s)'
assignin('base', f{:}, s.(f{:}))
end
Have a look at the deal() function.
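For reference, a minimal sketch of how deal could unpack struct fields into variables; the struct and variable names are illustrative, and the output list still has to be written out by hand in field order.

```matlab
% Example struct standing in for someField from the question.
someField = struct('a', 1, 'b', 2);

% Collect the field values in a cell array, in field order.
vals = struct2cell(someField);

% deal copies each input to the corresponding output.
[a, b] = deal(vals{:});

fprintf('a = %d, b = %d\n', a, b);
```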