How to get a comma-separated list directly from table.Properties.VariableNames? - matlab

Example:
>> A = table({}, {}, {}, {}, {}, ...
'VariableNames', {'Foo', 'Bar', 'Baz', 'Frobozz', 'Quux'});
>> vn = A.Properties.VariableNames;
>> isequal(vn, A.Properties.VariableNames)
ans =
1
So far so good, but even though vn and A.Properties.VariableNames appear to be the same, they behave very differently when one attempts to get a "comma-separated list" from them (using {:}):
>> {'Frobnitz', vn{:}}
ans =
'Frobnitz' 'Foo' 'Bar' 'Baz' 'Frobozz' 'Quux'
>> {'Frobnitz', A.Properties.VariableNames{:}}
ans =
'Frobnitz' 'Foo'
Is there a way to get a "comma-separated list" from A.Properties.VariableNames directly (that is, without having to create an intermediate variable like vn)?
(Also, is there a more reliable function than isequal to test for equality of cell arrays? In the example above vn and A.Properties.VariableNames are clearly not equal enough!)
For those who don't have a version of MATLAB that supports the (rather new) table objects, it's the same story if one uses dataset objects (from the Statistics toolbox) instead. The example above would then translate to:
clear('A', 'vn');
A = dataset({}, {}, {}, {}, {}, ...
'VarNames', {'Foo', 'Bar', 'Baz', 'Frobozz', 'Quux'});
vn = A.Properties.VarNames;
isequal(vn, A.Properties.VarNames)
{'Frobnitz', vn{:}}
{'Frobnitz', A.Properties.VarNames{:}}
(Note the change from VariableNames to VarNames. Output is omitted because it's identical to the output shown above.)

There's no problem with isequal. vn and A.Properties.VariableNames are in fact equal. The problem is something else...
If you type help dataset.subsref, you will get an explanation of why this is happening which should be the same explanation as for the table class:
LIMITATIONS:
Subscripting expressions such as A.CellVar{1:2}, A.StructVar(1:2).field,
or A.Properties.ObsNames{1:2} are valid, but result in subsref
returning multiple outputs in the form of a comma-separated list. If
you explicitly assign to output arguments on the LHS of an assignment,
for example, [cellval1,cellval2] = A.CellVar{1:2}, those variables will
receive the corresponding values. However, if there are no output
arguments, only the first output in the comma-separated list is
returned.
In short, when you invoke the line A.Properties.VarNames{:}, you are making a call to the dataset.subsref method and the curly-brace subscript {:} is being passed to it right along with the other . subscripts, as opposed to being applied separately after the call to the dataset.subsref method.
Because of this, it doesn't look like you can get a comma-separated list directly from A without using an intermediate variable. However, if your goal (as in your example) is to concatenate the strings together with another string into a new cell array, you could do this:
>> [{'Frobnitz'} A.Properties.VarNames]
ans =
'Frobnitz' 'Foo' 'Bar' 'Baz' 'Frobozz' 'Quux'
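To see that concatenation gives the same result as {:} expansion, here is a minimal sketch. It uses a plain cell array named names as a stand-in for the property, on the assumption that the property is a 1-by-N cell array of names, so it runs without a table or dataset object:

```matlab
% Plain cell array standing in for A.Properties.VarNames (assumption:
% the property holds a 1-by-N cell array of names).
names = {'Foo', 'Bar', 'Baz', 'Frobozz', 'Quux'};

% Expansion via {:}, which needs the intermediate variable.
viaList = {'Frobnitz', names{:}};

% Concatenation of two cell arrays, which needs no {:} at all.
viaCat = [{'Frobnitz'}, names];

disp(isequal(viaList, viaCat));   % displays 1
```

Because the concatenation never asks subsref for multiple outputs, the limitation quoted above should not come into play.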

No, I don't think there is anything you can do except create the temporary variable vn. It's long been a troubling shortcoming of user-defined classes that they cannot do comma separated list expansion. I do find it strange, though, that TMW chose to implement the table class in the user-defined class framework.
As for isequal, there is no issue there. The behavior you see has nothing to do with vn and A.Properties.VariableNames not being equal.

Related

Can I write multiple statements in an anonymous function? [duplicate]

I'd like to do something like this:
>> foo = @() functionCall1() functionCall2()
So that when I said:
>> foo()
It would execute functionCall1() and then execute functionCall2(). (I feel that I need something like the C , operator)
EDIT:
functionCall1 and functionCall2 are not necessarily functions that return values.
Trying to do everything via the command line without saving functions in m-files may be a complicated and messy endeavor, but here's one way I came up with...
First, make your anonymous functions and put their handles in a cell array:
fcn1 = @() ...;
fcn2 = @() ...;
fcn3 = @() ...;
fcnArray = {fcn1 fcn2 fcn3};
...or, if you have functions already defined (like in m-files), place the function handles in a cell array like so:
fcnArray = {@fcn1 @fcn2 @fcn3};
Then you can make a new anonymous function that calls each function in the array using the built-in functions cellfun and feval:
foo = @() cellfun(@feval,fcnArray);
Although funny-looking, it works.
EDIT: If the functions in fcnArray need to be called with input arguments, you would first have to make sure that ALL of the functions in the array require THE SAME number of inputs. In that case, the following example shows how to call the array of functions with one input argument each:
foo = @(x) cellfun(@feval,fcnArray,x);
inArgs = {1 'a' [1 2 3]};
foo(inArgs); %# Passes 1 to fcn1, 'a' to fcn2, and [1 2 3] to fcn3
WORD OF WARNING: The documentation for cellfun states that the order in which the output elements are computed is not specified and should not be relied upon. This means that there are no guarantees that fcn1 gets evaluated before fcn2 or fcn3. If order matters, the above solution shouldn't be used.
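If order does matter and a small helper function is acceptable, a plain loop sidesteps the cellfun ordering caveat. The sketch below is one possible shape, not part of the original answer: runInOrder is a hypothetical name, and defining it at the end of a script assumes R2016b or later (on older releases it would go in its own runInOrder.m file).

```matlab
% Usage: build the handle once, then call it like any other function.
print1 = @() fprintf('1\n');
print2 = @() fprintf('2\n');
foo = @() runInOrder({print1, print2});
foo();   % prints 1, then 2

function runInOrder(fcnArray)
% Hypothetical helper: call each handle in fcnArray from first to last.
for k = 1:numel(fcnArray)
    fcnArray{k}();   % no output is requested, so void functions are fine
end
end
```

The for loop runs its iterations sequentially, so the handles are called in the order they appear in the cell array.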
The anonymous function syntax in Matlab (like some other languages) only allows a single expression. Furthermore, it has different variable binding semantics (variables which are not in the argument list have their values lexically bound at function creation time, instead of references being bound). This simplicity allows Mathworks to do some optimizations behind the scenes and avoid a lot of messy scoping and object lifetime issues when using them in scripts.
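A short illustration of that binding rule, using made-up names k and scale:

```matlab
k = 2;
scale = @(x) k * x;   % the value of k (2) is captured when the handle is created
k = 10;               % changing k afterwards does not affect scale
disp(scale(3));       % displays 6, not 30
```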
If you are defining this anonymous function within a function (not a script), you can create named inner functions. Inner functions have normal lexical reference binding and allow arbitrary numbers of statements.
function F = createfcn(a,...)
  F = @myfunc;
  function b = myfunc(...)
    a = a+1;
    b = a;
  end
end
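As a concrete, runnable instance of the same pattern, here is a sketch with the elided argument lists filled in by assumption. makeCounter and next are hypothetical names, and the function is meant to be saved as makeCounter.m:

```matlab
function F = makeCounter(a)
% Return a handle to a nested function that shares the variable a.
F = @next;
    function b = next()
        a = a + 1;   % updates the a that belongs to makeCounter
        b = a;
    end
end
```

With c = makeCounter(0), the first call c() returns 1 and the second returns 2, because the nested function keeps a reference to a rather than a copy of its value.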
Sometimes you can get away with tricks like gnovice's suggestion.
Be careful about using eval... it's very inefficient (it bypasses the JIT), and Matlab's optimizer can get confused between variables and functions from the outer scope that are used inside the eval expression. It's also hard to debug and/or extend code that uses eval.
Here is a method that will guarantee execution order and (with modifications mentioned at the end) allows passing different arguments to different functions.
call1 = @(a,b) a();
call12 = @(a,b) call1(b,call1(a,b));
The key is call1 which calls its first argument and ignores its second. call12 calls its first argument and then its second, returning the value from the second. It works because a function cannot be evaluated before its arguments. To create your example, you would write:
foo = @() call12(@functionCall1, @functionCall2);
Test Code
Here is the test code I used:
>> print1=@()fprintf('1\n');
>> print2=@()fprintf('2\n');
>> call12(print1,print2)
1
2
Calling more functions
To call 3 functions, you could write
call1(print3, call1(print2, call1(print1,print2)));
4 functions:
call1(print4, call1(print3, call1(print2, call1(print1,print2))));
For more functions, continue the nesting pattern.
Passing Arguments
If you need to pass arguments, you can write a version of call1 that takes arguments and then make the obvious modification to call12.
call1arg1 = @(a,arg_a,b) a(arg_a);
call12arg1 = @(a, arg_a, b, arg_b) call1arg1(b, arg_b, call1arg1(a, arg_a, b))
You can also make versions of call1 that take multiple arguments and mix and match them as appropriate.
It is possible, using the curly function which is used to create a comma separated list.
curly = @(x, varargin) x{varargin{:}};
f=@(x)curly({exp(x),log(x)})
[a,b]=f(2)
If functionCall1() and functionCall2() return something and those somethings can be concatenated, then you can do this:
>> foo = @() [functionCall1(), functionCall2()]
or
>> foo = @() [functionCall1(); functionCall2()]
A side effect of this is that foo() will return the concatenation of whatever functionCall1() and functionCall2() return.
I don't know if the execution order of functionCall1() and functionCall2() is guaranteed.

What is 'cat' param used for in TreeBagger method

I am following the tutorial and am trying to implement TreeBagger Method. I have a question since I cannot understand part of the code.
b = TreeBagger(nTrees,X,Y,'oobpred','on','cat',6,'minleaf',leaf(ii));
Can anyone tell me what 'cat' is and the number 6 please?
The constructor for TreeBagger:
% In addition to the optional arguments above, this method accepts all
% optional CLASSREGTREE arguments with the exception of 'minparent'.
% Refer to the documentation for CLASSREGTREE for more detail.
'cat' is not one of the valid input pairs for TreeBagger, so it must be an input for CLASSREGTREE. Looking at the input pairs for classregtree, the only input pair close to 'cat' is 'categorical,' which says:
% 'categorical' Vector of indices of the columns of X that are to be
% treated as unordered categorical variables
If you look at statgetargs.m, specifically this line:
i = strmatch(lower(pname),pnames);
It will allow any arguments as long as the first portion is spelled correctly. pnames will contain a cell array of valid strings (one of them will be 'categorical') while pname will contain a string to compare pnames with (eventually, this will contain 'cat'). If you enter only the first portion of the input string, it will still work. I.e. for me this works:
EDU>> a = TreeBagger(nTrees,X,Y,'oobpr','on','cat',6,'minle',leaf(ii));
EDU>> b = TreeBagger(nTrees,X,Y,'oobpred','on','cat',6,'minleaf',leaf(ii));
EDU>> isequal(a,b)
ans =
1
The isequal comparison no longer returns 1 if 'cat' is spelled differently in one of the two calls, because the parameter name is stored under TreeArgs exactly as it was typed. Regardless, 'cat' is being treated as 'categorical' for classregtree.
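The prefix matching described above can be sketched as follows. This is an illustration only, not the actual code in statgetargs.m: it uses strncmpi in place of strmatch, and the list in pnames is a made-up subset of the valid parameter names.

```matlab
pnames = {'categorical', 'minleaf', 'oobpred'};   % illustrative subset only
pname  = 'cat';                                   % abbreviation typed by the user

% Compare only the first length(pname) characters, ignoring case.
idx = find(strncmpi(pname, pnames, length(pname)));

disp(pnames{idx});   % displays categorical
```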
cat is being treated as an abbreviation of the categorical input parameter of classregtree, and it specifies that the sixth variable in X should be treated as categorical.

Variable length MATLAB arguments read from variable

I have a function with variable arguments, declared in the standard way:
function foo(varargin)
and I would like to call it from another function, but specify the arguments programmatically. My best attempt is something like the following:
% bar isn't populated like this, but this is how it ends up
bar = { 'var1' 'var2' 'var3' };
foo( bar );
However, bar is put into a 1x1 cell array, and not interpreted as a 1x3 cell array as I intended. I can't change foo, so is there a workaround?
If you have variables a, b, and c that you want to collect together somewhere and ultimately pass to a function as a series of inputs, you can do the following:
inArgs = {a b c}; % Put values in a cell array
foo(inArgs{:});
The syntax inArgs{:} extracts all the values from the cell array as a comma-separated list. The above is therefore equivalent to this:
foo(a,b,c);
If foo is written to accept a variable-length argument list, then the varargin variable will end up being a 1-by-3 cell array where each element stores a separate input argument. Basically, varargin will look exactly like the variable inArgs. If your call to foo didn't use the {:} operator:
foo(inArgs);
then the varargin variable would be a 1-by-1 cell array where the first element is itself the cell array inArgs. In other words, foo would have only 1 input (a 1-by-3 cell array).
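The difference is easy to check with a small stand-in for foo. Here countArgs is a hypothetical helper that only reports how many inputs it received:

```matlab
countArgs = @(varargin) numel(varargin);   % hypothetical stand-in for foo

bar = {'var1', 'var2', 'var3'};

disp(countArgs(bar));      % displays 1: the whole cell array is one input
disp(countArgs(bar{:}));   % displays 3: each element is a separate input
```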
The only way that I'm aware of is to use eval, however I don't have MATLAB here, so I can't check the syntax correctly.
If you coerce the bar into a string of the form "'var1', 'var2', 'var3'", you can do:
eval(['foo(', barString, ')'])
Hope that gets you going and sorry it isn't a comprehensive answer.

MATLAB "bug" (or really weird behavior) with structs and empty cell arrays

I have no idea what's going on here. I'm using R2006b. Any chance someone out there with a newer version could test to see if they get the same behavior, before I file a bug report?
code: (bug1.m)
function bug1
S = struct('nothing',{},'something',{});
add_something(S, 'boing'); % does what I expect
add_something(S.something,'test'); % weird behavior
end
function add_something(X,str)
disp('X=');
disp(X);
disp('str=');
disp(str);
end
output:
>> bug1
X=
str=
boing
X=
test
str=
??? Input argument "str" is undefined.
Error in ==> bug1>add_something at 11
disp(str);
Error in ==> bug1 at 4
add_something(S.something,'test');
It looks like the emptiness/nothingness of S.something allows it to shift the arguments for a function call. This seems like Very Bad Behavior. In the short term I want to find a way around it (I'm trying to make a function that adds items to an initially empty cell array that's a member of a structure).
Edit:
Corollary question: so there's no way to construct a struct literal containing any empty cell arrays?
As you already discovered yourself, this isn't a bug but a "feature". In other words, it is the normal behavior of the STRUCT function. If you pass empty cell arrays as field values to STRUCT, it assumes you want an empty structure array with the given field names.
>> s=struct('a',{},'b',{})
s =
0x0 struct array with fields:
a
b
To pass an empty cell array as an actual field value, you would do the following:
>> s = struct('a',{{}},'b',{{}})
s =
a: {}
b: {}
Incidentally, any time you want to set a field value to a cell array using STRUCT, you have to wrap it in another cell array. For example, this creates a single structure element with fields that contain a cell array and a vector:
>> s = struct('strings',{{'hello','yes'}},'lengths',[5 3])
s =
strings: {'hello' 'yes'}
lengths: [5 3]
But this creates an array of two structure elements, distributing the cell array but replicating the vector:
>> s = struct('strings',{'hello','yes'},'lengths',[5 3])
s =
1x2 struct array with fields:
strings
lengths
>> s(1)
ans =
strings: 'hello'
lengths: [5 3]
>> s(2)
ans =
strings: 'yes'
lengths: [5 3]
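As a quick check of the two forms, and of one alternative that avoids the double braces, consider the sketch below. The names s0, s1 and s2 are made up for illustration, and assigning the fields after creating the struct is a swapped-in technique rather than something the answer above proposes:

```matlab
s0 = struct('a', {}, 'b', {});       % empty cells as values: 0x0 struct array
s1 = struct('a', {{}}, 'b', {{}});   % wrapped cells: 1x1 struct, fields hold {}

disp(size(s0));        % 0 0
disp(size(s1));        % 1 1
disp(iscell(s1.a));    % displays 1

% Alternative: create a scalar struct first, then assign the empty cells.
s2 = struct();
s2.a = {};
s2.b = {};
disp(size(s2));        % 1 1
```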
ARGH... I think I found the answer. struct() has multiple behaviors, including:
Note If any of the values fields is
an empty cell array {}, the MATLAB
software creates an empty structure
array in which all fields are also
empty.
and apparently if you pass a member of a 0x0 structure as an argument, it's like some kind of empty phantom that doesn't really show up in the argument list. (that's still probably a bug)
bug2.m:
function bug2(arg1, arg2)
disp(sprintf('number of arguments = %d\narg1 = ', nargin));
disp(arg1);
test case:
>> nothing = struct('something',{})
nothing =
0x0 struct array with fields:
something
>> bug2(nothing,'there')
number of arguments = 2
arg1 =
>> bug2(nothing.something,'there')
number of arguments = 1
arg1 =
there
This behaviour persists in R2008b, and is in fact not really a bug (although I wouldn't say the designers intended it):
When you step into add_something(S,'boing') and watch the first argument (say by selecting it and pressing F9), you'd get some output relating to the empty structure S.
Step into add_something(S.something,'test') and watch the first argument, and you'd see it's in fact interpreted as 'test' !
The syntax struct.fieldname is designed to return a comma-separated list, with one value per element of the struct array. A function call binds its argument names to the values in that list, in the order they are passed. In your case the first argument expands to an empty list, so the list the function receives really starts at the second value you pass, namely 'test'.
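The argument shift can be reproduced without a debugger. In this sketch countArgs is a hypothetical helper that reports how many inputs it received, and wrapping the expression in braces is one possible way to keep it as a single argument:

```matlab
countArgs = @(varargin) numel(varargin);   % hypothetical helper

S = struct('something', {});   % 0x0 struct array
c = {};                        % empty cell array

disp(countArgs(S, 'test'));               % displays 2: S itself is one argument
disp(countArgs(S.something, 'test'));     % displays 1: the list is empty
disp(countArgs(c{:}, 'test'));            % displays 1: same effect with {:}
disp(countArgs({S.something}, 'test'));   % displays 2: braces collect the list
```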
Output is identical in R2008b:
>> bug1
X=
str=
boing
X=
test
str=
??? Input argument "str" is undefined.
Error in ==> bug1>add_something at 11
disp(str);
Error in ==> bug1 at 4
add_something(S.something,'test'); % weird behavior