Call multiple functions from cells in MATLAB - matlab

I store some functions in a cell array, e.g. f = {@sin, @cos, @(x)x+4}.
Is it possible to call all those functions at the same time (with the same input)? I mean something more efficient than using a loop.

As constructed, the *fun family of functions exists for this purpose (cellfun is the pertinent one here). There are other questions on the use and performance of these functions.
However, if you construct f as a function that constructs a cell array as
f = @(x) {sin(x), cos(x), x+4};
then you can call the function more naturally: f([1,2,3]) for example.
This method also avoids the ('UniformOutput',false) option pair that cellfun needs for a non-scalar argument.
You can also use regular double arrays, but then you need to be wary of input shape for concatenation purposes: @(x) [sin(x), cos(x), x+4] vs. @(x) [sin(x); cos(x); x+4].
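As a rough sketch of the two approaches described above (the names g, res_cellfun and res_direct are only illustrative), the cell array of handles can be evaluated with cellfun, while the single handle that returns a cell array can be called directly. Both should hold the same three vectors:

```matlab
% Cell array of function handles, as in the question
f = {@sin, @cos, @(x) x+4};
x = [1, 2, 3];

% cellfun calls each handle with the same input; 'UniformOutput', false is
% required because each call returns a 1-by-3 vector rather than a scalar
res_cellfun = cellfun(@(fn) fn(x), f, 'UniformOutput', false);

% Alternative: one handle that builds the cell array itself
g = @(x) {sin(x), cos(x), x+4};
res_direct = g(x);

disp(isequal(res_cellfun, res_direct))
```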

I'm just posting these benchmarking results here to illustrate that loops are not necessarily slower than other approaches:
f = {@sin, @cos, @(x)x+4};
x = 1:100;
tic
for ii = 1:1000
    for jj = 1:numel(f)
        res{jj} = f{jj}(x);
    end
end
toc
tic
for ii = 1:1000
    res = cellfun(@(arg) arg(x),f,'uni',0);
end
toc
Elapsed time is 0.042201 seconds.
Elapsed time is 0.179229 seconds.
Troy's answer is almost twice as fast as the loop approach:
f = @(x) {sin(x), cos(x), x+4}; % f redefined as in Troy's answer
tic
for ii = 1:1000
    res = f((1:100).');
end
toc
Elapsed time is 0.025378 seconds.

This might do the trick
functions = {@(arg) sin(arg),@(arg) sqrt(arg)}
x = 5;
cellfun(@(arg) arg(x),functions)
hope this helps.
Adrien.

Related

MATLAB cellfun vectorization slow when using function handle

I encountered a weird bug in cell vectorization (MATLAB version R2019B).
Please consider the following minimal example, say we generate a cell array with variable length vector in each cell:
N = 10000;
rng(1);
result = cell(N,1);
numConnect = randi(10, [N,1]); % randomly generated number of connected nodes
for i = 1:N
result{i} = randi(N, [1, numConnect(i)]);
end
Now suppose we want to retrospectively retrieve numConnect, i.e., the length of each cell. We can use cellfun. According to this documentation, in Backward Compatibility mode, you can pass a string as the func argument instead of a function handle. However, there is a drastic difference in performance on my machine.
tic;
nC1 = cellfun('length', result);
toc;
This one usually produces something like
Elapsed time is 0.038531 seconds.
If I change to an @ function handle:
tic;
nC2 = cellfun(@length, result);
toc;
Then
Elapsed time is 1.041925 seconds.
is normal. There is a 30x difference!
Is this performance difference a bug on my local machine, or a "feature" of MATLAB's cellfun?
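For comparison, here is a small sketch (with a smaller N and variable names chosen only for illustration) that checks that the legacy string form and the handle form return the same lengths. It also tries @numel, which is equivalent to length for vectors. Timings depend on the machine and release, so none are quoted here.

```matlab
N = 1000;
result = cell(N,1);
numConnect = randi(10, [N,1]);
for i = 1:N
    result{i} = randi(N, [1, numConnect(i)]);
end

tic; nC1 = cellfun('length', result); t1 = toc;   % legacy string form
tic; nC2 = cellfun(@length, result);  t2 = toc;   % function handle form
tic; nC3 = cellfun(@numel, result);   t3 = toc;   % numel equals length for vectors

fprintf('string: %g s, @length: %g s, @numel: %g s\n', t1, t2, t3);
```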

Quickly Evaluating MANY matlabFunctions

This post builds on my post about quickly evaluating analytic Jacobian in Matlab:
fast evaluation of analytical jacobian in MATLAB
The key difference is that now, I am working with the Hessian and I have to evaluate close to 700 matlabFunctions (instead of 1 matlabFunction, like I did for the Jacobian) each time the hessian is evaluated. So there is an opportunity to do things a little differently.
I have tried to do this two ways so far and I am thinking about implementing a third and was wondering if anyone has any other suggestions. I will go through each method with a toy example, but first some preprocessing to generate these matlabFunctions:
PreProcessing:
% This part of the code is calculated once, it is not the issue
dvs = 5;
X=sym('X',[dvs,1]);
num = dvs - 1; % number of constraints
% multiple functions
for k = 1:num
f1(X(k+1),X(k)) = (X(k+1)^3 - X(k)^2*k^2);
c(k) = f1;
end
gradc = jacobian(c,X).'; % .' performs transpose
parfor k = 1:num
hessc{k} = jacobian(gradc(:,k),X);
end
parfor k = 1:num
hess_name = strcat('hessian_',num2str(k));
matlabFunction(hessc{k},'file',hess_name,'vars',X);
end
METHOD #1 : Evaluate functions in series
%% Now we use the functions to run an "optimization." Just for an example the "optimization" is just a for loop
fprintf('This is test A, where the functions are evaluated in series!\n');
tic
for q = 1:10
x_dv = rand(dvs,1); % these are the design variables
lambda = rand(num,1); % these are the lagrange multipliers
x_dv_cell = num2cell(x_dv); % for passing large design variables
for k = 1:num
hess_name = strcat('hessian_',num2str(k));
function_handle = str2func(hess_name);
H_temp(:,:,k) = lambda(k)*function_handle(x_dv_cell{:});
end
H = sum(H_temp,3);
end
fprintf('The time for test A was:\n')
toc
METHOD # 2: Evaluate functions in parallel
%% Try to run a parfor loop
fprintf('This is test B, where the functions are evaluated in parallel!\n');
tic
for q = 1:10
x_dv = rand(dvs,1); % these are the design variables
lambda = rand(num,1); % these are the lagrange multipliers
x_dv_cell = num2cell(x_dv); % for passing large design variables
parfor k = 1:num
hess_name = strcat('hessian_',num2str(k));
function_handle = str2func(hess_name);
H_temp(:,:,k) = lambda(k)*function_handle(x_dv_cell{:});
end
H = sum(H_temp,3);
end
fprintf('The time for test B was:\n')
toc
RESULTS:
METHOD #1 = 0.008691 seconds
METHOD #2 = 0.464786 seconds
DISCUSSION of RESULTS
This result makes sense because the functions evaluate very quickly, and running them in parallel wastes a lot of time setting up and sending out the jobs to the different MATLAB workers (and then getting the data back from them). I see the same result on my actual problem.
METHOD # 3: Evaluating the functions using the GPU
I have not tried this yet, but I am interested to see what the performance difference is. I am not yet familiar with doing this in Matlab and will add it once I am done.
Any other thoughts? Comments? Thanks!
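One further option worth measuring, sketched here under assumptions: create the function handles once, outside the optimization loop, and accumulate the weighted sum directly instead of filling a 3-D H_temp array. The two anonymous functions below are hypothetical stand-ins for the generated hessian_k files, and hess_handles is an illustrative name, not part of the original code.

```matlab
dvs = 3;   % number of design variables (small, for illustration)
num = 2;   % number of constraints

% Hypothetical stand-ins for the generated hessian_k files. With the real
% files this could be built once as:
%   hess_handles = arrayfun(@(k) str2func(sprintf('hessian_%d', k)), ...
%                           1:num, 'UniformOutput', false);
hess_handles = { @(x1,x2,x3) [2*x1, 0, 0; 0, 0, 0; 0, 0, 0], ...
                 @(x1,x2,x3) [0, 0, 0; 0, 6*x2, 0; 0, 0, 0] };

x_dv = rand(dvs,1);        % design variables
lambda = rand(num,1);      % Lagrange multipliers
x_dv_cell = num2cell(x_dv);

% Accumulate the weighted sum without an intermediate H_temp array
H = zeros(dvs);
for k = 1:num
    H = H + lambda(k) * hess_handles{k}(x_dv_cell{:});
end
```

Whether this helps in practice depends on how much of the per-iteration time is spent in strcat, num2str and str2func, so it is best checked with the profiler.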

Initialize vector with function in matlab

I just started out with MATLAB and am having trouble finding a solution for the following:
I am trying to initialize a vector of 1000 different values with a function that doesn't take any arguments as input. I can do this with a for loop, but haven't found out how to do it without one.
What I expected to work:
z = zeros(1,1000)
result = arrayfun(*functionname*,z)
This however gives an error saying that the first input must be a function handle.
My function is a simple implementation of a monte carlo method to calculate pi:
function Result = mcm()
clear
N=1000;
M=0;
for j=1:N
p=[2*rand-1; 2*rand-1];
if p'*p<1
M=M+1;
end
end
Result=4*M/N
One way to actually vectorize your given function mcm would be -
N = 1000; %// Number of data points
P = [2*rand(1,N)-1; 2*rand(1,N)-1]; %// OR 2*rand(2,N)-1
out = 4*sum(sum(P.^2,1)<1)/N
Runtime tests
Code -
N = 1000000; %// Number of data points
disp('---------------- With Original Approach')
tic
M=0;
for j=1:N
P=[2*rand-1; 2*rand-1];
if P'*P<1
M=M+1;
end
end
Result=4*M/N;
toc
disp('---------------- With Proposed Approach')
tic
P = 2*rand(2,N)-1;
out = 4*sum(sum(P.^2,1)<1)/N;
toc
Timings & Outputs -
---------------- With Original Approach
Elapsed time is 3.952998 seconds.
---------------- With Proposed Approach
Elapsed time is 0.089590 seconds.
>> Result
Result =
3.1422
>> out
out =
3.1428
Since your function takes no arguments you can't use arrayfun. arrayfun applies the function to each element in the array.
Instead use this:
z = ones(1,1000) * mcm;
A side benefit is that mcm will only run once so it will be faster than looping that function 1000 times.
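If the goal is 1000 independent estimates rather than 1000 copies of a single one, one possible workaround is to wrap the zero-argument function in an anonymous function that ignores its input. A minimal sketch, where mcm is redefined as an anonymous function only to keep the example self-contained:

```matlab
N = 1000;
% Stand-in for the mcm function above (vectorized Monte Carlo estimate of pi)
mcm = @() 4*sum(sum((2*rand(2,N)-1).^2, 1) < 1)/N;

z = zeros(1,1000);
% @(~) discards the element that arrayfun passes in, so mcm is called
% once per element of z and each call draws fresh random numbers
result = arrayfun(@(~) mcm(), z);
```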

Octave/Matlab: Adding new elements to a vector

I have a vector x and need to append an element (newElem).
Is there any difference between
x(end+1) = newElem;
and
x = [x newElem];
?
x(end+1) = newElem is a bit more robust.
x = [x newElem] will only work if x is a row vector; if it is a column vector, x = [x; newElem] should be used. x(end+1) = newElem, however, works for both row and column vectors.
In general though, growing vectors should be avoided. If you do this a lot, it might bring your code down to a crawl. Think about it: growing an array involves allocating new space, copying everything over, adding the new element, and cleaning up the old mess...Quite a waste of time if you knew the correct size beforehand :)
Just to add to @ThijsW's answer, there is a significant speed advantage to the first method over the concatenation method:
big = 1e5;
tic;
x = rand(big,1);
toc
x = zeros(big,1);
tic;
for ii = 1:big
x(ii) = rand;
end
toc
x = [];
tic;
for ii = 1:big
x(end+1) = rand;
end;
toc
x = [];
tic;
for ii = 1:big
x = [x rand];
end;
toc
Elapsed time is 0.004611 seconds.
Elapsed time is 0.016448 seconds.
Elapsed time is 0.034107 seconds.
Elapsed time is 12.341434 seconds.
I got these times running in R2012b. However, when I ran the same code on the same computer in MATLAB R2010a, I got
Elapsed time is 0.003044 seconds.
Elapsed time is 0.009947 seconds.
Elapsed time is 12.013875 seconds.
Elapsed time is 12.165593 seconds.
So I guess the speed advantage only applies to more recent versions of MATLAB.
As mentioned before, the use of x(end+1) = newElem has the advantage that it allows you to concatenate your vector with a scalar, regardless of whether your vector is transposed or not. Therefore it is more robust for adding scalars.
However, what should not be forgotten is that x = [x newElem] will also work when you try to add multiple elements at once. Furthermore, this generalizes a bit more naturally to the case where you want to concatenate matrices. M = [M M1 M2 M3]
All in all, if you want a solution that allows you to concatenate your existing vector x with newElem that may or may not be a scalar, this should do the trick:
x(end+(1:numel(newElem)))=newElem
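A short sketch of that last expression, using made-up values, suggests that it keeps the orientation of x whether x is a row or a column vector:

```matlab
newElem = [4 5];

xRow = [1 2 3];
xRow(end+(1:numel(newElem))) = newElem;   % stays a row vector

xCol = [1; 2; 3];
xCol(end+(1:numel(newElem))) = newElem;   % stays a column vector

disp(size(xRow))
disp(size(xCol))
```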

why arrayfun does NOT improve my struct array operation performance

here is the input data:
% @param Landmarks:
% Landmarks should be 1*m struct.
% m is the number of training set.
% Landmark(i).data is a n*2 matrix
old function:
function Landmarks=CenterOfGravity(Landmarks)
% align center of gravity
for i=1 : length(Landmarks)
Landmarks(i).data=Landmarks(i).data - ones(size(Landmarks(i).data,1),1)...
*mean(Landmarks(i).data);
end
end
new function, which uses arrayfun:
function [Landmarks] = center_to_gravity(Landmarks)
    Landmarks = arrayfun(@(struct_data)...
        struct('data', struct_data.data - repmat(mean(struct_data.data), [size(struct_data.data, 1), 1]))...
        ,Landmarks);
end %function center_to_gravity
When using the profiler, I find that the time usage is NOT what I expected:
Function            Total Time   Self Time*
CenterOfGravity     0.011 s      0.004 s
center_to_gravity   0.029 s      0.001 s
Can someone tell me why?
BTW...I can't add "arrayfun" as a new tag for my reputation.
Using arrayfun does not count as "vectorizing your code" as described in every Matlab performance blog post ever written.
If your .data field is the same length for all entries of Landmarks, you could vectorize this code by first placing all of the data into a single DATASIZE-BY-LANDMARKSIZE matrix, and then running this command
meanRemovedData = bsxfun(@minus, data, mean(data,1));
But you lose an awful lot of code clarity that way. (I'm pretty sure that bsxfun usually has vectorization-like speed advantages, but I haven't done any time testing this morning.)
In terms of why, I'm not really the right guy to ask. But many of the advantages of vectorization depend on performing simple operations on contiguous blocks of memory. Data stored in an array of structures is (I believe) stored as an array of pointers to disparate memory locations, which is why you can change the size or class of Landmarks(i).data without reallocating the whole structure array.
Thanks to Amro and Pursuit for their enthusiastic responses to my question.
I get the best solution at Matlab answers from Jan Simon:
why arrayfun does NOT improve my struct array operation performance
There are some points that do improve the performance:
Surprisingly, SUM/LENGTH is faster than MEAN.
timeit can give more accurate results.
The fastest approach uses tricks like this:
m = sum(data, 1) / size(data, 1);
data(:, 1) = data(:, 1) - m(1);
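To make the trick concrete, here is a minimal sketch for an n-by-2 data matrix (random sample data; the names centered and ref are illustrative) that centers both columns this way and compares the result against the bsxfun/mean formulation:

```matlab
data = rand(5, 2);                       % sample n-by-2 landmark matrix

% Column means via SUM divided by the row count instead of MEAN
m = sum(data, 1) / size(data, 1);

% Subtract each column's mean separately
centered = data;
centered(:, 1) = data(:, 1) - m(1);
centered(:, 2) = data(:, 2) - m(2);

% Reference result for comparison
ref = bsxfun(@minus, data, mean(data, 1));
disp(max(abs(centered(:) - ref(:))))     % difference should be negligible
```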
Consider the following three implementations (all vectorized using BSXFUN):
function s = func1(s)
    for i=1:numel(s)
        s(i).data = bsxfun(@minus, s(i).data, mean(s(i).data));
    end
end
function v = func2(s)
    v = arrayfun(@(ss) bsxfun(@minus,ss.data,mean(ss.data)), ...
        s, 'UniformOutput',false);
    v = struct('data',v);
end
function v = func3(s)
    v = arrayfun(@(ss) struct('data',bsxfun(@minus,ss.data,mean(ss.data))), ...
        s, 'UniformOutput',true);
end
Explanation:
First uses a for-loop to iterate over the array of structs.
Second uses ARRAYFUN to return a cell array of the data matrices, which are then passed to STRUCT to build the array of structures.
The last one uses ARRAYFUN and builds a structure directly at each iteration.
Here is a simple test to compare the timings:
function testArrayStruct()
%# sample array of structures
s = struct('data',[]);
for i=5000:-1:1
s(i).data = rand(randi(1000),2);
end
%# timing
tic; v1 = func1(s); toc
tic; v2 = func2(s); toc
tic; v3 = func3(s); toc
%# check all have the same output
assert(isequal(v1,v2,v3))
end
The results:
Elapsed time is 0.357796 seconds. %# func1
Elapsed time is 0.427568 seconds. %# func2
Elapsed time is 0.537971 seconds. %# func3
So you can see that the loop-based solution is actually the fastest.