Performance improvement for multiple uses of polyval function - matlab

I have a simple performance question on using polyval function with Matlab.
Currently, I have a vector of x that can be quite long (>1000 scalars). I want to apply a different polynomial form to each of the x.
The polynomial forms are stored in a 2d array and applied in a loop like the code below. The code is relatively fast as polyval is optimized but the loop can be lengthy and performance is paramount as it is an objective function that can be computed thousands of times in a process.
Any idea on how to improve the performance?
Thanks
% ---------- Objective Function ------------------
function [obj] = obj(x, poly_objective)
polyvalue = zeros(length(x),1);
for index = 1: length(x)
polyvalue (index) = polyval(poly_objective(index,:), x(index));
end
obj= -sum(polyvalue );
end
% -------------------------------------------------

You can vectorize your for loop manually; here is an example:
p = [3,2,1;
5,1,3]; %polynomial coeff
x = [5,6].'; %the query points
d = size(p,2)-1:-1:0; %the power factors
res = sum(x.^d.*p,2); %evaluate all the polynomials without a for loop
with
res =
86
189
If you instead want to evaluate every polynomial at every x value, you could use:
res = x.^d*p.'; %using only matrix multiplication
with
res =
p1 p2
x1 86 133
x2 121 189
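Applied to the objective function from the question, the row-wise form could be sketched as below. This is only an illustration: `obj_vec` is a name introduced here, it assumes x has one entry per row of poly_objective, and it relies on implicit expansion (R2016b or later).

```matlab
% Hypothetical vectorized objective; obj_vec is an illustrative name.
% Assumes one query point per row of poly_objective (R2016b+).
obj_vec = @(x, poly_objective) ...
    -sum(sum(x(:).^(size(poly_objective,2)-1:-1:0) .* poly_objective, 2));

% Compare with the loop version on the small example above.
p = [3,2,1; 5,1,3];
x = [5,6];
ref = 0;
for index = 1:numel(x)
    ref = ref - polyval(p(index,:), x(index));
end
val = obj_vec(x, p);   % should agree with ref
```

The exponent vector is built from the number of columns of poly_objective, so the sketch does not depend on how many query points there are.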

The quickest way is likely to evaluate the different polynomials directly, removing the loop (as shown by obchardon or Luis). However, here's a note on polyval performance...
If you type edit polyval in the command window, you can see the source for the polyval function. In particular there is the following conditional evaluation near the top:
nc = length(p);
if isscalar(x) && (nargin < 3) && nc>0 && isfinite(x) && all(isfinite(p(:)))
% Make it scream for scalar x. Polynomial evaluation can be
% implemented as a recursive digital filter.
y = filter(1,[1 -x],p);
y = y(nc);
return
end
I think the "Make it scream" comment is the developer(s) telling us this is a very quick route through the function! As an aside, it's also the best comment I've found in a MATLAB built-in.
So let's try to satisfy the conditions for this if statement...
✓ isscalar(x)
✓ nargin < 3
✓ length(p) > 0
✓ isfinite(x)
✓ all(isfinite(p(:)))
Brilliant, so this is always the evaluation you're using. You might find speed improvements in removing these 5 checks, and simply doing this instead of polyval. In terms of your variables, this looks like so:
y = filter(1,[1 -x(index)],poly_objective(index,:));
polyvalue (index) = y(size(poly_objective,2));
% Note you should get size(poly_objective,2) outside your loop
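To see why the filter call evaluates a polynomial: with a = [1 -x] and b = 1, the filter recursion is y(k) = p(k) + x*y(k-1), which is Horner's rule, so the last output is the polynomial value. A minimal sketch comparing the two, using a small example polynomial chosen here for illustration:

```matlab
% Horner's rule written out explicitly, compared with the filter call.
p = [3 2 1];      % 3*x^2 + 2*x + 1 (example values)
x = 5;

y_horner = 0;
for k = 1:numel(p)
    y_horner = y_horner*x + p(k);   % same recursion the filter implements
end

y_filter = filter(1, [1 -x], p);
y_filter = y_filter(end);           % last sample is the polynomial value

y_ref = polyval(p, x);              % 86 for this example
```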

I find your question a little confusing, but I think this does what you want:
polyvalue = sum(poly_objective .* x(:).^(size(poly_objective,2)-1:-1:0), 2);
Note that the above uses implicit expansion. For Matlab versions before R2016b, use bsxfun:
polyvalue = sum(poly_objective .* bsxfun(@power, x(:), (size(poly_objective,2)-1:-1:0)), 2);
Example
Random data:
>> x = rand(1,4);
>> poly_objective = randi(9,4,4);
Your code:
>> polyvalue = zeros(length(x),1);
for index = 1: length(x)
polyvalue (index) = polyval(poly_objective(index,:), x(index));
end
>> polyvalue
polyvalue =
13.545710504297881
16.286929525147158
13.289183623920710
5.777980886766799
My code:
>> polyvalue = sum(poly_objective .* x(:).^(size(poly_objective,2)-1:-1:0), 2)
polyvalue =
13.545710504297881
16.286929525147158
13.289183623920710
5.777980886766799

Related

Serious performance issue with iterating simulations

I recently stumbled upon a performance problem while implementing a simulation algorithm. I managed to find the bottleneck function (specifically, it's the internal call to arrayfun that slows everything down):
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
x = arrayfun(@(x) find(x <= the_f,1,'first'),r);
sim = (histcounts(x,[1:k Inf]) ./ n).';
end
It is being used in other parts of code as follows:
h0 = zeros(1,sims);
for i = 1:sims
p = simulate_frequency(the_f,k,n);
h0(i) = max(abs(p - the_p));
end
Here are some possible values:
% Test Case 1
sims = 10000;
the_f = [0.3010; 0.4771; 0.6021; 0.6990; 0.7782; 0.8451; 0.9031; 0.9542; 1.0000];
k = 9;
n = 95;
% Test Case 2
sims = 10000;
the_f = [0.0413; 0.0791; 0.1139; 0.1461; 0.1760; 0.2041; 0.2304; 0.2552; 0.2787; 0.3010; 0.3222; 0.3424; 0.3617; 0.3802; 0.3979; 0.4149; 0.4313; 0.4471; 0.4623; 0.4771; 0.4913; 0.5051; 0.5185; 0.5314; 0.5440; 0.5563; 0.5682; 0.5797; 0.5910; 0.6020; 0.6127; 0.6232; 0.6334; 0.6434; 0.6532; 0.6627; 0.6720; 0.6812; 0.6901; 0.6989; 0.7075; 0.7160; 0.7242; 0.7323; 0.7403; 0.7481; 0.7558; 0.7634; 0.7708; 0.7781; 0.7853; 0.7923; 0.7993; 0.8061; 0.8129; 0.8195; 0.8260; 0.8325; 0.8388; 0.8450; 0.8512; 0.8573; 0.8633; 0.8692; 0.8750; 0.8808; 0.8864; 0.8920; 0.8976; 0.9030; 0.9084; 0.9138; 0.9190; 0.9242; 0.9294; 0.9344; 0.9395; 0.9444; 0.9493; 0.9542; 0.9590; 0.9637; 0.9684; 0.9731; 0.9777; 0.9822; 0.9867; 0.9912; 0.9956; 1.000];
k = 90;
n = 95;
The scalar sims must be in the range 1000 to 1000000. The vector of cumulated frequencies the_f never contains more than 100 elements. The scalar k represents the number of elements in the_f. Finally, the scalar n represents the number of elements in the empirical sample vector, and can be very large (up to 10000 elements, as far as I can tell).
Any clue about how to improve the computation time of this process?
This seems to be slightly faster for me in the second test case, not the first. The time differences might be larger for longer the_f and larger values of n.
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
[row,col] = find(r <= the_f); % Implicit singleton expansion going on here!
[~,ind] = unique(col,'first');
x = row(ind);
sim = (histcounts(x,[1:k Inf]) ./ n).';
end
I'm using implicit singleton expansion in r <= the_f; use bsxfun if you have an older version of MATLAB (but you know the drill).
find then returns the row and column of every location where r is less than or equal to the_f. unique with 'first' then gives, for each column, the index of its first entry in that list, which corresponds to the smallest matching row.
Credit: Andrei Bobrov over on MATLAB Answers
Another option (derived from this other answer) is a bit shorter but also a bit more obscure IMO:
mask = r <= the_f;
[x,~] = find(mask & (cumsum(mask,1)==1));
If I want performance, I would avoid arrayfun. Even this for loop is faster:
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
for i = 1:numel(r)
x(i) = find(r(i)<the_f,1,'first');
end
sim = (histcounts(x,[1:k Inf]) ./ n).';
end
Running 10000 sims with the first set of the sample data gives the following timing.
Your arrayfun function:
>Elapsed time is 2.848206 seconds.
The for loop function:
>Elapsed time is 0.938479 seconds.
Inspired by Cris Luengo's answer, I suggest below:
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
x = sum(r > the_f)+1;
sim = (histcounts(x,[1:k Inf]) ./ n)';
end
Time:
>Elapsed time is 0.264146 seconds.
You can use histcounts with r as its input:
r = rand(1,n);
sim = (histcounts(r,[-inf ;the_f]) ./ n).';
If histc is used instead of histcounts the whole simulation can be vectorized:
r = rand(n,sims);
p = histc(r, [-inf; the_f],1);
p = [p(1:end-2,:) ;sum(p(end-1:end,:))]./n;
h0 = max(abs(p-the_p(:))); %h0 = max(abs(bsxfun(@minus,p,the_p(:))));
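A quick way to convince yourself that binning r directly matches the original search is to run both on a fixed sample. The sketch below uses the_f from Test Case 1 and a hand-picked r (an assumption made here so that no draw coincides with an edge; the two approaches treat exact ties with an edge differently).

```matlab
the_f = [0.3010; 0.4771; 0.6021; 0.6990; 0.7782; 0.8451; 0.9031; 0.9542; 1.0000];
k = numel(the_f);
r = [0.05 0.31 0.50 0.50 0.95 0.999];   % fixed draws, none equal to an edge
n = numel(r);

% Reference: explicit search, as in the original function.
x = arrayfun(@(v) find(v <= the_f, 1, 'first'), r);
sim_ref = (histcounts(x, [1:k Inf]) ./ n).';

% Direct binning of r against the cumulated frequencies.
sim_direct = (histcounts(r, [-inf; the_f]) ./ n).';

same = isequal(sim_ref, sim_direct);
```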

Creating summation series functions in Matlab with variables for optimization

I have a dataset with 1125 rows and 64 columns, where the first 554 rows belong to one class and the remaining rows belong to the other class. The objective function
is to be minimized in terms of R_1 and R_2, both of which are row vectors (1 x 64). x_i and x_l are rows of the data matrix. I am trying to minimize this objective function using the Optimization Toolbox, but I am struggling to get the objective function into the desired form and am running into errors. This is what I have coded so far:
data = xlsread('data.xlsx');
dat1 = data(1:554,:);
dat2 = data(555:1125,:);
f1 = @(x) 0;
f2 = @(x) 0;
%% for digits labeled 0
for i = 1:554
f1 = @(x) f1 + (dat1(i,:) - x(1)).^2;
end
%% for digits labeled 1
for j = 1:571
f2 = @(x) f2 + (dat2(j,:) - x(2)).^2;
end
%% final objective function
f = @(x) 1/554*f1 + 1/571*f2;
%%
x = fminunc(f);
Please guide me on how to correctly form this type of objective function in Matlab
None of your code makes sense. A few issues:
f1 = @(x) 0; and f2 = @(x) 0; define anonymous functions which always return zero. What is the purpose of this?
Every further definition of f1, f2 and f attempts to do arithmetic on an anonymous function (a function handle) rather than on its return value. It's not clear what you expect this to accomplish.
x = fminunc(f); is missing an argument, it needs an initial guess as well. This isn't just to initialize the algorithm but also so that fminunc knows the dimensions that the input to f should have.
For your case f should be defined so half the values passed to it refer to R1 and the other half refer to R2. For example define
l2_sq = @(x) sum(x.^2,2); % return norm(x,2)^2 for each row of x
f1 = @(R1) sum(l2_sq(bsxfun(@minus, dat1, R1)));
f2 = @(R2) sum(l2_sq(bsxfun(@minus, dat2, R2)));
f3 = @(R1,R2) -10 * norm(R1-R2,1);
f = @(R) f1(R(1:64)) + f2(R(65:128)) + f3(R(1:64), R(65:128));
Since the combined R vector has 128 elements, we need to generate an initial guess that contains 128 elements. In this case we could just use random Gaussian noise
R0 = randn(1,128);
Finally, call fminunc as
Rhat = fminunc(f, R0);
R1 = Rhat(1:64);
R2 = Rhat(65:128);
where R1 and R2 are the optimal values.
Note: In MATLAB R2016b and newer, implicit expansion allows you to replace bsxfun(@minus, dat1, R1) with the more efficient dat1 - R1. Similarly for bsxfun(@minus, dat2, R2).
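As a sanity check that the two spellings give the same objective, here is a small sketch on synthetic data. The matrices below are random stand-ins with the right number of columns, not the real dataset, and f_bsx / f_imp are names introduced here; the check itself needs no toolbox.

```matlab
% Synthetic stand-ins for dat1 and dat2 (assumed shapes only).
rng(0);
dat1 = randn(5, 64);
dat2 = randn(7, 64);

l2_sq = @(x) sum(x.^2, 2);   % squared 2-norm of each row

% bsxfun spelling.
f_bsx = @(R) sum(l2_sq(bsxfun(@minus, dat1, R(1:64)))) ...
           + sum(l2_sq(bsxfun(@minus, dat2, R(65:128)))) ...
           - 10 * norm(R(1:64) - R(65:128), 1);

% Implicit-expansion spelling (R2016b+).
f_imp = @(R) sum(l2_sq(dat1 - R(1:64))) ...
           + sum(l2_sq(dat2 - R(65:128))) ...
           - 10 * norm(R(1:64) - R(65:128), 1);

R0 = randn(1, 128);
difference = abs(f_bsx(R0) - f_imp(R0));   % expected to be at rounding level
```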

Custom-made linspace and logspace in MATLAB

I decided to take a look at two functions, linspace and logspace. Below I give two examples, one using MATLAB's built-in linspace and one for logspace, along with their handmade implementations. In the first case both the built-in function linspace and the handmade code give the same results. However, this is not true when examining the logspace function. Could you please help me find the error in the handmade code?
a = 1; b = 5; n = 7;
y = linspace(1,5,7);
yy = zeros(1,n); yy(1) = a;
for i=2:n
yy(i) = yy(i-1) + (b-a)/(n-1);
end
x = logspace(1,5,7);
xx = zeros(1,n); xx(1) = 10^a;
for i=2:n
xx(i) = xx(i-1) + (10^b-10^a)/(n-1);
end
Thank you!
The only difference between linspace and logspace is that logspace goes one step further and raises 10 to the power of every element in the linspace array.
As such, you'd simply take the values produced by your linspace equation and raise 10 to each of them. However, your code relies on the previous result, which has already been exponentiated. Therefore, you need to take the base-10 logarithm to convert the previous result back to linear form, apply the same logic that was used to generate the linspace, then raise 10 to that value. The relationship is:
xx[i] = 10^(log10(xx[i-1]) + ((b-a)/(n-1)))
You can certainly simplify this, taking advantage of the fact that 10^(log10(z)) = z, as long as z > 0. We can also split up the terms in the power using the property that 10^(p + q) = (10^p) * (10^q). Therefore:
xx[i] = xx[i-1] * (10^((b-a)/(n-1)))
As such, simply multiply your previous result by 10^((b-a)/(n-1)):
a = 1; b = 5; n = 7;
x = logspace(1,5,7);
xx = zeros(1,n); xx(1) = 10^a;
for i=2:n
xx(i) = xx(i-1)*(10^((b-a)/(n-1))); %// Change
end
We get for both x and xx:
>> format long g;
>> x
x =
Columns 1 through 4
10 46.4158883361278 215.443469003188 1000
Columns 5 through 7
4641.58883361278 21544.3469003189 100000
>> xx
xx =
Columns 1 through 4
10 46.4158883361278 215.443469003188 1000
Columns 5 through 7
4641.58883361278 21544.3469003188 100000
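If the loop itself is not required, the same relationship can be written without recursion, which also avoids accumulating rounding error from repeated multiplication. A minimal sketch, reusing a, b and n from above (rel_err is a name introduced here):

```matlab
a = 1; b = 5; n = 7;

% Exponents form a linear grid from a to b; raise 10 to each of them.
xx = 10.^(a + (0:n-1)*(b-a)/(n-1));

x = logspace(a, b, n);
rel_err = max(abs(xx - x) ./ x);   % relative difference to the built-in
```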

Efficient way to apply arrayfun to a matrix (i.e. R^N to R^M)

I have a function that maps R^N to R^M. For simplicity, let it be the identity function @(z) z, where z may be a vector. I want to apply the function to a list of parameters of size K x N and have it map to a K x M output.
Here is my attempt:
function out_matrix = array_fun_matrix(f, vals)
for i=1:size(vals,1)
f_val = f(vals(i,:));
if(size(f_val,1) > 1) %Need to stack up rows, so convert as required.
f_val = f_val';
end
out_matrix(i,:) = f_val;
end
end
You can try it with
array_fun_matrix(@(z) z(1)^2 + z(2)^2 + z(3), [0 1 0; 1 1 1; 1 2 1; 2 2 2])
The question: Is there a better and more efficient way to do this with vectorization, etc.? Did I miss a built-in function?
Examples of non-vectorizable functions: There are many, usually involving elaborate sub-steps and numerical solutions. A trivial example is looking for the numerical solution to an equation, which in turn uses numerical quadrature, i.e. let params = [b c] and solve for the a such that int_0^a ((z + b)^2) dz = c
(I know here you could do some calculus, but the integral here is stripped down). Implementing this example,
find_equilibrium = @(param) fzero(@(a) integral(@(x) (x + param(1)).^2 - param(2), 0, a), 1)
array_fun_matrix(find_equilibrium, [0 1; 0 .8])
You can use the cellfun function, but you'll need to manipulate your data a bit:
function out_matrix = array_fun_matrix(f, vals)
% Convert your data to a cell array:
cellVals = mat2cell(vals, ones(1,size(vals,1)));
% apply the function:
out_cellArray = cellfun(f, cellVals, 'UniformOutput', false);
% Convert back to matrix:
out_matrix = cell2mat(out_cellArray);
end
If you don't like this implementation, you can improve the performance of yours by preallocating the out_matrix:
function out_matrix = array_fun_matrix(f, vals)
firstOutput = f(vals(1,:));
out_matrix = zeros(size(vals,1), length(firstOutput)); % preallocate for speed.
for i=1:size(vals,1)
f_val = f(vals(i,:));
if(size(f_val,1) > 1) %Need to stack up rows, so convert as required.
f_val = f_val';
end
out_matrix(i,:) = f_val;
end
end
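Another way to write the same row-by-row mapping, without converting the matrix to a cell array first, is to let arrayfun iterate over row indices. This is a sketch rather than a faster method (arrayfun is still a loop under the hood); the reshape forces each result into a row so that column-vector outputs also stack correctly.

```matlab
f = @(z) z(1)^2 + z(2)^2 + z(3);             % example function from the question
vals = [0 1 0; 1 1 1; 1 2 1; 2 2 2];

rows = (1:size(vals,1)).';                   % one index per parameter row
out_cells = arrayfun(@(i) reshape(f(vals(i,:)), 1, []), rows, ...
                     'UniformOutput', false);
out_matrix = cell2mat(out_cells);            % K x M result, here [1; 3; 6; 10]
```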

Get binomial coefficients

In an attempt to vectorize a particular piece of Matlab code, I could not find a straightforward function to generate a list of the binomial coefficients. The best I could find was nchoosek, but for some inexplicable reason this function only accepts integers (not vectors of integers). My current solution looks like this:
mybinom = @(n) arrayfun(@nchoosek, n*ones(1,n), 1:n)
This generates the set of binomial coefficients for a given value of n. However, since the binomial coefficients are always symmetric, I know that I am doing twice as much work as necessary. I'm sure that I could create a solution that exploits the symmetry, but I'm sure that it would be at the expense of readability.
Is there a more elegant solution than this, perhaps using a Matlab function that I am not aware of? Note that I am not interested in using the symbolic toolbox.
If you want to minimize operations you can go along these lines:
n = 6;
k = 1:n;
result = [1 cumprod((n-k+1)./k)]
>> result
result =
1 6 15 20 15 6 1
This requires very few operations per coefficient, because each coefficient is obtained from the previously computed one.
You can reduce the number of operations by approximately half if you take into account the symmetry:
m1 = floor(n/2);
m2 = ceil(n/2);
k = 1:m2;
result = [1 cumprod((n-k+1)./k)];
result(n+1:-1:m1+2) = result(1:m2);
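The symmetric version can be checked against nchoosek for a small odd n, where m1 and m2 differ. The rounding step below is an addition made here: the running product uses non-integer ratios, so the values may be off by rounding error before it.

```matlab
n = 7;
m1 = floor(n/2);
m2 = ceil(n/2);
k = 1:m2;

result = [1 cumprod((n-k+1)./k)];        % first half: C(n,0) .. C(n,m2)
result(n+1:-1:m1+2) = result(1:m2);      % mirror onto the second half
result = round(result);                  % remove floating-point rounding error

reference = arrayfun(@(j) nchoosek(n, j), 0:n);
```

For n = 7 both vectors should be 1 7 21 35 35 21 7 1.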
What about a modified version of Luis Mendo's solution - but in logarithms:
n = 1e4;
m1 = floor(n/2);
m2 = ceil(n/2);
k = 1:m2;
% Attempt to compute real value
out0 = [1 cumprod((n-k+1)./k)];
out0(n+1:-1:m1+2) = out0(1:m2);
% In logarithms
out1 = [0 cumsum((log(n-k+1)) - log(k))];
out1(n+1:-1:m1+2) = out1(1:m2);
plot(log(out0) - out1, 'o-')
The advantage of working with logarithms is that you can set n = 1e4; and still obtain a good approximation of the real value (nchoosek(1e4, 5e3) returns Inf and this is not a good approximation at all!).
EDIT following horchler's comment
You can use the gammaln function to obtain the same result but it's not faster. The two approximations seem to be quite different:
n = 1e7;
m1 = floor(n/2);
m2 = ceil(n/2);
k = 1:m2;
% In logarithms
tic
out1 = [0 cumsum((log(n-k+1)) - log(k))];
out1(n+1:-1:m1+2) = out1(1:m2);
toc
% Elapsed time is 0.912649 seconds.
tic
k = 0:m2;
out2 = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1);
out2(n+1:-1:m1+2) = out2(1:m2);
toc
% Elapsed time is 1.020188 seconds.
tmp = out2 - out1;
plot(tmp, '.')
prctile(tmp, [0 2.5 25 50 75 97.5 100])
% 1.0e-006 *
% -0.2217 -0.1462 -0.0373 0.0363 0.1225 0.2943 0.3846
Is adding three gammaln worse than adding n logarithms? Or vice versa?
This works for Octave only
You can use bincoeff function.
Example: bincoeff(5, 0:5)
EDIT :
Only improvement I can think of goes like this. Maybe you already thought this trivial solution and didn't like it.
# Calculate only the first half
mybinomhalf = @(n) arrayfun(@nchoosek, n*ones(1,n/2+1), 0:n/2)
# pad your array symmetrically
mybinom = @(n) padarray(mybinomhalf(n), [0 n/2], 'symmetric', 'post')
# I couldn't test it and this line may not work