vectorize lookup values in table of interval limits - matlab

Here is a question about whether we can use vectorization type of operation in matlab to avoid writing for loop.
I have a vector
Q = [0.1,0.3,0.6,1.0]
I generate a uniformly distributed random vector over [0,1)
X = [0.11,0.72,0.32,0.94]
I want to know whether each entry of X is between [0,0.1) or [0.1,0.3) or [0.3,0.6), or [0.6,1.0) and I want to return a vector which contains the index of the maximum element in Q that each entry of X is less than.
I could write a for loop
Y = zeros(length(X),1)
for i = 1:1:length(X)
Y(i) = find(X(i)<Q, 1);
end
Expected result for this example:
Y = [2,4,3,4]
But I wonder if there is a way to avoid writing for loop? (I see many very good answers to my question. Thank you so much! Now if we go one step further, what if my Q is a matrix, such that I want check whether )
Y = zeros(length(X),1)
for i = 1:1:length(X)
Y(i) = find(X(i)<Q(i), 1);
end

Use the second output of max, which acts as a sort of "vectorized find":
[~, Y] = max(bsxfun(#lt, X(:).', Q(:)), [], 1);
How this works:
For each element of X, test if it is less than each element of Q. This is done with bsxfun(#lt, X(:).', Q(:)). Note each column in the result corresponds to an element of X, and each row to an element of Q.
Then, for each element of X, get the index of the first element of Q for which that comparison is true. This is done with [~, Y] = max(..., [], 1). Note that the second output of max returns the index of the first maximizer (along the specified dimension), so in this case it gives the index of the first true in each column.
For your example values,
Q = [0.1, 0.3, 0.6, 1.0];
X = [0.11, 0.72, 0.32, 0.94];
[~, Y] = max(bsxfun(#lt, X(:).', Q(:)), [], 1);
gives
Y =
2 4 3 4

Using bsxfun will help accomplish this. You'll need to read about it. I also added a Q = 0 at the beginning to handle the small X case
X = [0.11,0.72,0.32,0.94 0.01];
Q = [0.1,0.3,0.6,1.0];
Q_extra = [0 Q];
Diff = bsxfun(#minus,X(:)',Q_extra (:)); %vectorized subtraction
logical_matrix = diff(Diff < 0); %find the transition from neg to positive
[X_categories,~] = find(logical_matrix == true); % get indices
% output is 2 4 3 4 1
EDIT: How long does each method take?
I got curious about the difference between each solution:
Test Code Below:
Q = [0,0.1,0.3,0.6,1.0];
X = rand(1,1e3);
tic
Y = zeros(length(X),1);
for i = 1:1:length(X)
Y(i) = find(X(i)<Q, 1);
end
toc
tic
result = arrayfun(#(x)find(x < Q, 1), X);
toc
tic
Q = [0 Q];
Diff = bsxfun(#minus,X(:)',Q(:)); %vectorized subtraction
logical_matrix = diff(Diff < 0); %find the transition from neg to positive
[X_categories,~] = find(logical_matrix == true); % get indices
toc
Run it for yourself, I found that when the size of X was 1e6, bsxfun was much faster, while for smaller arrays the differences were varying and negligible.
SAMPLE: when size X was 1e3
Elapsed time is 0.001582 seconds. % for loop
Elapsed time is 0.007324 seconds. % anonymous function
Elapsed time is 0.000785 seconds. % bsxfun

Octave has a function lookup to do exactly that. It takes a lookup table of sorted values and an array, and returns an array with indices for values in the lookup table.
octave> Q = [0.1 0.3 0.6 1.0];
octave> x = [0.11 0.72 0.32 0.94];
octave> lookup (Q, X)
ans =
1 3 2 3
The only issue is that your lookup table has an implicit zero which be fixed easily with:
octave> lookup ([0 Q], X) # alternatively, just add 1 at the results
ans =
2 4 3 4

You can create an anonymous function to perform the comparison, then apply it to each member of X using arrayfun:
compareFunc = #(x)find(x < Q, 1);
result = arrayfun(compareFunc, X, 'UniformOutput', 1);
The Q array will be stored in the anonymous function ( compareFunc ) when the anonymous function is created.
Or, as one line (Uniform Output is the default behavior of arrayfun):
result = arrayfun(#(x)find(x < Q, 1), X);

Octave does a neat auto-vectorization trick for you if the vectors you have are along different dimensions. If you make Q a column vector, you can do this:
X = [0.11, 0.72, 0.32, 0.94];
Q = [0.1; 0.3; 0.6; 1.0; 2.0; 3.0];
X <= Q
The result is a 6x4 matrix indicating which elements of Q each element of X is less than. I made Q a different length than X just to illustrate this:
0 0 0 0
1 0 0 0
1 0 1 0
1 1 1 1
1 1 1 1
1 1 1 1
Going back to the original example you have, you can do
length(Q) - sum(X <= Q) + 1
to get
2 4 3 4
Notice that I have semicolons instead of commas in the definition of Q. If you want to make it a column vector after defining it, do something like this instead:
length(Q) - sum(X <= Q') + 1
The reason that this works is that Octave implicitly applies bsxfun to an operation on a row and column vector. MATLAB will not do this until R2016b according to #excaza's comment, so in MATLAB you can do this:
length(Q) - sum(bsxfun(#le, X, Q)) + 1
You can play around with this example in IDEOne here.

Inspired by the solution posted by #Mad Physicist, here is my solution.
Q = [0.1,0.3,0.6,1.0]
X = [0.11,0.72,0.32,0.94]
Temp = repmat(X',1,4)<repmat(Q,4,1)
[~, ind]= max( Temp~=0, [], 2 );
The idea is that make the X and Q into the "same shape", then use element wise comparison, then we obtain a logical matrix whose row tells whether a given element in X is less than each of the element in Q, then return the first non-zero index of each row of this logical matrix. I haven't tested how fast this method is comparing to other methods

Related

How to programmatically compute this summation

I want to compute the above summation, for a given 'x'. The summation is to be carried out over a block of lengths specified by an array , for example block_length = [5 4 3]. The summation is carried as follows: from -5 to 5 across one dimension, -4 to 4 in the second dimension and -3 to 3 in the last dimension.
The pseudo code will be something like this:
sum = 0;
for i = -5:5
for j = -4:4
for k = -3:3
vec = [i j k];
tv = vec * vec';
sum = sum + 1/(1+tv)*cos(2*pi*x*vec'));
end
end
end
The problem is that I want to find the sum when the number of dimensions are not known ahead of time, using some kind of variable nested loops hopefully. Matlab uses combvec, but it returns all possible combinations of vectors, which is not required as we only compute the sum. When there are many dimensions, combvec returning all combinations is not feasible memory wise.
Appreciate any ideas towards solutions.
PS: I want to do this at high number of dimensions, for example 650, as in machine learning.
Based on https://www.mathworks.com/matlabcentral/answers/345551-function-with-varying-number-of-for-loops I came up with the following code (I haven't tested it for very large number of indices!):
function sum = fun(x, block_length)
sum = 0;
n = numel(block_length); % Number of loops
vec = -ones(1, n) .* block_length; % Index vector
ready = false;
while ~ready
tv = vec * vec';
sum = sum + 1/(1+tv)*cos(2*pi*x*vec');
% Update the index vector:
ready = true; % Assume that the WHILE loop is ready
for k = 1:n
vec(k) = vec(k) + 1;
if vec(k) <= block_length(k)
ready = false;
break; % v(k) increased successfully, leave "for k" loop
end
vec(k) = -1 * block_length(k); % v(k) reached the limit, reset it
end
end
end
where x and block_length should be both 1-x-n vectors.
The idea is that, instead of using explicitly nested loops, we use a vector of indices.
How good/efficient is this when tackling the suggested use case where block_length can have 650 elements? Not much! Here's a "quick" test using merely 16 dimensions and a [-1, 1] range for the indices:
N = 16; tic; in = 0.1 * ones(1, N); sum = fun(in, ones(size(in))), toc;
which yields an elapsed time of 12.7 seconds on my laptop.

Implementing Simplex Method infinite loop

I am trying to implement a simplex algorithm following the rules I was given at my optimization course. The problem is
min c'*x s.t.
Ax = b
x >= 0
All vectors are assumes to be columns, ' denotes the transpose. The algorithm should also return the solution to dual LP. The rules to follow are:
Here, A_J denotes columns from A with indices in J and x_J, x_K denotes elements of vector x with indices in J or K respectively. Vector a_s is column s of matrix A.
Now I do not understand how this algorithm takes care of condition x >= 0, but I decided to give it a try and follow it step by step. I used Matlab for this and got the following code.
X = zeros(n, 1);
Y = zeros(m, 1);
% i. Choose starting basis J and K = {1,2,...,n} \ J
J = [4 5 6] % for our problem
K = setdiff(1:n, J)
% this while is for goto
while 1
% ii. Solve system A_J*\bar{x}_J = b.
xbar = A(:,J) \ b
% iii. Calculate value of criterion function with respect to current x_J.
fval = c(J)' * xbar
% iv. Calculate dual solution y from A_J^T*y = c_J.
y = A(:,J)' \ c(J)
% v. Calculate \bar{c}^T = c_K^T - u^T A_K. If \bar{c}^T >= 0, we have
% found the optimal solution. If not, select the smallest s \in K, such
% that c_s < 0. Variable x_s enters basis.
cbar = c(K)' - c(J)' * inv(A(:,J)) * A(:,K)
cbar = cbar'
tmp = findnegative(cbar)
if tmp == -1 % we have found the optimal solution since cbar >= 0
X(J) = xbar;
Y = y;
FVAL = fval;
return
end
s = findnegative(c, K) %x_s enters basis
% vi. Solve system A_J*\bar{a} = a_s. If \bar{a} <= 0, then the problem is
% unbounded.
abar = A(:,J) \ A(:,s)
if findpositive(abar) == -1 % we failed to find positive number
disp('The problem is unbounded.')
return;
end
% vii. Calculate v = \bar{x}_J / \bar{a} and find the smallest rho \in J,
% such that v_rho > 0. Variable x_rho exits basis.
v = xbar ./ abar
rho = J(findpositive(v))
% viii. Update J and K and goto ii.
J = setdiff(J, rho)
J = union(J, s)
K = setdiff(K, s)
K = union(K, rho)
end
Functions findpositive(x) and findnegative(x, S) return the first index of positive or negative value in x. S is the set of indices, over which we look at. If S is omitted, whole vector is checked. Semicolons are omitted for debugging purposes.
The problem I tested this code on is
c = [-3 -1 -3 zeros(1,3)];
A = [2 1 1; 1 2 3; 2 2 1];
A = [A eye(3)];
b = [2; 5; 6];
The reason for zeros(1,3) and eye(3) is that the problem is inequalities and we need slack variables. I have set starting basis to [4 5 6] because the notes say that starting basis should be set to slack variables.
Now, what happens during execution is that on first run of while, variable with index 1 enters basis (in Matlab, indices go from 1 on) and 4 exits it and that is reasonable. On the second run, 2 enters the basis (since it is the smallest index such that c(idx) < 0 and 1 leaves it. But now on the next iteration, 1 enters basis again and I understand why it enters, because it is the smallest index, such that c(idx) < 0. But here the looping starts. I assume that should not have happened, but following the rules I cannot see how to prevent this.
I guess that there has to be something wrong with my interpretation of the notes but I just cannot see where I am wrong. I also remember that when we solved LP on the paper, we were updating our subjective function on each go, since when a variable entered basis, we removed it from the subjective function and expressed that variable in subj. function with the expression from one of the equalities, but I assume that is different algorithm.
Any remarks or help will be highly appreciated.
The problem has been solved. Turned out that the point 7 in the notes was wrong. Instead, point 7 should be

Matlab calculate the product of an expression

I'm basicaly trying to find the product of an expression that goes like this:
(x-(N-1)/2).....(x+(N-1)/2) for even value of N
x is a value that I will set at the beginning that changes too but that is a different problem...
let's say for the sake of argument that for now x is a constant (ex x=1)
example for N=6
(x-5/2)(x-3/2)(x-1/2)(x+1/2)(x+3/2)*(x+5/2)
the idea was to create a row vector every element of which is each individual term (P(1)=x-5/2) (P(2)=x-3/2)...etc and then calculate its product
N=6;
x=1;
P=ones(1,N);
for k=(-N-1)/2:(N-1)/2
for n=1:N
P(n)=(x-k);
end
end
y=prod(P);
instead this creates a vector that takes only the first value of the epxression and then
repeats the same value at each cell.
there is obviously a fundamental problem with my loop but I just can't see it.
So if anyone can help with that OR suggest a better way to calculate the product I would be grateful.
Use vectorized commands
Why use a loop when you can use vectorized commands like prod?
y = prod(2 * x + [-N + 1 : 2 : N - 1]) / 2;
For convenience, you may want to define an anonymous function for it:
f = #(N,x) reshape(prod(bsxfun(#plus, 2 * x(:), -N + 1 : 2 : N - 1) / 2, 2), size(x));
Note that the function is compatible with a (row or column) vector input x.
Tests in MATLAB's Command Window
>> f(6, [2,2]')
ans =
-14.7656
4.9219
-3.5156
4.9219
-14.7656
>> f(6, [2,2])
ans =
-14.7656 4.9219 -3.5156 4.9219 -14.7656
Benchmark
Here is a comparison of rayreng's approach versus mine. The former emerges as the clear winner... :'( ...at least as N increases.
Varying N, fixed x
Fixed N (= 10), vector x of varying length
Fixed N (= 100), vector x of varying length
Benchmark code
function benchmark
% varying N, fixed x
clear all
n = logspace(2,4,20)';
x = rand(1000,1);
tr = zeros(size(n));
tj = tr;
for k = 1 : numel(n)
% rayreng's approach (poly/polyval)
fr = #() rayreng(n(k), x);
tr(k) = timeit(fr);
% Jubobs's approach (prod/reshape/bsxfun)
fj = #() jubobs(n(k), x);
tj(k) = timeit(fj);
end
figure
hold on
plot(n, tr, 'bo')
plot(n, tj, 'ro')
hold off
xlabel('N')
ylabel('time (s)')
legend('rayreng', 'jubobs')
end
function y = jubobs(N,x)
y = reshape(prod(bsxfun(#plus,...
2 * x(:),...
-N + 1 : 2 : N - 1) / 2,...
2),...
size(x));
end
function y = rayreng(N, x)
p = poly(linspace(-(N-1)/2, (N-1)/2, N));
y = polyval(p, x);
end
function benchmark2
% fixed N, varying x
clear all
n = 100;
nx = round(logspace(2,4,20));
tr = zeros(size(n));
tj = tr;
for k = 1 : numel(nx)
disp(k)
x = rand(nx(k), 1);
% rayreng's approach (poly/polyval)
fr = #() rayreng(n, x);
tr(k) = timeit(fr);
% Jubobs's approach (prod/reshape/bsxfun)
fj = #() jubobs(n, x);
tj(k) = timeit(fj);
end
figure
hold on
plot(nx, tr, 'bo')
plot(nx, tj, 'ro')
hold off
xlabel('number of elements in vector x')
ylabel('time (s)')
legend('rayreng', 'jubobs')
title(['n = ' num2str(n)])
end
function y = jubobs(N,x)
y = reshape(prod(bsxfun(#plus,...
2 * x(:),...
-N + 1 : 2 : N - 1) / 2,...
2),...
size(x));
end
function y = rayreng(N, x)
p = poly(linspace(-(N-1)/2, (N-1)/2, N));
y = polyval(p, x);
end
An alternative
Alternatively, because the terms in your product form an arithmetic progression (each term is greater than the previous one by 1/2), you can use the formula for the product of an arithmetic progression.
I agree with #Jubobs in that you should avoid using for loops for this kind of computation. There are cases where for loops perform fast, but for something as simple as this, avoid using loops if possible.
An alternative approach to what Jubobs has suggested is that you can consider that polynomial equation to be in factored form where each factor denotes a root located at that particular location. You can use poly to convert these factors into a polynomial equation, then use polyval to evaluate the expression at the point you want. First, generate your roots by linspace where the points vary from -(N-1)/2 to (N-1)/2 and there are N of them, then plug this into poly. Finally, for any values of x, put this into polyval with the output of poly. The advantage of this approach is that you can evaluate multiple points of x in a single sweep.
Going with what you have, you would simply do this:
p = poly(linspace(-(N-1)/2, (N-1)/2, N));
out = polyval(p, x);
With your example, supposing that N = 6, this would be the output of the first line:
p =
1.0000 0 -8.7500 0 16.1875 0 -3.5156
As such, this is saying that when we expand out (x-5/2)(x-3/2)(x-1/2)(x+1/2)(x+3/2)(x+5/2), we get:
x^6 - 8.75x^4 + 16.1875x^2 - 3.5156
If we take a look at the roots of this equation, this is what we get:
r = roots(p)
r =
-2.5000
2.5000
-1.5000
1.5000
-0.5000
0.5000
As you can see, each term corresponds to one factor in your polynomial equation, so we do have the right mindset here. Now, all you have to do is use p with your values of x into polyval to obtain your results. For example, if I wanted to evaluate that polynomial from -2 <= x <= 2 where x is an integer, this is the result I get:
polyval(p, -2:2)
ans =
-14.7656 4.9219 -3.5156 4.9219 -14.7656
Therefore, when x = -2, the result is -14.7656 and so on.
Though I would recommend the solution by #Jubobs, it is also good to check what the issue is with your loop.
The first indication that something is wrong, is that you have a nested loop over 2 variables, and only index with one of them to store the result. Probably you just need a single loop.
Here is a loop that you may be interested in that should do roughly what you need:
N=6;
x=1;
k=(-N-1)/2:(N-1)/2
P = ones(size(k));
for n=1:numel(k)
P(n)=(x-k(n));
end
y=prod(P);
I tried to keep the code close to the original, so hopefully it is easy to understand.

Replacing zeros (or NANs) in a matrix with the previous element row-wise or column-wise in a fully vectorized way

I need to replace the zeros (or NaNs) in a matrix with the previous element row-wise, so basically I need this Matrix X
[0,1,2,2,1,0;
5,6,3,0,0,2;
0,0,1,1,0,1]
To become like this:
[0,1,2,2,1,1;
5,6,3,3,3,2;
0,0,1,1,1,1],
please note that if the first row element is zero it will stay like that.
I know that this has been solved for a single row or column vector in a vectorized way and this is one of the nicest way of doing that:
id = find(X);
X(id(2:end)) = diff(X(id));
Y = cumsum(X)
The problem is that the indexing of a matrix in Matlab/Octave is consecutive and increments columnwise so it works for a single row or column but the same exact concept cannot be applied but needs to be modified with multiple rows 'cause each of raw/column starts fresh and must be regarded as independent. I've tried my best and googled the whole google but coukldn’t find a way out. If I apply that same very idea in a loop it gets too slow cause my matrices contain 3000 rows at least. Can anyone help me out of this please?
Special case when zeros are isolated in each row
You can do it using the two-output version of find to locate the zeros and NaN's in all columns except the first, and then using linear indexing to fill those entries with their row-wise preceding values:
[ii jj] = find( (X(:,2:end)==0) | isnan(X(:,2:end)) );
X(ii+jj*size(X,1)) = X(ii+(jj-1)*size(X,1));
General case (consecutive zeros are allowed on each row)
X(isnan(X)) = 0; %// handle NaN's and zeros in a unified way
aux = repmat(2.^(1:size(X,2)), size(X,1), 1) .* ...
[ones(size(X,1),1) logical(X(:,2:end))]; %// positive powers of 2 or 0
col = floor(log2(cumsum(aux,2))); %// col index
ind = bsxfun(#plus, (col-1)*size(X,1), (1:size(X,1)).'); %'// linear index
Y = X(ind);
The trick is to make use of the matrix aux, which contains 0 if the corresponding entry of X is 0 and its column number is greater than 1; or else contains 2 raised to the column number. Thus, applying cumsum row-wise to this matrix, taking log2 and rounding down (matrix col) gives the column index of the rightmost nonzero entry up to the current entry, for each row (so this is a kind of row-wise "cummulative max" function.) It only remains to convert from column number to linear index (with bsxfun; could also be done with sub2ind) and use that to index X.
This is valid for moderate sizes of X only. For large sizes, the powers of 2 used by the code quickly approach realmax and incorrect indices result.
Example:
X =
0 1 2 2 1 0 0
5 6 3 0 0 2 3
1 1 1 1 0 1 1
gives
>> Y
Y =
0 1 2 2 1 1 1
5 6 3 3 3 2 3
1 1 1 1 1 1 1
You can generalize your own solution as follows:
Y = X.'; %'// Make a transposed copy of X
Y(isnan(Y)) = 0;
idx = find([ones(1, size(X, 1)); Y(2:end, :)]);
Y(idx(2:end)) = diff(Y(idx));
Y = reshape(cumsum(Y(:)), [], size(X, 1)).'; %'// Reshape back into a matrix
This works by treating the input data as a long vector, applying the original solution and then reshaping the result back into a matrix. The first column is always treated as non-zero so that the values don't propagate throughout rows. Also note that the original matrix is transposed so that it is converted to a vector in row-major order.
Modified version of Eitan's answer to avoid propagating values across rows:
Y = X'; %'
tf = Y > 0;
tf(1,:) = true;
idx = find(tf);
Y(idx(2:end)) = diff(Y(idx));
Y = reshape(cumsum(Y(:)),fliplr(size(X)))';
x=[0,1,2,2,1,0;
5,6,3,0,1,2;
1,1,1,1,0,1];
%Do it column by column is easier
x=x';
rm=0;
while 1
%fields to replace
l=(x==0);
%do nothing for the first row/column
l(1,:)=0;
rm2=sum(sum(l));
if rm2==rm
%nothing to do
break;
else
rm=rm2;
end
%replace zeros
x(l) = x(find(l)-1);
end
x=x';
I have a function I use for a similar problem for filling NaNs. This can probably be cutdown or sped up further - it's extracted from pre-existing code that has a bunch more functionality (forward/backward filling, maximum distance etc).
X = [
0 1 2 2 1 0
5 6 3 0 0 2
1 1 1 1 0 1
0 0 4 5 3 9
];
X(X == 0) = NaN;
Y = nanfill(X,2);
Y(isnan(Y)) = 0
function y = nanfill(x,dim)
if nargin < 2, dim = 1; end
if dim == 2, y = nanfill(x',1)'; return; end
i = find(~isnan(x(:)));
j = 1:size(x,1):numel(x);
j = j(ones(size(x,1),1),:);
ix = max(rep([1; i],diff([1; i; numel(x) + 1])),j(:));
y = reshape(x(ix),size(x));
function y = rep(x,times)
i = find(times);
if length(i) < length(times), x = x(i); times = times(i); end
i = cumsum([1; times(:)]);
j = zeros(i(end)-1,1);
j(i(1:end-1)) = 1;
y = x(cumsum(j));

average number of different values in a column

I had a question in Matlab. It is so, I try to take average of the different number of values ​​in a column. For example, if we have the column below,
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5]
first I want to start by taking the average of 5 values ​​and plot them. In the case above, I should receive three averages that I could plot. Then take 10 values ​​at a time and so on.
I wonder if you have to write custom code to fix it.
The fastest way is probably to rearrange your initial vector X into some matrix, with each column storing the required values to average:
A = reshape(X, N, []);
where N is the desired number of rows in the new matrix, and the empty brackets ([]) tell MATLAB to calculate the number of columns automatically. Then you can average each column using mean:
X_avg = mean(A);
Vector X_avg stores the result. This can be done in one line like so:
X_avg = mean(reshape(X, N, []));
Note that the number of elements in X has to be divisible by N, otherwise you'll have to either pad it first (e.g with zeroes), or handle the "leftover" tail elements separately:
tail = mod(numel(X), N);
X_avg = mean(reshape(X(1:numel(X) - tail), N, [])); %// Compute average values
X_avg(end + 1) = mean(X(end - tail + 1:end)); %// Handle leftover elements
Later on you can put this code in a loop, computing and plotting the average values for a different value of N in each iteration.
Example #1
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5];
N = 5;
tail = mod(numel(X), N);
X_avg = mean(reshape(X(1:numel(X) - tail), N, []))
X_avg(end + 1) = mean(X(end - tail + 1:end))
The result is:
X_avg =
2.2000 3.4000 6.0000
Example #2
Here's another example (this time the length of X is not divisible by N):
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5];
N = 10;
tail = mod(numel(X), N);
X_avg = mean(reshape(X(1:numel(X) - tail), N, []))
X_avg(end + 1) = mean(X(end - tail + 1:end))
The result is:
X_avg =
2.8000 6.0000
This should do the trick:
For a selected N (the number of values you want to take the average of):
N = 5;
mean_vals = arrayfun(#(n) mean(X(n-1+(1:N))),1:N:length(X))
Note: This does not check if Index exceeds matrix dimensions.
If you want to skip the last numbers, this should work:
mean_vals = arrayfun(#(n) mean(X(n-1+(1:N))),1:N:(length(X)-mod(length(X),N)));
To add the remaining values:
if mod(length(X),N) ~= 0
mean_vals(end+1) = mean(X(numel(X)+1-mod(length(X),N):end))
end
UPDATE: This is a modification of Eitan's first answer (before it was edited). It uses nanmean(), which takes the mean of all values that are not NaN. So, instead of filling the remaining rows with zeros, fill them with NaN, and just take the mean.
X = [X(:); NaN(mod(N - numel(X), N), 1)];
X_avg = nanmean(reshape(X, N, []));
It would be helpful if you posted some code and point out exactly what is not working.
As a first pointer. If
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5]
the three means in blocks of 5 you are interested in are
mean(X(1:5))
mean(X(6:10))
mean(X(11:15))
You will have to come up with a for loop or maybe some other way to iterate through the indices.
I think you want something like this (I didn't use Matlab in a while, I hope the syntax is right):
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5],
currentAmount=5,
block=0,
while(numel(X)<=currentAmount)
while(numel(X)<=currentAmount+block*currentAmount)
mean(X(block*currentAmount+1:block*currentAmount+currentAmount));
block =block+1;
end;
currentAmount = currentAmount+5;
block=0;
end
This code will first loop through all elements calculating means of 5 elements at a time. Then, it will expand to 10 elements. Then to 15, and so on, until the number of elements from which you want to make the mean is bigger than the number of elements in the column.
If you are looking to average K random samples in your N-dimensional vector, then you could use:
N = length(X);
K = 20; % or 10, or 30, or any integer less than or equal to N
indices = randperm(N, K); % gives you K random indices from the range 1:N
result = mean(X(indices)); % averages the values of X at the K random
% indices from above
A slightly more compact form would be:
K = 20;
result = mean(X(randperm(length(X), K)));
If you are just looking to take every K consecutive samples from the list and average them then I am sure one of the previous answers will give you what you want.
If you need to do this operation a lot, it might be worth writing your own function for it. I would recommend using #EitanT's basic idea: pad the data, reshape, take mean of each column. However, rather than including the zero-padded numbers at the end, I recommend taking the average of the "straggling" data points separately:
function m = meanOfN(x, N)
% function m = meanOfN(x, N)
% create groups of N elements of vector x
% and return their mean
% if numel(x) is not a multiple of N, the last value returned
% will be for fewer than N elements
Nf = N * floor( numel( x ) / N ); % largest multiple of N <= length of x
xr = reshape( x( 1:Nf ), N, []);
m = mean(xr);
if Nf < N
m = [m mean( x( Nf + 1:end ) )];
end
This function will return exactly what you were asking for: in the case of a 15 element vector with N=5, it returns 3 values. When the size of the input vector is not a multiple of N, the last value returned will be the "mean of what is left".
Often when you need to take the mean of a set of numbers, it is the "running average" that is of interest. So rather than getting [mean(x(1:5)) mean(x(6:10)) mean(11:15))], you might want
m(1) = mean(x(1:N));
m(2) = mean(x(2:N+1));
m(3) = mean(x(3:N+2));
...etc
That could be achieved using a simple convolution of your data with a vector of ones; for completeness, here is a possible way of coding that:
function m = meansOfN(x, n)
% function m = meansOfN(x, n)
% taking the running mean of the values in x
% over n samples. Returns a row vector of size (sizeof(x) - n + 1)
% if numel(x) < n, this returns an empty matrix
mv = ones(N,1) / N; % vector of ones, normalized
m = convn(x(:), mv, 'valid'); % perform 1D convolution
With these two functions in your path (save them in a file called meanOfN.m and meansOfN.m respectively), you can do anything you want. In any program you will be able to write
myMeans = meanOfN(1:30, 5);
myMeans2 = meansOfN(1:30, 6);
etc. Matlab will find the function, perform the calculation, return the result. Writing your custom functions for specific operations like this can be very helpful - not only does it keep your code clean, but you only have to test the function once...