Vectorize MATLAB code - matlab

Let's say we have three m-by-n matrices of equal size: A, B, C.
Every column in C represents a time series.
A is the running maximum (over a fixed window length) of each time series in C.
B is the running minimum (over a fixed window length) of each time series in C.
Is there a way to determine T in a vectorized way?
[nrows, ncols] = size(A);
T = zeros(nrows, ncols);
for row = 2:nrows %loop over the rows (except row #1).
for col = 1:ncols %loop over the columns.
if C(row, col) > A(row-1, col)
T(row, col) = 1;
elseif C(row, col) < B(row-1, col)
T(row, col) = -1;
else
T(row, col) = T(row-1, col);
end
end
end
This is what I've come up with so far:
T = zeros(m, n);
T(C > circshift(A,1)) = 1;
T(C < circshift(B,1)) = -1;

Well, the trouble was the dependency with the ELSE part of the conditional statement. So, after a long mental work-out, here's a way I summed up to vectorize the hell-outta everything.
Now, this approach is based on mapping. We get column-wise runs or islands of 1s corresponding to the 2D mask for the ELSE part and assign them the same tags. Then, we go to the start-1 along each column of each such run and store that value. Finally, indexing into each such start-1 with those tagged numbers, which would work as mapping indices would give us all the elements that are to be set in the new output.
Here's the implementation to fulfill all those aspirations -
%// Store sizes
[m1,n1] = size(A);
%// Masks corresponding to three conditions
mask1 = C(2:nrows,:) > A(1:nrows-1,:);
mask2 = C(2:nrows,:) < B(1:nrows-1,:);
mask3 = ~(mask1 | mask2);
%// All but mask3 set values as output
out = [zeros(1,n1) ; mask1 + (-1*(~mask1 & mask2))];
%// Proceed if any element in mask3 is set
if any(mask3(:))
%// Row vectors for appending onto matrices for matching up sizes
mask_appd = false(1,n1);
row_appd = zeros(1,n1);
%// Get 2D mapped indices
df = diff([mask_appd ; mask3],[],1)==1;
cdf = cumsum(df,1);
offset = cumsum([0 max(cdf(:,1:end-1),[],1)]);
map_idx = bsxfun(#plus,cdf,offset);
map_idx(map_idx==0) = 1;
%// Extract the values to be used for setting into new places
A1 = out([df ; false(1,n1)]);
%// Map with the indices obtained earlier and set at places from mask3
newval = [row_appd ; A1(map_idx)];
mask3_appd = [mask_appd ; mask3];
out(mask3_appd) = newval(mask3_appd);
end

Doing this vectorized is rather difficult because the current row's output depends on the previous row's output. Doing vectorized operations usually means that each element should stand out on its own using some relationship that is independent of the other elements that surround it.
I don't have any input on how you would achieve this without a for loop but I can help you reduce your operations down to one instead of two. You can do the assignment vectorized per row, but I can't see how you'd do it all in one shot.
As such, try something like this instead:
[nrows, ncols] = size(A);
T = zeros(nrows, ncols);
for row = 2:nrows
out = T(row-1,:); %// Change - Make a copy of the previous row
out(C(row,:) > A(row-1,:)) = 1; %// Set those elements of C
%// in the current row that are larger
%// than the previous row of A to 1
out(C(row,:) < B(row-1,:)) = -1; %// Same logic but for B now and it's
%// less than and the value is -1 instead
T(row,:) = out; %// Assign to the output
end
I'm currently figuring out how to do this with any loops whatsoever. I'll keep you posted.

Related

Generate random 2D matrix with unique rows in octave/matlab

I want to generate a 2D matrix(1000x3) with random values in the range of 1 to 10 in octave. Using randi(10,1000,3) will generate a matrix with repeated row values. But I want to generate unique(unrepeated) rows. Is there any way that, I can do that?
You can do that easily by getting the cartesian product to create all possibilities and shuffle the array as follows. To create the cartesian product, you will need my custom cartprod.m function that generates a cartesian product.
C = cartprod(1:10,1:10,1:10);
The following line then shuffles the cartesian product C.
S = C(randperm( size(C,1) ),:);
Notes:
Every row in S is unique and you can verify that size( unique( S ) ) == 1000.
I should note that this code works on Matlab 2015a. I haven't tested it in Octave, which is what OP seems to be using. I've been told the syntax is pretty much identical though.
You can generate all possible three-item sequences drawn from 1 through 10, with replacement, using the following function:
function result = nchoosek_replacement(n, k)
%// Edge cases: just return an empty matrix
if k < 1 || n < 1 || k >= n
result = [];
return
end
reps = n^(k-1);
result = zeros(n^k, k);
cur_col = repmat(1:n, reps, 1);
result(:,1) = cur_col(:);
%// Base case: when k is 1, just return the
%// fully populated matrix 'result'
if k == 1
return
end
%// Recursively generate a matrix that will
%// be used to populate columns 2:end
next = nchoosek_replacement(n, k-1);
%// Repeatedly use the matrix above to
%// populate the matrix 'result'
for i = 1:n
cur_range = (i-1)*reps+1:i*reps;
result(cur_range, 2:end) = next;
end
end
With this function defined, you can now generate all possible sequences. In this case there are exactly 1000 so they could simply be shuffled with randperm. A more general approach is to sample from them with randsample, which would also allow for smaller matrices if desired:
max_value = 10;
row_size = 3;
num_rows = 1000;
possible = nchoosek_replacement(max_value, row_size);
indices = randsample(size(possible, 1), num_rows);
data = possible(indices, :);

Find the product of all entries of vector x

Here is what I am trying to do:
Let x be a vector with n entries x1,x2,...xn. Write a mat-lab program which computes the vector p with entries defined by
pk = X1*X2....Xk-1*Xk+1...Xn.
for each k =1,2,...n.
pk is the product of all the entries of x except xk. (use prod command of compute the product of all the entries, then divide by xk). Take the appropriate special action if either one of more the entries of x is zero. Using vectors throughout and no 'for' loop.
I spent too much time to figure out this problem. I still could not get it. Please help!
Brute force:
n = numel(x);
X = repmat(x(:),1,n); %// put vector in column form and repeat
X(1:n+1:end) = 1; %// make diagonal 1
result = prod(X); %// product of each column
Saving computations:
ind = find(x==0);
if numel(ind)>1 %// result is all zeros
result = zeros(size(x));
elseif numel(ind)==1 %// result is all zeros except at one entry
result = zeros(size(x));
result(ind) = prod(nonzeros(x));
else %// compute product of all elements and divide by each element
result = prod(x)./x;
end

General method to find submatrix in matlab matrix

I am looking for a 'good' way to find a matrix (pattern) in a larger matrix (arbitrary number of dimensions).
Example:
total = rand(3,4,5);
sub = total(2:3,1:3,3:4);
Now I want this to happen:
loc = matrixFind(total, sub)
In this case loc should become [2 1 3].
For now I am just interested in finding one single point (if it exists) and am not worried about rounding issues. It can be assumed that sub 'fits' in total.
Here is how I could do it for 3 dimensions, however it just feels like there is a better way:
total = rand(3,4,5);
sub = total(2:3,1:3,3:4);
loc = [];
for x = 1:size(total,1)-size(sub,1)+1
for y = 1:size(total,2)-size(sub,2)+1
for z = 1:size(total,3)-size(sub,3)+1
block = total(x:x+size(sub,1)-1,y:y+size(sub,2)-1,z:z+size(sub,3)-1);
if isequal(sub,block)
loc = [x y z]
end
end
end
end
I hope to find a workable solution for an arbitrary number of dimensions.
Here is low-performance, but (supposedly) arbitrary dimensional function. It uses find to create a list of (linear) indices of potential matching positions in total and then just checks if the appropriately sized subblock of total matches sub.
function loc = matrixFind(total, sub)
%matrixFind find position of array in another array
% initialize result
loc = [];
% pre-check: do all elements of sub exist in total?
elements_in_both = intersect(sub(:), total(:));
if numel(elements_in_both) < numel(unique(sub))
% if not, return nothing
return
end
% select a pivot element
% Improvement: use least common element in total for less iterations
pivot_element = sub(1);
% determine linear index of all occurences of pivot_elemnent in total
starting_positions = find(total == pivot_element);
% prepare cell arrays for variable length subscript vectors
[subscripts, subscript_ranges] = deal(cell([1, ndims(total)]));
for k = 1:length(starting_positions)
% fill subscript vector for starting position
[subscripts{:}] = ind2sub(size(total), starting_positions(k));
% add offsets according to size of sub per dimension
for m = 1:length(subscripts)
subscript_ranges{m} = subscripts{m}:subscripts{m} + size(sub, m) - 1;
end
% is subblock of total equal to sub
if isequal(total(subscript_ranges{:}), sub)
loc = [loc; cell2mat(subscripts)]; %#ok<AGROW>
end
end
end
This is based on doing all possible shifts of the original matrix total and comparing the upper-leftmost-etc sub-matrix of the shifted total with the sought pattern subs. Shifts are generated using strings, and are applied using circshift.
Most of the work is done vectorized. Only one level of loops is used.
The function finds all matchings, not just the first. For example:
>> total = ones(3,4,5,6);
>> sub = ones(3,3,5,6);
>> matrixFind(total, sub)
ans =
1 1 1 1
1 2 1 1
Here is the function:
function sol = matrixFind(total, sub)
nd = ndims(total);
sizt = size(total).';
max_sizt = max(sizt);
sizs = [ size(sub) ones(1,nd-ndims(sub)) ].'; % in case there are
% trailing singletons
if any(sizs>sizt)
error('Incorrect dimensions')
end
allowed_shift = (sizt-sizs);
max_allowed_shift = max(allowed_shift);
if max_allowed_shift>0
shifts = dec2base(0:(max_allowed_shift+1)^nd-1,max_allowed_shift+1).'-'0';
filter = all(bsxfun(#le,shifts,allowed_shift));
shifts = shifts(:,filter); % possible shifts of matrix "total", along
% all dimensions
else
shifts = zeros(nd,1);
end
for dim = 1:nd
d{dim} = 1:sizt(dim); % vectors with subindices per dimension
end
g = cell(1,nd);
[g{:}] = ndgrid(d{:}); % grid of subindices per dimension
gc = cat(nd+1,g{:}); % concatenated grid
accept = repmat(permute(sizs,[2:nd+1 1]), [sizt; 1]); % acceptable values
% of subindices in order to compare with matrix "sub"
ind_filter = find(all(gc<=accept,nd+1));
sol = [];
for shift = shifts
total_shifted = circshift(total,-shift);
if all(total_shifted(ind_filter)==sub(:))
sol = [ sol; shift.'+1 ];
end
end
For an arbitrary number of dimensions, you might try convn.
C = convn(total,reshape(sub(end:-1:1),size(sub)),'valid'); % flip dimensions of sub to be correlation
[~,indmax] = max(C(:));
% thanks to Eitan T for the next line
cc = cell(1,ndims(total)); [cc{:}] = ind2sub(size(C),indmax); subs = [cc{:}]
Thanks to Eitan T for the suggestion to use comma-separated lists for a generalized ind2sub.
Finally, you should test the result with isequal because this is not a normalized cross correlation, meaning that larger numbers in a local subregion will inflate the correlation value potentially giving false positives. If your total matrix is very inhomogeneous with regions of large values, you might need to search other maxima in C.

Removing a random number of columns from a matrix

I need to take away a random number of columns from an arbitrarily large matrix, I've put my attempt below, but I'm certain that there is a better way.
function new = reduceMatrices(original, colsToTakeAway)
a = colsToTakeAway(1);
b = colsToTakeAway(2);
c = colsToTakeAway(3);
x = original(1:a-1);
y = original(a+1:b-1);
z = original(b+1:c-1);
if c == size(original, 2);
new = [x,y,z];
elseif (c+1) == size(original, 2);
new = [x,y,z,c+1]
else
new = [x,y,z,c+1:size(original, 2)];
end
Here's one approach. First, generate a row vector of random numbers with numcols elements, where numcols is the number of columns in the original matrix:
rc = rand(1,numcols)
Next make a vector of 1s and 0s from this, for example
lv = rc>0.75
which will produce something like
0 1 1 0 1
and you can use Matlab's logical indexing feature to write
original(:,lv)
which will return only those columns of original which correspond to the 1s in lv.
It's not entirely clear from your question how you want to make the vector of column selections, but this should give you some ideas.
function newM = reduceMatrices(original, colsToTakeAway)
% define the columns to keep := cols \ colsToTakeAway
colsToKeep = setdiff(1:size(original,2), colsToTakeAway);
newM = original(:, colsToKeep);
end

Apply function to all rows

I have a function, ranker, that takes a vector and assigns numerical ranks to it in ascending order. For example,
ranker([5 1 3 600]) = [3 1 2 4] or
ranker([42 300 42 42 1 42] = [3.5 6 3.5 3.5 1 3.5] .
I am using a matrix, variable_data and I want to apply the ranker function to each row for all rows in variable data. This is my current solution, but I feel there is a way to vectorize it and have it as equally fast :p
variable_ranks = nan(size(variable_data));
for i=1:1:numel(nmac_ids)
variable_ranks(i,:) = ranker(abs(variable_data(i,:)));
end
If you place the matrix rows into a cell array, you can then apply a function to each cell.
Consider this simple example of applying the SORT function to each row
a = rand(10,3);
b = cell2mat( cellfun(#sort, num2cell(a,2), 'UniformOutput',false) );
%# same as: b = sort(a,2);
You can even do this:
b = cell2mat( arrayfun(#(i) sort(a(i,:)), 1:size(a,1), 'UniformOutput',false)' );
Again, you version with the for loop is probably faster..
With collaboration from Amro and Jonas
variable_ranks = tiedrank(variable_data')';
Ranker has been replaced by the Matlab function in the Stat toolbox (sorry for those who don't have it),
[R,TIEADJ] = tiedrank(X) computes the
ranks of the values in the vector X.
If any X values are tied, tiedrank
computes their average rank. The
return value TIEADJ is an adjustment
for ties required by the nonparametric
tests signrank and ranksum, and for
the computation of Spearman's rank
correlation.
TIEDRANK will compute along columns in Matlab 7.9.0 (R2009b), however it is undocumented. So by transposing the input matrix, rows turn into columns and will rank them. The second transpose is then used to organize the data in the same manner as the input. There in essence is a very classy hack :p
One way would be to rewrite ranker to take array input
sizeData = size(variable_data);
[sortedData,almostRanks] = sort(abs(variable_data),2);
[rowIdx,colIdx] = ndgrid(1:sizeData(1),1:sizeData(2));
linIdx = sub2ind(sizeData,rowIdx,almostRanks);
variable_ranks = variable_data;
variable_ranks(linIdx) = colIdx;
%# break ties by finding subsequent equal entries in sorted data
[rr,cc] = find(diff(sortedData,1,2) == 0);
ii = sub2ind(sizeData,rr,cc);
ii2 = sub2ind(sizeData,rr,cc+1);
ii = sub2ind(sizeData,rr,almostRanks(ii));
ii2 = sub2ind(sizeData,rr,almostRanks(ii2));
variable_ranks(ii) = variable_ranks(ii2);
EDIT
Instead, you can just use TIEDRANK from TMW (thanks, #Amro):
variable_rank = tiedrank(variable_data')';
I wrote a function that does this, it's on the FileExchange tiedrank_(X,dim). And it looks like this...
%[Step 0a]: force dim to be 1, and compress everything else into a single
%dimension. We will reverse this process at the end.
if dim > 1
otherDims = 1:length(size(X));
otherDims(dim) = [];
perm = [dim otherDims];
X = permute(X,perm);
end
originalSiz = size(X);
X = reshape(X,originalSiz(1),[]);
siz = size(X);
%[Step 1]: sort and get sorting indicies
[X,Ind] = sort(X,1);
%[Step 2]: create matrix [D], which has +1 at the start of consecutive runs
% and -1 at the end, with zeros elsewhere.
D = zeros(siz,'int8');
D(2:end-1,:) = diff(X(1:end-1,:) == X(2:end,:));
D(1,:) = X(1,:) == X(2,:);
D(end,:) = -( X(end,:) == X(end-1,:) );
clear X
%[Step 3]: calculate the averaged rank for each consecutive run
[a,~] = find(D);
a = reshape(a,2,[]);
h = sum(a,1)/2;
%[Step 4]: insert the troublseome ranks in the relevant places
L = zeros(siz);
L(D==1) = h;
L(D==-1) = -h;
L = cumsum(L);
L(D==-1) = h; %cumsum set these ranks to zero, but we wanted them to be h
clear D h
%[Step 5]: insert the simple ranks (i.e. the ones that didn't clash)
[L(~L),~] = find(~L);
%[Step 6]: assign the ranks to the relevant position in the matrix
Ind = bsxfun(#plus,Ind,(0:siz(2)-1)*siz(1)); %equivalent to using sub2ind + repmat
r(Ind) = L;
%[Step 0b]: As promissed, we reinstate the correct dimensional shape and order
r = reshape(r,originalSiz);
if dim > 1
r = ipermute(r,perm);
end
I hope that helps someone.