Removing a random number of columns from a matrix - matlab

I need to remove a random number of columns from an arbitrarily large matrix. My attempt is below, but I'm certain that there is a better way.
function new = reduceMatrices(original, colsToTakeAway)
% Assumes colsToTakeAway holds exactly three column indices in ascending order
a = colsToTakeAway(1);
b = colsToTakeAway(2);
c = colsToTakeAway(3);
% Index columns explicitly; original(1:a-1) would index linearly
x = original(:, 1:a-1);
y = original(:, a+1:b-1);
z = original(:, b+1:c-1);
if c == size(original, 2)
new = [x, y, z];
else
new = [x, y, z, original(:, c+1:end)];
end
end

Here's one approach. First, generate a row vector of random numbers with numcols elements, where numcols is the number of columns in the original matrix:
rc = rand(1,numcols)
Next make a vector of 1s and 0s from this, for example
lv = rc>0.75
which will produce something like
0 1 1 0 1
and you can use Matlab's logical indexing feature to write
original(:,lv)
which returns only those columns of original that correspond to the 1s in lv. To remove those columns instead, index with the complement: original(:,~lv).
It's not entirely clear from your question how you want to make the vector of column selections, but this should give you some ideas.
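If the goal is to drop a random number of randomly chosen columns, the logical-indexing idea above can be combined with randi and randperm. This is only a sketch: the example matrix, the variable names, and the choice to always keep at least one column are assumptions, not something the question specifies.

```matlab
% Sketch: drop a random number of randomly chosen columns.
original = magic(6);                 % example data (assumption)
numcols  = size(original, 2);

k    = randi([0, numcols - 1]);      % how many columns to drop (assumed: keep at least one)
p    = randperm(numcols);            % random ordering of the column indices
drop = p(1:k);                       % the k columns to remove (empty when k == 0)

keep       = true(1, numcols);       % logical mask, one entry per column
keep(drop) = false;
reduced    = original(:, keep);      % logical indexing keeps only the masked columns
```

The mask is built explicitly so that the dropped columns stay available in drop for later inspection.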

function newM = reduceMatrices(original, colsToTakeAway)
% define the columns to keep := cols \ colsToTakeAway
colsToKeep = setdiff(1:size(original,2), colsToTakeAway);
newM = original(:, colsToKeep);
end
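For a quick check at the command line, the same result can also be obtained without setdiff by assigning the empty matrix to the unwanted columns. The sample matrix and column list below are made up for illustration.

```matlab
% Sketch: delete columns by assigning [] (example values are assumptions).
original       = reshape(1:20, 4, 5);   % 4-by-5 example matrix
colsToTakeAway = [2 5];                  % columns to remove

newM = original;                         % work on a copy so original is preserved
newM(:, colsToTakeAway) = [];            % removes columns 2 and 5
```

Unlike the hand-written attempt in the question, both this form and the setdiff version accept any number of column indices in any order.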

Related

Print integers in a Matrix coming in order without duplicates

I have several matrices <1x1000> containing integers such as:
matrix = [0,0,0,0,0,30,30,30,40,40,50,50,50,40,0,0,0,30,30,30]
I want to print (disp, and later plot) them like this: 30,40,50,40,30. Basically ignore the duplicates if they come after each other.
Another example:
matrix = [0,0,0,0,10,10,10,10,50,50,50,50,10,10,10,50,50] shall give: 10,50,10,50
Help is very much appreciated!
Use this:
[~,c]=find([NaN diff(matrix)]);
output=matrix(c);
output = output(output~=0)
and to plot the output, simply use: plot(output)
Result = 0;
% loop over all values in matrix (zeros are discarded at the end)
for Element = matrix
if Element == Result(end)
% skip if equal
continue
else
% add new value
Result(end+1) = Element;
end
end
% discard zero entries
Result = Result(Result ~= 0);
All solutions provided so far use either loops or the function find, both of which are inefficient.
Just use logical indexing:
[matrix((matrix(1:end-1)-matrix(2:end))~=0), matrix(end)]
ans =
0 30 40 50 40 0 30
By the way, in your example, are you discarding the 0s even if they come in repeated sequences?
Let's call the output matrix um. Then:
um(1) = matrix(1);
j = 1;
for i=2: length(matrix)
% Ignore repeating numbers
if (um(j) ~= matrix(i))
j = j + 1;
um(j) = matrix(i);
end
end
% Remove zeros
um = um(um~=0);
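The answers above all come down to the same two steps: keep one value per run of equal neighbours, then drop the zeros. A compact sketch of that, using the first example vector from the question (the names isStart and runs are mine):

```matlab
% Sketch: collapse consecutive duplicates, then discard zeros.
matrix = [0,0,0,0,0,30,30,30,40,40,50,50,50,40,0,0,0,30,30,30];

isStart = [true, diff(matrix) ~= 0];   % true at the first element of every run
runs    = matrix(isStart);             % 0 30 40 50 40 0 30
output  = runs(runs ~= 0);             % 30 40 50 40 30
```

Note that the zeros are removed after the runs are collapsed, so two equal values separated only by zeros both survive, as the final 30 does here.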

Vectorize MATLAB code

Let's say we have three m-by-n matrices of equal size: A, B, C.
Every column in C represents a time series.
A is the running maximum (over a fixed window length) of each time series in C.
B is the running minimum (over a fixed window length) of each time series in C.
Is there a way to determine T in a vectorized way?
[nrows, ncols] = size(A);
T = zeros(nrows, ncols);
for row = 2:nrows %loop over the rows (except row #1).
for col = 1:ncols %loop over the columns.
if C(row, col) > A(row-1, col)
T(row, col) = 1;
elseif C(row, col) < B(row-1, col)
T(row, col) = -1;
else
T(row, col) = T(row-1, col);
end
end
end
This is what I've come up with so far:
T = zeros(m, n);
T(C > circshift(A,1)) = 1;
T(C < circshift(B,1)) = -1;
Well, the trouble was the dependency in the ELSE part of the conditional statement. After a long mental workout, here's a way I came up with to vectorize the hell-outta everything.
This approach is based on mapping. We find the column-wise runs (islands) of 1s in the 2D mask for the ELSE part and give every element of a run the same tag. Then, for each run, we go to the position just before its start (start-1) along the column and store the output value found there. Finally, indexing into those stored start-1 values with the tags, which act as mapping indices, gives all the elements to be set in the output.
Here's the implementation:
%// Store sizes
[m1,n1] = size(A);
%// Masks corresponding to three conditions
mask1 = C(2:m1,:) > A(1:m1-1,:);
mask2 = C(2:m1,:) < B(1:m1-1,:);
mask3 = ~(mask1 | mask2);
%// All but mask3 set values as output
out = [zeros(1,n1) ; mask1 + (-1*(~mask1 & mask2))];
%// Proceed if any element in mask3 is set
if any(mask3(:))
%// Row vectors for appending onto matrices for matching up sizes
mask_appd = false(1,n1);
row_appd = zeros(1,n1);
%// Get 2D mapped indices
df = diff([mask_appd ; mask3],[],1)==1;
cdf = cumsum(df,1);
offset = cumsum([0 max(cdf(:,1:end-1),[],1)]);
map_idx = bsxfun(@plus,cdf,offset);
map_idx(map_idx==0) = 1;
%// Extract the values to be used for setting into new places
A1 = out([df ; false(1,n1)]);
%// Map with the indices obtained earlier and set at places from mask3
newval = [row_appd ; A1(map_idx)];
mask3_appd = [mask_appd ; mask3];
out(mask3_appd) = newval(mask3_appd);
end
Doing this vectorized is rather difficult because the current row's output depends on the previous row's output. Vectorized operations usually require that each element can be computed on its own, using some relationship that is independent of the elements around it.
I don't have any input on how you would achieve this without a for loop, but I can help you reduce your loops from two to one. You can do the assignment vectorized per row, but I can't see how you'd do it all in one shot.
As such, try something like this instead:
[nrows, ncols] = size(A);
T = zeros(nrows, ncols);
for row = 2:nrows
out = T(row-1,:); %// Change - Make a copy of the previous row
out(C(row,:) > A(row-1,:)) = 1; %// Set those elements of C
%// in the current row that are larger
%// than the previous row of A to 1
out(C(row,:) < B(row-1,:)) = -1; %// Same logic but for B now and it's
%// less than and the value is -1 instead
T(row,:) = out; %// Assign to the output
end
I'm currently figuring out how to do this without any loops whatsoever. I'll keep you posted.
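Another way to look at the ELSE branch is as a forward fill: each row either sets a new value (1 or -1) or carries the most recent one down the column. The sketch below is not taken from either answer; it swaps in a cummax-based forward fill, and the random A, B and C are stand-in data chosen only so that all three branches occur.

```matlab
% Sketch: loop-free version using a forward fill (example data is an assumption).
nrows = 8; ncols = 3;
C = rand(nrows, ncols);
A = 0.5 + 0.3*rand(nrows, ncols);     % stand-in for the running maximum
B = 0.2 + 0.3*rand(nrows, ncols);     % stand-in for the running minimum

% Signal per element: 1, -1, or 0 meaning "carry the previous value"
up   = C(2:end,:) > A(1:end-1,:);
down = ~up & (C(2:end,:) < B(1:end-1,:));
S    = [zeros(1, ncols); up - down];

% For every element, find the row of the most recent nonzero signal.
% Rows with no signal point at row 1, whose value is 0 by construction.
rowIdx         = repmat((1:nrows).', 1, ncols);
rowIdx(S == 0) = 1;
lastRow        = cummax(rowIdx, 1);

% Convert (lastRow, column) to linear indices and read the carried values
colOffset = (0:ncols-1) * nrows;
T = S(bsxfun(@plus, lastRow, colOffset));
```

Because row indices increase down each column, the running maximum of rowIdx is exactly the index of the last row that set a value.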

Generate random 2D matrix with unique rows in octave/matlab

I want to generate a 2D matrix (1000x3) with random values in the range of 1 to 10 in Octave. Using randi(10,1000,3) will generate a matrix with repeated rows, but I want every row to be unique (unrepeated). Is there any way I can do that?
You can do that easily by building the Cartesian product to enumerate all possible rows and then shuffling them, as follows. To build the Cartesian product, you will need my custom cartprod.m function.
C = cartprod(1:10,1:10,1:10);
The following line then shuffles the cartesian product C.
S = C(randperm( size(C,1) ),:);
Notes:
Every row in S is unique, and you can verify that size( unique( S, 'rows' ), 1 ) == 1000.
I should note that this code works on Matlab 2015a. I haven't tested it in Octave, which is what OP seems to be using. I've been told the syntax is pretty much identical though.
You can generate all possible three-item sequences drawn from 1 through 10, with replacement, using the following function:
function result = nchoosek_replacement(n, k)
%// Edge cases: just return an empty matrix
%// (k >= n is allowed, since values are drawn with replacement)
if k < 1 || n < 1
result = [];
return
end
reps = n^(k-1);
result = zeros(n^k, k);
cur_col = repmat(1:n, reps, 1);
result(:,1) = cur_col(:);
%// Base case: when k is 1, just return the
%// fully populated matrix 'result'
if k == 1
return
end
%// Recursively generate a matrix that will
%// be used to populate columns 2:end
next = nchoosek_replacement(n, k-1);
%// Repeatedly use the matrix above to
%// populate the matrix 'result'
for i = 1:n
cur_range = (i-1)*reps+1:i*reps;
result(cur_range, 2:end) = next;
end
end
With this function defined, you can now generate all possible sequences. In this case there are exactly 1000 so they could simply be shuffled with randperm. A more general approach is to sample from them with randsample, which would also allow for smaller matrices if desired:
max_value = 10;
row_size = 3;
num_rows = 1000;
possible = nchoosek_replacement(max_value, row_size);
indices = randsample(size(possible, 1), num_rows);
data = possible(indices, :);
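If enumerating every possible row is undesirable, one alternative is to draw unique integer codes and decode each code into a row. This is a sketch under the assumption that rows have exactly three columns, as in the question; it uses only randperm and ind2sub, so it does not need randsample.

```matlab
% Sketch: unique random rows by sampling codes without replacement.
max_value = 10;
row_size  = 3;      % the ind2sub call below is written for exactly 3 columns
num_rows  = 1000;   % must not exceed max_value^row_size

% Each code in 1..max_value^row_size corresponds to exactly one distinct row
codes = randperm(max_value^row_size, num_rows);

% Decode every code into three subscripts, each in 1..max_value
[c1, c2, c3] = ind2sub(max_value * ones(1, row_size), codes);
data = [c1(:), c2(:), c3(:)];
```

Since the codes are distinct and the decoding is one-to-one, the rows of data are guaranteed to be distinct.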

Using elements of a vector to set elements of a matrix

I have a vector whose elements identify the indices (per column) that I need to set in a different matrix. Specifically, I have:
A = 7
1
2
and I need to create a matrix B with some number of rows of zeros, except for the elements identified by A. In other words, I want B:
B = zeros(10, 3); % number of rows is known; num columns = size(A)
B(A(1), 1) = 1
B(A(2), 2) = 1
B(A(3), 3) = 1
I would like to do this without having to write a loop.
Any pointers would be appreciated.
Thanks.
Use linear indexing:
B = zeros(10, 3);
B(A(:).'+ (0:numel(A)-1)*size(B,1)) = 1;
The second line can be written equivalently with sub2ind (may be a little slower):
B(sub2ind(size(B), A(:).', 1:numel(A))) = 1;
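To see why the linear-indexing line works, it helps to look at the indices it produces for the A from the question. MATLAB stores matrices column by column, so element (r, c) of a 10-row matrix sits at linear position r + (c-1)*10. The variable name idx is mine.

```matlab
% Sketch: the linear indices behind the one-liner, using A from the question.
A = [7; 1; 2];
B = zeros(10, 3);

% Row A(c) in column c lives at A(c) + (c-1)*size(B,1)
idx = A(:).' + (0:numel(A)-1) * size(B, 1);   % [7 11 22]
B(idx) = 1;
```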

Apply function to all rows

I have a function, ranker, that takes a vector and assigns numerical ranks to it in ascending order. For example,
ranker([5 1 3 600]) = [3 1 2 4] or
ranker([42 300 42 42 1 42]) = [3.5 6 3.5 3.5 1 3.5] .
I am using a matrix, variable_data, and I want to apply the ranker function to every row of variable_data. This is my current solution, but I feel there is a way to vectorize it that is just as fast :p
variable_ranks = nan(size(variable_data));
for i=1:1:numel(nmac_ids)
variable_ranks(i,:) = ranker(abs(variable_data(i,:)));
end
If you place the matrix rows into a cell array, you can then apply a function to each cell.
Consider this simple example of applying the SORT function to each row
a = rand(10,3);
b = cell2mat( cellfun(@sort, num2cell(a,2), 'UniformOutput',false) );
%# same as: b = sort(a,2);
You can even do this:
b = cell2mat( arrayfun(@(i) sort(a(i,:)), 1:size(a,1), 'UniformOutput',false)' );
Then again, your version with the for loop is probably faster.
With collaboration from Amro and Jonas
variable_ranks = tiedrank(variable_data')';
Ranker has been replaced by the MATLAB function tiedrank from the Statistics Toolbox (sorry for those who don't have it). From its documentation:
[R,TIEADJ] = tiedrank(X) computes the ranks of the values in the vector X. If any X values are tied, tiedrank computes their average rank. The return value TIEADJ is an adjustment for ties required by the nonparametric tests signrank and ranksum, and for the computation of Spearman's rank correlation.
TIEDRANK computes along columns in MATLAB 7.9.0 (R2009b), although this is undocumented. Transposing the input matrix turns rows into columns, which are then ranked; the second transpose restores the layout of the input. In essence, it's a very classy hack :p
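For readers without the Statistics Toolbox, average ranks can be computed per row with sort, unique and accumarray. The following is a minimal sketch, not the toolbox implementation; the sample matrix reuses the tied example from the question plus one tie-free row of my own, and it omits the abs() call from the question's loop.

```matlab
% Sketch: row-wise average ranks without tiedrank (sample data is an assumption).
variable_data  = [42 300 42 42 1 42;
                   5   1  3 600 2  4];
variable_ranks = zeros(size(variable_data));

for i = 1:size(variable_data, 1)
    x = variable_data(i, :);
    [s, order]  = sort(x);            % sorted values and their original positions
    [~, ~, grp] = unique(s);          % group label for each run of equal values
    % Average of the sorted positions 1..n within each group of ties
    avgRank = accumarray(grp(:), (1:numel(x)).', [], @mean);
    % Put each element's average rank back at its original position
    variable_ranks(i, order) = avgRank(grp(:)).';
end
```

This still loops over rows, so it is closer in spirit to the original loop than to the transpose trick, but it has no toolbox dependency.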
One way would be to rewrite ranker to take array input
sizeData = size(variable_data);
[sortedData,almostRanks] = sort(abs(variable_data),2);
[rowIdx,colIdx] = ndgrid(1:sizeData(1),1:sizeData(2));
linIdx = sub2ind(sizeData,rowIdx,almostRanks);
variable_ranks = variable_data;
variable_ranks(linIdx) = colIdx;
%# break ties by finding subsequent equal entries in sorted data
[rr,cc] = find(diff(sortedData,1,2) == 0);
ii = sub2ind(sizeData,rr,cc);
ii2 = sub2ind(sizeData,rr,cc+1);
ii = sub2ind(sizeData,rr,almostRanks(ii));
ii2 = sub2ind(sizeData,rr,almostRanks(ii2));
variable_ranks(ii) = variable_ranks(ii2);
EDIT
Instead, you can just use TIEDRANK from TMW (thanks, @Amro):
variable_ranks = tiedrank(variable_data')';
I wrote a function that does this; it's on the File Exchange as tiedrank_(X,dim). It looks like this:
%[Step 0a]: force dim to be 1, and compress everything else into a single
%dimension. We will reverse this process at the end.
if dim > 1
otherDims = 1:length(size(X));
otherDims(dim) = [];
perm = [dim otherDims];
X = permute(X,perm);
end
originalSiz = size(X);
X = reshape(X,originalSiz(1),[]);
siz = size(X);
%[Step 1]: sort and get sorting indices
[X,Ind] = sort(X,1);
%[Step 2]: create matrix [D], which has +1 at the start of consecutive runs
% and -1 at the end, with zeros elsewhere.
D = zeros(siz,'int8');
D(2:end-1,:) = diff(X(1:end-1,:) == X(2:end,:));
D(1,:) = X(1,:) == X(2,:);
D(end,:) = -( X(end,:) == X(end-1,:) );
clear X
%[Step 3]: calculate the averaged rank for each consecutive run
[a,~] = find(D);
a = reshape(a,2,[]);
h = sum(a,1)/2;
%[Step 4]: insert the troublesome ranks in the relevant places
L = zeros(siz);
L(D==1) = h;
L(D==-1) = -h;
L = cumsum(L);
L(D==-1) = h; %cumsum set these ranks to zero, but we wanted them to be h
clear D h
%[Step 5]: insert the simple ranks (i.e. the ones that didn't clash)
[L(~L),~] = find(~L);
%[Step 6]: assign the ranks to the relevant position in the matrix
Ind = bsxfun(@plus,Ind,(0:siz(2)-1)*siz(1)); %equivalent to using sub2ind + repmat
r(Ind) = L;
%[Step 0b]: As promised, we reinstate the correct dimensional shape and order
r = reshape(r,originalSiz);
if dim > 1
r = ipermute(r,perm);
end
I hope that helps someone.