How to create doubles from cell arrays? - matlab

I would like to ask if there is a more efficient code to do the following task:
a = cell(10,1);
for i = 1 : 10
a{i,1} = randn(200,5);
end
for j =1:5
b{j} = [a{1,1}(:,j) a{2,1}(:,j) a{3,1}(:,j) a{4,1}(:,j) a{5,1}(:,j)];
end
Thank you!

Your solution works just fine. This is slightly more compact (and easier to generalize). If all cells contain matrices of the same size, you can merge them into one matrix and then pick every n-th column:
for i = 1 : 10
a{i,1} = randn(200,5);
end
% Concatenate the first five cells into one big matrix
c = cat(2, a{1:5});
n = size(a{1}, 2);
b = cell(n,1);
for j = 1:n
% Take every n-th column, starting at column j
b{j} = c(:, j:n:end);
end
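A loop-free variant is also possible. The following is only a sketch, assuming all cells hold matrices of the same size: stack the cells along a third dimension, swap the "column" and "cell" dimensions, and split the result back into cells.

```matlab
% Build the same kind of test data as in the question
a = cell(10,1);
for i = 1:10
    a{i} = randn(200,5);
end

% Stack the first five cells along dimension 3: rows x columns x cell index
c = cat(3, a{1:5});
% Swap dimensions 2 and 3: rows x cell index x columns
c = permute(c, [1 3 2]);
% Split along dimension 3 so that b{j} collects column j of every cell
b = reshape(num2cell(c, [1 2]), [], 1);
```

Here b{j} should equal [a{1}(:,j) a{2}(:,j) a{3}(:,j) a{4}(:,j) a{5}(:,j)], as in the original loop.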

Related

Fastest way of finding the only index of vector b where array A(i,j) == b

I have 2 big arrays A and b:
A: 10.000++ rows, 4 columns, not unique integers
b: vector with 500.000++ elements, unique integers
Due to the uniqueness of the values of b, I need to find the only index of b, where A(i,j) == b.
What I started with is
[rows,columns] = size(A);
B = zeros(rows,columns);
for i = 1 : rows
for j = 1 : columns
B(i,j) = find(A(i,j)==b,1);
end
end
This takes approx. 5.5 seconds to compute, which is way too long, since A and b can be significantly bigger. With that in mind, I tried to speed up the code by using logical indexing and reducing the number of for-loops:
[rows,columns] = size(A);
B = zeros(rows,columns);
for idx = 1 : numel(b)
B(A==b(idx)) = idx;
end
Sadly this takes even longer: 21 seconds
I even tried using bsxfun:
for i = 1 : columns
[I,J] = find(bsxfun(@eq,A(:,i),b))
... stitch B together ...
end
but with bigger arrays the maximum array size is quickly exceeded (102.9 GB...).
Can you help me find a faster solution to this? Thanks in advance!
EDIT: I extended the call to find(A(i,j)==b,1), which speeds up the algorithm by a factor of 2! Thank you, but overall it is still too slow... ;)
The function ismember is the right tool for this:
[~,B] = ismember(A,b);
Test code:
function so
A = rand(1000,4);
b = unique([A(:);rand(2000,1)]);
B1 = op1(A,b);
B2 = op2(A,b);
isequal(B1,B2)
tic;op1(A,b);op1(A,b);op1(A,b);op1(A,b);toc
tic;op2(A,b);op2(A,b);op2(A,b);op2(A,b);toc
end
function B = op1(A,b)
B = zeros(size(A));
for i = 1:numel(A)
B(i) = find(A(i)==b,1);
end
end
function B = op2(A,b)
[~,B] = ismember(A,b);
end
I ran this on Octave, which is not as fast with loops as MATLAB. It also doesn't have the timeit function, hence the crappy timing using tic/toc (sorry for that). In Octave, op2 is more than 100 times faster than op1. Timings will be different in MATLAB, but ismember should still be the fastest option. (Note that I also replaced your double loop with a single loop; it does the same thing but is simpler and probably faster.)
If you want to repeatedly do the search in b, it is worthwhile to sort b first, and implement your own binary search. This will avoid the checks and sorting that ismember does. See this other question.
Assuming that you have positive integers you can use array indexing:
mm = max(max(A(:)),max(b(:)));
idxs = sparse(b,1,1:numel(b),mm,1);
result = full(idxs(A));
If the range of values is small you can use dense matrix instead of sparse matrix:
mm = max(max(A(:)),max(b(:)));
idx = zeros(mm,1);
idx(b)=1:numel(b);
result = idx(A);
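As a quick sanity check, here is a small sketch with made-up data (the values of A and b below are assumptions, chosen so that every element of A occurs in b and all values are positive integers). It compares the ismember result with the dense lookup table:

```matlab
% Hypothetical small inputs: b has unique positive integers, A draws from b
A = [7 3; 3 9; 5 7];
b = [9; 5; 3; 7; 11];

% Approach 1: ismember returns, for each element of A, its index in b
[~, B1] = ismember(A, b);

% Approach 2: dense lookup table indexed directly by the values
lut = zeros(max(b), 1);
lut(b) = 1:numel(b);
B2 = lut(A);

% Both approaches should agree
sameResult = isequal(B1, B2);
```

Note that ismember returns 0 for values of A that are missing from b, whereas the lookup table only behaves the same way when every value of A is a valid index into it.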

Shifting repeating rows to a new column in a matrix

I am working with a n x 1 matrix, A, that has repeating values inside it:
A = [0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4]
which correspond to an n x 1 matrix of B values:
B = [2;4;6;8;10; 3;5;7;9;11; 4;6;8;10;12; 5;7;9;11;13]
I am attempting to produce a generalised code to place each repetition into a separate column and store it into Aa and Bb, e.g.:
Aa = [0 0 0 0
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4]
Bb = [2 3 4 5
4 5 6 7
6 7 8 9
8 9 10 11
10 11 12 13]
Essentially, each repetition from A and B needs to be copied into the next column and then deleted from the first column
So far I have managed to identify how many repetitions there are and copy the entire column over to the next column and then the next for the amount of repetitions there are but my method doesn't shift the matrix rows to columns as such.
clc;clf;close all
A = [0;1;2;3;4;0;1;2;3;4;0;1;2;3;4;0;1;2;3;4];
B = [2;4;6;8;10;3;5;7;9;11;4;6;8;10;12;5;7;9;11;13];
desiredCol = 1; %next column to go to
destinationCol = 0; %column to start on
n = length(A);
for i = 2:1:n-1
if A == 0;
A = [ A(:, 1:destinationCol)...
A(:, desiredCol+1:destinationCol)...
A(:, desiredCol)...
A(:, destinationCol+1:end) ];
end
end
A = [...] retrieved from Move a set of N-rows to another column in MATLAB
Any hints would be much appreciated. If you need further explanation, let me know!
Thanks!
Given our discussion in the comments, all you need is to use reshape which converts a matrix of known dimensions into an output matrix with specified dimensions provided that the number of elements match. You wish to transform a vector which has a set amount of repeating patterns into a matrix where each column has one of these repeating instances. reshape creates a matrix in column-major order where values are sampled column-wise and the matrix is populated this way. This is perfect for your situation.
Assuming that you already know how many "repeats" you're expecting, we call this An, you simply need to reshape your vector so that it has T = n / An rows where n is the length of the vector. Something like this will work.
n = numel(A); T = n / An;
Aa = reshape(A, T, []);
Bb = reshape(B, T, []);
The third parameter is a pair of empty brackets ([]), which tells MATLAB to infer how many columns there will be given that there are T rows. Technically, this would simply be An columns, but it's nice to show you how flexible MATLAB can be.
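Applied to the vectors from the question, and assuming An = 4 repeats are known in advance, this looks as follows:

```matlab
A = [0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4];
B = [2;4;6;8;10; 3;5;7;9;11; 4;6;8;10;12; 5;7;9;11;13];

An = 4;                   % assumed known: number of repetitions
T  = numel(A) / An;       % rows per repetition (5 here)

% reshape fills column by column, so each repetition becomes one column
Aa = reshape(A, T, []);
Bb = reshape(B, T, []);
```

Each column of Aa is then 0..4, and the columns of Bb are the four consecutive blocks of B.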
If you already know the repeated subvector and the number of times it repeats, then it is relatively straightforward:
First make your new A matrix with the repmat function.
Then remap your B vector to the same size as your new A matrix.
% Given that you already have the repeated subvector Asub, and the number
% of times it repeats; An:
Asub = [0;1;2;3;4];
An = 4;
lengthAsub = length(Asub);
Anew = repmat(Asub, [1,An]);
% If you can assume that the number of elements in B is equal to the number
% of elements in A:
numberColumns = size(Anew, 2);
newB = zeros(size(Anew));
for i = 1:numberColumns
indexStart = (i-1) * lengthAsub + 1;
indexEnd = i * lengthAsub; % last index of the i-th repetition
newB(:,i) = B(indexStart:indexEnd);
end
If you don't know what is in your original A vector but do know that it is repetitive, and you can assume that the pattern itself contains no repeated values, you can use the find function to locate where the first element next appears:
lengthAsub = find(A(2:end) == A(1), 1);
Asub = A(1:lengthAsub);
An = length(A) / lengthAsub
Hopefully this fits in with your data: the only reason it would not is if your subvector within A is a pattern which does not have unique numbers, such as:
A = [0;1;2;3;2;1;0; 0;1;2;3;2;1;0; 0;1;2;3;2;1;0; 0;1;2;3;2;1;0;]
It is worth noting that, intuitively, you might expect to need lengthAsub = find(A(2:end) == A(1), 1) - 1; but the subtraction is not necessary, because searching in A(2:end) already shifts the index by one.
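Because the inferred length can be wrong for patterns with repeated values, it may be worth validating it before reshaping. This is only a sketch; the variable names T and ok are made up for illustration:

```matlab
A = [0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4];

% Infer the pattern length from the next occurrence of the first element
T = find(A(2:end) == A(1), 1);
if isempty(T)
    T = numel(A);         % first element never reappears: a single repetition
end

% The guess is only plausible if it divides the vector length ...
ok = mod(numel(A), T) == 0;
if ok
    % ... and every column equals the first one after reshaping
    Aa = reshape(A, T, []);
    ok = isequal(Aa, repmat(Aa(:,1), 1, size(Aa,2)));
end
```

For the non-unique pattern [0;1;2;3;2;1;0] repeated four times, T comes out as 6, which does not divide 28, so ok is false and the problem is caught.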

Removing a random number of columns from a matrix

I need to take away a random number of columns from an arbitrarily large matrix. I've put my attempt below, but I'm certain that there is a better way.
function new = reduceMatrices(original, colsToTakeAway)
a = colsToTakeAway(1);
b = colsToTakeAway(2);
c = colsToTakeAway(3);
x = original(1:a-1);
y = original(a+1:b-1);
z = original(b+1:c-1);
if c == size(original, 2);
new = [x,y,z];
elseif (c+1) == size(original, 2);
new = [x,y,z,c+1]
else
new = [x,y,z,c+1:size(original, 2)];
end
Here's one approach. First, generate a row vector of random numbers with numcols elements, where numcols is the number of columns in the original matrix:
rc = rand(1,numcols)
Next make a vector of 1s and 0s from this, for example
lv = rc>0.75
which will produce something like
0 1 1 0 1
and you can use Matlab's logical indexing feature to write
original(:,lv)
which will return only those columns of original which correspond to the 1s in lv.
It's not entirely clear from your question how you want to make the vector of column selections, but this should give you some ideas.
function newM = reduceMatrices(original, colsToTakeAway)
% define the columns to keep := cols \ colsToTakeAway
colsToKeep = setdiff(1:size(original,2), colsToTakeAway);
newM = original(:, colsToKeep);
end
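If the columns to drop should themselves be chosen at random, one possible sketch (assuming k, the number of columns to remove, is given; magic(6) is just stand-in data) picks them with randperm and deletes them by assigning []:

```matlab
original = magic(6);                    % stand-in data for illustration
numcols  = size(original, 2);
k        = 3;                           % assumed: how many columns to drop

% Pick k distinct column indices at random
colsToTakeAway = randperm(numcols, k);

% Delete those columns; the remaining columns keep their original order
reduced = original;
reduced(:, colsToTakeAway) = [];
```

This should give the same result as original(:, setdiff(1:numcols, colsToTakeAway)).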

Extract every element except every n-th element of vector

Given a vector
A = [1,2,3,...,100]
I want to extract all elements, except every n-th. So, for n=5, my output should be
B = [1,2,3,4,6,7,8,9,11,...]
I know that you can access every n-th element by
A(5:5:end)
but I need something like the inverse command.
If this doesn't exist I would iterate over the elements and skip every n-th entry, but that would be the dirty way.
You can eliminate elements like this:
A = 1:100;
removalList = 5:5:100;
A(removalList) = [];
Use a mask. Let's say you have
A = 1 : 100;
Then
m = mod(0 : length(A) - 1, 5);
will be a vector of the same length as A containing the repeated sequence 0 1 2 3 4.
You want everything from A except the elements where m == 4, i.e.
B = A(m ~= 4);
will result in
B == [1 2 3 4 6 7 8 9 11 12 13 14 16 ...]
Or you can use logical indexing:
n = 5; % remove every fifth element
idx = false(size(A)); % creates a blank mask
idx(n:n:end) = true; % marks every n-th element
A(idx) = []; % ta-da!
About the "inverse" command you mentioned: you can get that behavior with logical indexing by negating the mask, which turns every 1 into 0 and vice versa.
So this code, applied to the original A, removes everything BUT every fifth element:
negatedIdx = ~idx;
A(negatedIdx) = [];
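Why not use it like this?
Say A is your vector:
A = 1:100
n = 5
B = A([1:n-1,n+1:end])
Then
B=[1 2 3 4 6 7 8 9 10 ...]
Note that this removes only the single n-th element (10 is still present), not every n-th element.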
One possible solution for your problem is the function setdiff().
In your specific case, the solution would be:
lenA = length(A);
index = setdiff(1:lenA,n:n:lenA);
B = A(index)
If you do it all at once, you can avoid both extra variables:
B = A( setdiff(1:end,n:n:end) )
However, Logical Indexing is a faster option, as tested:
lenA = length(A);
index = true(1, lenA);
index(n:n:lenA) = false;
B = A(index)
All these codes assume that you have specified the variable n, and can adapt to a different value.
For the shortest amount of code, you were nearly there already. If you want to adjust your existing array, use:
A(n:n:end)=[];
Or if you want a new array called B:
B=A;
B(n:n:end)=[];
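To see that the main approaches agree, here is a small sketch on a shorter vector (1:20 instead of 1:100, purely to keep the result easy to inspect):

```matlab
A = 1:20;
n = 5;

% Logical mask: true everywhere except at every n-th position
keep = true(size(A));
keep(n:n:end) = false;
B1 = A(keep);

% Modulo mask on the 1-based positions
B2 = A(mod(1:numel(A), n) ~= 0);

% Direct deletion on a copy
B3 = A;
B3(n:n:end) = [];
```

All three give [1 2 3 4 6 7 8 9 11 12 13 14 16 17 18 19].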

Apply function to all rows

I have a function, ranker, that takes a vector and assigns numerical ranks to it in ascending order. For example,
ranker([5 1 3 600]) = [3 1 2 4] or
ranker([42 300 42 42 1 42]) = [3.5 6 3.5 3.5 1 3.5].
I am using a matrix, variable_data, and I want to apply the ranker function to each row of variable_data. This is my current solution, but I feel there is a way to vectorize it and have it be just as fast :p
variable_ranks = nan(size(variable_data));
for i=1:1:numel(nmac_ids)
variable_ranks(i,:) = ranker(abs(variable_data(i,:)));
end
If you place the matrix rows into a cell array, you can then apply a function to each cell.
Consider this simple example of applying the SORT function to each row
a = rand(10,3);
b = cell2mat( cellfun(@sort, num2cell(a,2), 'UniformOutput',false) );
%# same as: b = sort(a,2);
You can even do this:
b = cell2mat( arrayfun(@(i) sort(a(i,:)), 1:size(a,1), 'UniformOutput',false)' );
Again, your version with the for loop is probably faster.
With collaboration from Amro and Jonas
variable_ranks = tiedrank(variable_data')';
Ranker has been replaced by the Matlab function in the Stat toolbox (sorry for those who don't have it),
[R,TIEADJ] = tiedrank(X) computes the ranks of the values in the vector X. If any X values are tied, tiedrank computes their average rank. The return value TIEADJ is an adjustment for ties required by the nonparametric tests signrank and ranksum, and for the computation of Spearman's rank correlation.
TIEDRANK computes along columns in Matlab 7.9.0 (R2009b), although this is undocumented. Transposing the input matrix turns rows into columns, which tiedrank then ranks; the second transpose restores the layout of the input. In essence, a very classy hack :p
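For those without the Statistics Toolbox, a toolbox-free sketch of the same average-rank idea is shown below. It is not the toolbox implementation, just one way to get tie-averaged ranks per row using unique and accumarray; the sample data is made up, and abs is applied as in the question's loop.

```matlab
variable_data = [5 1 3 600; 42 300 42 1];   % made-up sample rows
variable_ranks = zeros(size(variable_data));

for i = 1:size(variable_data, 1)
    x = abs(variable_data(i,:));
    % ic maps each element to the position of its value among the sorted unique values
    [~, ~, ic] = unique(x);
    % How often each unique value occurs (size of each tie group)
    counts = accumarray(ic(:), 1);
    % cumsum(counts) is the highest ordinal rank in each tie group;
    % subtracting (counts-1)/2 gives the average rank of the group
    avgRank = cumsum(counts) - (counts - 1) / 2;
    variable_ranks(i,:) = avgRank(ic);
end
```

The first row has no ties and ranks as [3 1 2 4]; in the second row the two 42s share the average of ranks 2 and 3.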
One way would be to rewrite ranker to take array input
sizeData = size(variable_data);
[sortedData,almostRanks] = sort(abs(variable_data),2);
[rowIdx,colIdx] = ndgrid(1:sizeData(1),1:sizeData(2));
linIdx = sub2ind(sizeData,rowIdx,almostRanks);
variable_ranks = variable_data;
variable_ranks(linIdx) = colIdx;
%# break ties by finding subsequent equal entries in sorted data
[rr,cc] = find(diff(sortedData,1,2) == 0);
ii = sub2ind(sizeData,rr,cc);
ii2 = sub2ind(sizeData,rr,cc+1);
ii = sub2ind(sizeData,rr,almostRanks(ii));
ii2 = sub2ind(sizeData,rr,almostRanks(ii2));
variable_ranks(ii) = variable_ranks(ii2);
EDIT
Instead, you can just use TIEDRANK from TMW (thanks, @Amro):
variable_ranks = tiedrank(variable_data')';
I wrote a function that does this, it's on the FileExchange tiedrank_(X,dim). And it looks like this...
%[Step 0a]: force dim to be 1, and compress everything else into a single
%dimension. We will reverse this process at the end.
if dim > 1
otherDims = 1:length(size(X));
otherDims(dim) = [];
perm = [dim otherDims];
X = permute(X,perm);
end
originalSiz = size(X);
X = reshape(X,originalSiz(1),[]);
siz = size(X);
%[Step 1]: sort and get sorting indices
[X,Ind] = sort(X,1);
%[Step 2]: create matrix [D], which has +1 at the start of consecutive runs
% and -1 at the end, with zeros elsewhere.
D = zeros(siz,'int8');
D(2:end-1,:) = diff(X(1:end-1,:) == X(2:end,:));
D(1,:) = X(1,:) == X(2,:);
D(end,:) = -( X(end,:) == X(end-1,:) );
clear X
%[Step 3]: calculate the averaged rank for each consecutive run
[a,~] = find(D);
a = reshape(a,2,[]);
h = sum(a,1)/2;
%[Step 4]: insert the troublesome ranks in the relevant places
L = zeros(siz);
L(D==1) = h;
L(D==-1) = -h;
L = cumsum(L);
L(D==-1) = h; %cumsum set these ranks to zero, but we wanted them to be h
clear D h
%[Step 5]: insert the simple ranks (i.e. the ones that didn't clash)
[L(~L),~] = find(~L);
%[Step 6]: assign the ranks to the relevant position in the matrix
Ind = bsxfun(@plus,Ind,(0:siz(2)-1)*siz(1)); %equivalent to using sub2ind + repmat
r(Ind) = L;
%[Step 0b]: As promised, we reinstate the correct dimensional shape and order
r = reshape(r,originalSiz);
if dim > 1
r = ipermute(r,perm);
end
I hope that helps someone.