splitting a matlab matrix into several equal parts - matlab

I have a matrix of size 64500x17. It represents detected texton features that I have to use to find 5 centroids for kmeans.
What I need is:
split this matrix into 5 12900x17 matrices
find the means
concatenate these into a 5x17 matrix to feed in to the start parameter of kmeans.
I know how to do almost everything (cat, kmeans, etc), but I am merely trying to find a method for splitting the matrix into 5 parts, or summing/dividing into the desired size.
I am forbidden from overusing for loops (due to efficiency), unless absolutely necessary.
I can't find any pertinent example in other questions, so if this has been answered, please bear with me.

You can use mat2cell with this one-liner:
C = mat2cell(A, repmat(12900, 5, 1), 17);
The second parameter to mat2cell is the row split of the matrix.
Now C is a cell array:
C =
[12900x17 double]
[12900x17 double]
[12900x17 double]
[12900x17 double]
[12900x17 double]
and the partial matrices can be accessed as
C{1} etc.
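To get from the cell array to the matrix of starting centroids, something like the following might work. It is only a sketch on a small toy matrix: nParts, rowsPerPart and startCentroids are names made up for this example, and it assumes the row count divides evenly by the number of parts.

```matlab
% Toy stand-in for the 64500x17 feature matrix (assumed data).
A = reshape(1:60, 20, 3);            % 20 rows, 3 columns
nParts = 4;                          % hypothetical number of blocks
rowsPerPart = size(A, 1) / nParts;   % 5 rows per block
assert(rowsPerPart == floor(rowsPerPart), 'Row count must divide evenly.');

% Row split: nParts blocks of rowsPerPart rows. Column split: one block with all columns.
C = mat2cell(A, repmat(rowsPerPart, nParts, 1), size(A, 2));

% One mean row per block, stacked into an nParts-by-nColumns matrix.
blockMeans = cellfun(@(block) mean(block, 1), C, 'UniformOutput', false);
startCentroids = cell2mat(blockMeans);
```

With the real data, nParts would be 5 and startCentroids would come out as 5x17.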

Just use indexing and store the extracted matrices in cells for easier handling:
data = rand(64500,17);
Nsubsets = 5;
Nsubsize = size(data,1)/Nsubsets;
splitted_data = cell(Nsubsets ,1);
splitted_data_means = cell(Nsubsets,1);
for ii=1:Nsubsets
splitted_data{ii} = data((ii-1)*Nsubsize + (1:Nsubsize),:);
splitted_data_means{ii} = mean(splitted_data{ii});
end
you can then join these means with:
joined_means = cell2mat(splitted_data_means);
Or just for the heck-of-it with the one-liner:
joined_means = cell2mat(arrayfun(@(ii) mean(data((ii-1)*12900+(1:12900),:)),(1:5)','uni',false));
which would be even simpler with @angainor's mat2cell:
joined_means = cell2mat(cellfun(@mean,mat2cell(data, 12900*ones(5,1), 17),'uni',false));

To take the first submatrix use colon notation:
A(1:12900,:)
then
A(12901:12900*2,:)
and so on.

Probably the fastest solution is:
data = rand(64500,17);
Nsubsets = 5;
Nsubsize = size(data,1)/Nsubsets;
joined_means=squeeze(mean(reshape(data,Nsubsize,Nsubsets,size(data,2)),1));
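The reshape splits the row dimension in two: the new first dimension runs over the Nsubsize rows inside a block, and the new second dimension is the block index. Taking the mean over the first dimension then averages the Nsubsize rows of each block.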
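A quick way to convince yourself that the reshape groups the rows correctly is to compare it with plain block indexing on a small matrix. This is a sketch with made-up toy data; blocks, fast_means, ref_means and max_diff are names introduced only for the check.

```matlab
data = reshape(1:24, 6, 4);            % toy data: 6 rows, 4 columns
Nsubsets = 3;
Nsubsize = size(data, 1) / Nsubsets;   % 2 rows per block

% reshape is column-major, so blocks(:, k, j) holds the rows of block k in column j.
blocks = reshape(data, Nsubsize, Nsubsets, size(data, 2));
fast_means = squeeze(mean(blocks, 1)); % Nsubsets-by-nColumns

% Reference result: explicit indexing, one block at a time.
ref_means = zeros(Nsubsets, size(data, 2));
for k = 1:Nsubsets
    ref_means(k, :) = mean(data((k-1)*Nsubsize + (1:Nsubsize), :), 1);
end
max_diff = max(abs(fast_means(:) - ref_means(:)));
```

Note that squeeze would return a column vector instead of a row if Nsubsets were 1, so that edge case needs separate handling.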

Related

Extract values from a vector and sort them based on their original sequence

I have a vector of numbers (temperatures), and I am using the MATLAB function mink to extract the 5 smallest numbers from the vector to form a new variable. However, the numbers extracted using mink are automatically ordered from lowest to largest (of those 5 numbers). Ideally, I would like to retain the sequence of the numbers as they are arranged in the original vector. I hope my problem is easy to understand. I appreciate any advice.
The function mink that you use was introduced in MATLAB R2017b. It has (as Andras Deak mentioned) two output arguments:
[B,I] = mink(A,k);
The second output argument holds the indices, such that B == A(I).
To obtain the set B but sorted as they appear in A, simply sort the vector of indices I:
B = A(sort(I));
For example:
>> A = [5,7,3,1,9,4,6];
>> [~,I] = mink(A,3);
>> A(sort(I))
ans =
3 1 4
For older versions of MATLAB, it is possible to reproduce mink using sort:
function [B,I] = mink(A,k)
[B,I] = sort(A);
B = B(1:k);
I = I(1:k);
Note that you don't actually need the B output from the above, so your ordered_mink can be written as follows:
function B = ordered_mink(A,k)
[~,I] = sort(A);
B = A(sort(I(1:k)));
Note: This solution assumes A is a vector. For matrix A, see Andras' answer, which he wrote up at the same time as this one.
First you'll need the corresponding indices for the extracted values from mink using its two-output form (k is the number of values to extract, 5 in your case):
[vals, inds] = mink(array, k);
Then you only need to order the items in vals according to increasing indices in inds. There are multiple ways to do this, but they all revolve around sorting inds and using the corresponding order on vals. The simplest way is to put these vectors into a matrix and sort the rows (this assumes vals and inds are column vectors):
sorted_rows = sortrows([inds, vals]); % sort on indices
and then just extract the corresponding column
reordered_vals = sorted_rows(:,2); % items now ordered as they appear in "array"
A less straightforward possibility for doing the sorting after the above call to mink is to take the sorting order of inds and apply that same order to vals:
[~, order] = sort(inds); % permutation that puts inds in increasing order
reordered_vals = vals(order); % should be the same as previously
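For a quick check that sorting by index gives the values in their original order, here is a small sketch. The temperatures are made-up sample data, and sort is used in place of mink so it also runs on older releases; by_sortrows and by_index are names chosen for this example.

```matlab
temps = [5; 7; 3; 1; 9; 4; 6];             % assumed sample data (column vector)
k = 3;
[sorted_vals, sorted_inds] = sort(temps);  % ascending sort, no mink required
vals = sorted_vals(1:k);                   % the k smallest values
inds = sorted_inds(1:k);                   % their positions in temps

% Approach 1: sort the [index, value] pairs by index, keep the value column.
pairs = sortrows([inds, vals]);
by_sortrows = pairs(:, 2);

% Approach 2: sort the indices and read the values from the original vector.
by_index = temps(sort(inds));
```

Both results are [3; 1; 4], which is the order in which the three smallest values appear in temps.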

Extract data from a Cell Array using a vector and converting into an array

I have a cell array [5x1] which all cells are column vectors such as:
extInt =
[46x1 double]
[54x1 double]
[40x1 double]
[51x1 double]
[ 9x1 double]
I need a vector (vec) listing which cells of extInt to extract, and then I have to concatenate those cells into a single column array. For example:
vec = [1,3];
Output = cell2mat(extInt{vec})
Output should become an array of size [86x1 double].
The way I have coded I get:
Error using cell2mat
Too many input arguments.
If possible, I would like to have a solution not using a loop.
The best approach here is to use cat along with a comma-separated list created by {} indexing to yield the expected column vector. We specify the first dimension as the first argument since you have all column vectors and we want the output to also be a column vector.
out = cat(1, extInt{vec})
Given your input, cell2mat attempts to concatenate along the second dimension which will fail for your data since all of the data have different number of rows. This is why (in your example) you had to transpose the data prior to calling cell2mat.
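As a small illustration of the comma-separated list, consider the sketch below. The cell contents are toy data chosen for the example, and out2 is only there to show the cell2mat equivalent.

```matlab
extInt = {(1:3)'; (4:8)'; (9:10)'};   % toy 3x1 cell array of column vectors (assumed data)
vec = [1, 3];

% extInt{vec} expands to extInt{1}, extInt{3}, so this is cat(1, extInt{1}, extInt{3}).
out = cat(1, extInt{vec});            % 5x1 column: [1; 2; 3; 9; 10]

% cell2mat needs a single cell-array argument, hence () indexing here.
out2 = cell2mat(extInt(vec));         % same result because extInt(vec) is a 2x1 cell
```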
Update
Here is a benchmark to compare execution times between the cat and cell2mat approaches.
function benchit()
nRows = linspace(10, 1000, 100);
[times1, times2] = deal(zeros(size(nRows)));
for k = 1:numel(nRows)
rows = nRows(k);
data = arrayfun(@(x)rand(randi([10, 50], 1), 1), 1:rows, 'uni', 0);
vec = 1:2:numel(data);
times1(k) = timeit(@()cat_method(data, vec));
data = arrayfun(@(x)rand(randi([10, 50], 1), 1), 1:rows, 'uni', 0);
vec = 1:2:numel(data);
times2(k) = timeit(@()cell2mat_method(data, vec));
end
figure
hplot(1) = plot(nRows, times1 * 1000, 'DisplayName', 'cat');
hold on
hplot(2) = plot(nRows, times2 * 1000, 'DisplayName', 'cell2mat');
ylabel('Execution Times (ms)')
xlabel('# of Cell Array Elements')
legend(hplot)
end
function out = cat_method(data, vec)
out = cat(1, data{vec});
end
function out = cell2mat_method(data, vec)
out = cell2mat(data(vec)');
end
The reason for the constant offset between the two is that cell2mat calls cat internally but adds some additional logic on top of it. If you just use cat directly, you circumvent that additional overhead.
You have a small error in your code
Change
Output = cell2mat(extInt{vec});
to
Output = cell2mat(extInt(vec));
For cells, both curly braces and parentheses can be used for indexing. You can read some more about it here, but to summarize:
Use curly braces {} for setting or getting the contents of cell arrays.
Use parentheses () for indexing into a cell array to collect a subset of cells together in another cell array.
In your example, using curly braces with the index vector vec will produce 2 separate outputs (I've made a shorter version of extInt below)
extInt = {[1],[2 3],[4 5 6]};
extInt{vec}
ans =
1
ans =
4 5 6
As these are 2 separate outputs, they also become 2 separate inputs to the function cell2mat. As this function takes only one input, you get an error.
One alternative is in your own solution. Take the two outputs and place them inside a new (unnamed) cell
{extInt{vec}}
ans =
[1] [1x3 double]
Now, this (single) result goes into cell2mat without a problem.
(Note though that you might need to transpose the result first, depending on whether you have column or row vectors in your cell. The sizes of the vectors (or matrices) to combine need to match/align.)
Another way is to use parentheses (as in my solution above). Here a subset of the original cell array is returned, so it goes directly into the cell2mat function.
extInt(vec)
ans =
[1] [1x3 double]
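The difference between the two kinds of indexing can be seen by capturing the results explicitly. This is a minimal sketch reusing the shortened extInt from above; subset, a, b and joined are names introduced for the example.

```matlab
extInt = {1, [2 3], [4 5 6]};
vec = [1, 3];

subset = extInt(vec);       % parentheses: a 1x2 cell array holding cells 1 and 3
[a, b] = extInt{vec};       % curly braces: two separate outputs, the cell contents

joined = cell2mat(subset);  % one cell-array input, so cell2mat accepts it
```

Here joined is [1 4 5 6], while a and b hold 1 and [4 5 6] as ordinary numeric arrays.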
I have been messing around and I got this working by converting this entry into a new cell array and transposing it so the dimensions remained equivalent for the concatenating process
Output = cell2mat({extInt{vec}}')
use
Output = cell2mat(extInt(vec))
since you want to address the cells in extInt, not the content of the cells.
extInt(vec)
extInt{vec}
Try those to see what's going on.

quickly create a cell array with two elements in matlab?

Given two matrices of distinct sizes, say matrices A and B, how to quickly create a cell array to store them? I know how to do this using the standard way as the following.
c = cell(1,2);
c{1}=A;
c{2}=B;
Is there a better way? Basically, what I am asking is to initialize a given cell array quickly in matlab. Many thanks for your time and attention.
You can easily write the statement in one line with C = {A,B}. This creates a cell-array with two columns and one row.
Let's test it with random data:
A = rand(2,2);
B = rand(3,3);
C = {A,B}
This is the result:
C =
[2x2 double] [3x3 double]
In case you need two rows instead of two columns, just change the , to ; like you would do to create a 'normal' matrix.
A = rand(2,2);
B = rand(3,3);
C = {A;B}
This is the result:
C =
[2x2 double]
[3x3 double]
Otherwise you can directly do
C = {A,B};

Convert matrix to cell array of cell arrays

I want to convert an N*123456 matrix into a cell array in which each cell contains an N*L matrix
Eg:
matrixSize= 50*123456
N=50
L=100
Output will be 1*1235 cell and each cell has a 50*L matrix (last cell has only 50*56)
I know there is a function mat2cell in matlab:
Output = mat2cell(x, [50], [100,100,100,......56])
But it doesn't sound an intuitive solution.
So is there a good solution?
If I understand you correctly, assuming your matrix is denoted m, this is what you wanted:
a=num2cell(reshape(m(:,1:size(m,2)-mod(size(m,2),L)),N*L,[]),1);
a=cellfun(@(n) reshape(n,N,L), a,'UniformOutput',false);
a{end+1}=m(:,end-mod(size(m,2),L)+1:end);
(this can be shortened to a single line if you wish)...
Let's test with some minimal numbers:
m=rand(50,334);
N=50;
L=100;
yields:
a =
[50x100 double] [50x100 double] [50x100 double] [50x34 double]
note that I didn't check for the exact dimension in the reshape, so you may need to reshape to ...,[],N*L) etc.
Just use elementary maths.
q = floor(123456/100);
r = rem(123456,100);
Output = mat2cell(x, 50, [repmat(100,1,q),r])
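The same idea can be written without hard-coded sizes. The following is a sketch on a smaller random matrix; colSizes is a name introduced here, and dropping the remainder when it is zero is my own addition so that no empty trailing cell is produced.

```matlab
x = rand(50, 334);                        % toy matrix (assumed data)
L = 100;                                  % desired block width
nCols = size(x, 2);
q = floor(nCols / L);                     % number of full-width blocks
r = rem(nCols, L);                        % width of the last, narrower block

% Column widths: q full blocks, plus the remainder only when it is nonzero.
colSizes = [repmat(L, 1, q), r(r > 0)];
Output = mat2cell(x, size(x, 1), colSizes);
```

For this toy input, Output is a 1x4 cell array whose last cell is 50x34.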

Summing multiple matrices in matlab

I have a file containing 60 matrices. I would like get the mean of each value across those 60 matrices.
so the mean of the [1,1] mean of [1,2] across the matrices.
I am unable to use the mean command and am not sure what's the best way to do this.
Here's the file: https://dl.dropbox.com/u/22681355/file.mat
You can try this:
% concatenate the contents of your cell array to a 100x100x60 matrix
c = cat(3, results_foptions{:});
% take the mean
thisMean = mean(c, 3);
To round to the nearest integer, you can use
roundedMean = round(thisMean);
You should put all the matrices together in a 3-dimensional array, mat, as:
mat(:,:,1) = mat1;
mat(:,:,2) = mat2;
mat(:,:,3) = mat3;
etc...
then simply:
mean(mat, 3);
where the parameter 3 specifies that you want the mean across the 3rd dimension.
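On a tiny example the element-wise mean looks like this. The three 2x2 matrices are made-up stand-ins for the 60 matrices in the file, and mats, stack and elementMeans are hypothetical names.

```matlab
mats = {[1 2; 3 4], [3 4; 5 6], [5 6; 7 8]};   % hypothetical stand-in for the 60 matrices
stack = cat(3, mats{:});                       % 2x2x3 array, one matrix per page
elementMeans = mean(stack, 3);                 % mean of each [i,j] entry across the pages
```

Here elementMeans is [3 4; 5 6], the same size as each input matrix.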
The mean of the matrix can be computed a few different ways.
First you can compute the mean of each column and then compute the mean of those means:
colMeans = mean( A );
matMean = mean(colMeans);
Or you can convert the matrix to a column vector and compute the mean directly
matMean = mean( A(:) );