Convert 2D matrix to 3D using two mapping vectors with categorical values in MATLAB - matlab

A similar question has probably been asked already, but I cannot locate it, and I cannot find a good solution to what seems like a fairly common problem.
I have an M x N matrix of doubles, and two N x 1 vectors (cell arrays of strings) with categorical values. The 1st vector contains K unique values and the 2nd contains L unique values, such that K * L >= N.
I would like to convert the original matrix to a 3D matrix of size M x K x L. The 1st dimension is kept, the 2nd dimension corresponds to the unique values in the 1st vector, and the 3rd dimension to the unique values in the 2nd vector.
Let's consider that vectors are not sorted. It is guaranteed that there are no duplicate combinations of corresponding elements in two vectors. The matrix values for missing combinations can be 0s.
I can convert categorical vectors to numbers with grp2idx, which can be considered as column and page numbers. But how to apply them to the original matrix?
EDIT:
Here is some random data:
A = reshape(1:24,4,6);
v1 = {'s1','s1','s2','s2','s3','s3'}';
v2 = {'t1','t2','t1','t2','t1','t2'}';
idx = randperm(6); %# just to randomize
v1 = v1(idx);
v2 = v2(idx);
[M, N] = size(A); %# M=4, N=6
K = numel(unique(v1)); %# K=3
L = numel(unique(v2)); %# L=2
I need to reshape matrix A to 4x3x2 in such a manner that column 1 page 1 corresponds to the combination s1_t1, column 2 page 2 to s2_t2, etc. In this example K*L equals N, so all the positions will have data, but this is not the general case.

You can use accumarray for this. Note that grp2idx would give somewhat unexpected results, since it starts numbering at the first unique element it finds, instead of numbering according to sorted values of v1 or v2. If you don't care about order, you can use e.g. idx2=grp2idx(v1).
idx1 = ndgrid(1:M,1:N); %# rows
[~,idx2]=ismember(v1,unique(v1));
idx2 = repmat(idx2',M,1);
[~,idx3]=ismember(v2,unique(v2));
idx3 = repmat(idx3',M,1);
out = accumarray([idx1(:),idx2(:),idx3(:)],A(:),[M,K,L],@(x)x,0);
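As a sanity check on the accumarray approach, the same array can be built with a plain loop over the columns of A. This is only a sketch on the example data from the question; the fixed permutation `perm` (used instead of randperm so the run is reproducible) and the names `cols`, `pages`, and `out_loop` are assumptions introduced here.

```matlab
A  = reshape(1:24,4,6);
v1 = {'s1','s1','s2','s2','s3','s3'}';
v2 = {'t1','t2','t1','t2','t1','t2'}';
perm = [3 1 6 2 5 4];          % fixed shuffle (assumption) instead of randperm
v1 = v1(perm);
v2 = v2(perm);
[M, N] = size(A);

% unique sorts the labels; its third output gives each element's label number
[u1,~,cols]  = unique(v1);     % column index for every column of A
[u2,~,pages] = unique(v2);     % page index for every column of A
K = numel(u1);
L = numel(u2);

out_loop = zeros(M,K,L);       % missing combinations stay 0
for j = 1:N
    out_loop(:,cols(j),pages(j)) = A(:,j);
end
```

With this permutation the first column of A is labelled s2/t1, so it ends up in out_loop(:,2,1).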

Related

Vector and matrix comparison in MATLAB

I have a vector with 5 numbers in it, and a matrix of size 6000x20, so every row has 20 numbers. I want to count how many of the 6000 rows contain all values from the vector.
As the vector is a part of a matrix which has 80'000'000 rows, each containing unique combinations, I want a fast solution (which doesn't take more than 2 days).
Thanks
With the sizes you have, a bsxfun-based approach that builds an intermediate 6000x20x5 3D-array is affordable:
v = randi(9,1,5); %// example vector
M = randi(9,6000,20); %// example matrix
t = bsxfun(@eq, M, reshape(v,1,1,[]));
result = sum(all(any(t,2),3));
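To see what the one-liner computes, here is a small sketch on hand-made data (the values are an assumption chosen so the answer can be checked by eye), together with an equivalent row-by-row loop using ismember:

```matlab
v = [1 2];                      % values that must all appear in a row
M = [1 2 3;
     1 1 1;
     2 1 5];                    % rows 1 and 3 contain both 1 and 2

% t(i,j,k) is true where M(i,j) equals v(k)
t = bsxfun(@eq, M, reshape(v,1,1,[]));
result = sum(all(any(t,2),3));  % any: value found in the row; all: every value found

% the same count, one row at a time
result_loop = 0;
for r = 1:size(M,1)
    result_loop = result_loop + all(ismember(v, M(r,:)));
end
```

Both result and result_loop come out as 2 here, since only rows 1 and 3 contain every element of v.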

How to create matrix of nearest neighbours from dataset using matrix of indices - matlab

I have an Nx2 matrix of data points where each row is a data point. I also have an NxK matrix of indices of the K nearest neighbours from the knnsearch function. I am trying to create a matrix that contains in each row the data point followed by the K neighbouring data points, i.e. for K = 2 we would have something like [data1, neighbour1, neighbour2] for each row.
I have been messing around with loops and attempting to index with matrices, but to no avail; the fact that each data point is 1x2 is confusing me.
My ultimate aim is to calculate gradients to train an RBF network in a similar manner to:
D = (x_dist - y_dist)./(y_dist+(y_dist==0));
temp = y';
neg_gradient = -2.*sum(kron(D, ones(1,2)) .* ...
(repmat(y, 1, ndata) - repmat((temp(:))', ndata, 1)), 1);
neg_gradient = (reshape(neg_gradient, net.nout, ndata))';
You could use something along those lines:
K = 2;
nearest = knnsearch(data, data, 'K', K+1);%// Gets point itself and K nearest ones
mat = reshape(data(nearest.',:).',[],N).'; %// Extracts the coordinates
We generate data(nearest.',:) to get a 3*N-by-2 matrix, where every 3 consecutive rows are the points that correspond to each other. We transpose this to get the xy-coordinates into the same column. (MATLAB is column major, i.e. values in a column are stored consecutively). Then we reshape the data, so every column contains the xy-coordinates of the rows of nearest. So we only need to transpose once more in the end.
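The reshape steps can be traced on a tiny example. This is a sketch only: the three points and the hand-written index matrix `nearest` (standing in for the knnsearch output, so no toolbox is needed) are assumptions.

```matlab
data = [0 0;
        1 0;
        5 5];                   % N = 3 points, one per row
N = size(data,1);

% each row: the point itself followed by its 2 nearest neighbours
nearest = [1 2 3;
           2 1 3;
           3 2 1];

pts  = data(nearest.',:);       % 3*N-by-2, every 3 consecutive rows belong together
cols = reshape(pts.',[],N);     % 6-by-N, one column per original point
mat  = cols.';                  % N-by-6: [point, neighbour1, neighbour2]
```

Here mat is [0 0 1 0 5 5; 1 0 0 0 5 5; 5 5 1 0 0 0], i.e. each row lists the point followed by its two neighbours as x,y pairs.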

Row and Column Indices of the n largest elements in a matrix

I have a very similar problem to the one solved here:
Get the indices of the n largest elements in a matrix
However, this solution converts the matrix to an array and then gives the indices in terms of the new array.
I want the row and column indices of the original matrix for the maximum (and minimum) n values.
If you take the solution in that question for finding the 5 largest unique values
sortedValues = unique(A(:)); %# Unique sorted values
maxValues = sortedValues(end-4:end); %# Get the 5 largest values
maxIndex = ismember(A,maxValues); %# Get a logical index of all values
%# equal to the 5 largest values
You are provided with a logical matrix of those values which match. You can use find to get their indexes and then ind2sub to convert these back to coordinates.
idx = find(maxIndex);
[x y] = ind2sub(size(A), idx);
An alternative, in light of comments:
[foo idx] = sort(A(:), 'descend'); %convert the matrix to a vector and sort it
[x y] = ind2sub(size(A), idx(1:5)); %take the top five values and find the coords
Note: the above method does not eliminate any duplicate values, so for example if you have two elements with the same value it may return both elements, or if they are on the boundary, only one of the two.
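A minimal sketch of the sort-based variant covering both the largest and the smallest n values, on a hand-written 4x4 matrix (the matrix, n = 3, and the output names are assumptions for illustration). Note that ind2sub returns row indices first, then column indices.

```matlab
A = [16  2  3 13;
      5 11 10  8;
      9  7  6 12;
      4 14 15  1];
n = 3;

[~, idx] = sort(A(:), 'descend');
[rmax, cmax] = ind2sub(size(A), idx(1:n));   % rows/columns of the n largest

[~, idx] = sort(A(:), 'ascend');
[rmin, cmin] = ind2sub(size(A), idx(1:n));   % rows/columns of the n smallest
```

For this matrix the largest values 16, 15, 14 sit at (1,1), (4,3), (4,2), and the smallest values 1, 2, 3 at (4,4), (1,2), (1,3).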

How to select the values with the highest occurrence from several matrices of the same size in MATLAB?

I would like to have a program that makes the following actions:
Read several matrices having the same size (1126x1440 double)
Select the most frequently occurring value in each cell (same i,j of the matrices)
Write this value to an output matrix of the same size (1126x1440) at the corresponding i,j position, so that each cell of the output holds the most frequent value found at that position across all the input matrices.
Building on @angainor's answer, I think there is a simpler method using the mode function.
nmatrices - number of matrices
n, m - dimensions of a single matrix
maxval - maximum value of an entry (99)
First organize data into a 3-D matrix with dimensions [n X m X nmatrices]. As an example, we can just generate the following random data in a 3-D form:
CC = round(rand(n, m, nmatrices)*maxval);
and then the computation of the most frequent values is one line:
B = mode(CC,3); %compute the mode along the 3rd dimension
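A small sketch with three hand-written 2x2 matrices (an assumption, so the result can be checked by eye) shows the stacking and the call:

```matlab
A1 = [1 2; 3 4];
A2 = [1 5; 3 6];
A3 = [7 5; 3 4];

CC = cat(3, A1, A2, A3);   % stack the matrices along the 3rd dimension
B  = mode(CC, 3);          % most frequent value at every (i,j)
```

Here B is [1 5; 3 4]. When several values are tied for most frequent at a position, mode returns the smallest of them.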
Here is the code you need. I have introduced a number of constants:
nmatrices - number of matrices
n, m - dimensions of a single matrix
maxval - maximum value of an entry (99)
I first generate example matrices with rand. Matrices are changed to vectors and concatenated in the CC matrix. Hence, the dimensions of CC are [m*n, nmatrices]. Every row of CC holds individual (i,j) values for all matrices - those you want to analyze.
CC = [];
% concatenate all matrices into CC
for i=1:nmatrices
% generate some example matrices
% A = round(rand(m, n)*maxval);
A = eval(['neurone' num2str(i)]);
% flatten matrix to a vector, concatenate vectors
CC = [CC A(:)];
end
Now we do the real work. I have to transpose CC, because MATLAB works on column-based matrices, so I want to analyze individual columns of CC, not rows. Next, using histc I find the most frequently occurring values in every column of CC, i.e. in (i,j) entries of all matrices. histc counts the values that fall into given bins (in your case - 1:maxval) in every column of CC.
% after the transpose, CC is of dimension [nmatrices, m*n]
% transpose it for better histc and sort performance
CC = CC';
% count values from 1 to maxval in every column of CC
counts = histc(CC, 1:maxval);
counts has dimensions [maxval, m*n]: for every (i,j) of your original matrices you know the number of times a given value from 1:maxval is represented. The last thing to do now is to sort the counts and find out which is the most frequently occurring one. I do not need the sorted counts; I need the permutation that tells me which entry from counts has the highest value. That is exactly what you want to find out.
% sort the counts. Last row of the permutation will tell us,
% which entry is most frequently found in columns of CC
[~,perm] = sort(counts);
% the result is a reshaped last row of the permutation
B = reshape(perm(end,:)', m, n);
B is what you want.
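The whole histc pipeline can be traced on a tiny case. This is a sketch under assumptions: three hand-written 2x2 matrices with integer entries in 1:maxval replace the neurone1, neurone2, ... variables, and maxval is set to 7.

```matlab
A1 = [1 2; 3 4];
A2 = [1 5; 3 6];
A3 = [7 5; 3 4];
maxval = 7;
[m, n] = size(A1);

CC = [A1(:) A2(:) A3(:)]';        % one row per matrix, one column per (i,j)
counts = histc(CC, 1:maxval);     % counts(v,c): how often value v occurs in column c
[~, perm] = sort(counts);         % last row of perm: value with the highest count
B = reshape(perm(end,:)', m, n);
```

For this data B is [1 5; 3 4]. Entries outside 1:maxval (for example zeros) are not counted by histc with these bins, and ties between equally frequent values are resolved by the sort order, which may differ from what other methods return.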

2d matrix histogram in matlab that interprets each column as a separate element

I have a 128 x 100 matrix in MATLAB, where each column should be treated as a separate element. Let's call this matrix M.
I have another 128 x 2000 matrix(called V) composed of columns from matrix M.
How would I make a histogram that maps the frequency of each column being used in the second matrix?
hist(double(V),double(M)) gives the error:
Error using histc
Edge vector must be monotonically
non-decreasing.
what should I be doing?
Here is an example. We start with data that resembles what you described
%# a matrix of 100 columns
M = rand(128,100);
sz = size(M);
%# a matrix composed of randomly selected columns of M (with replacement)
V = M(:,randi([1 sz(2)],[1 2000]));
Then:
%# map the columns to indices starting at 1
[~,~,idx] = unique([M,V]', 'rows', 'stable');
idx = idx(sz(2)+1:end);
%# count how many times each column occurs
count = histc(idx, 1:sz(2));
%# plot histogram
bar(1:sz(2), count, 'histc')
xlabel('column index'), ylabel('frequency')
set(gca, 'XLim',[1 sz(2)])
[Lia,Locb] = ismember(A,B,'rows') also returns a vector, Locb,
containing the highest index in B for each row in A that is also a row
in B. The output vector, Locb, contains 0 wherever A is not a row of
B.
ismember with the rows argument can identify which row of one matrix the rows of another matrix come from. Since it works on rows, and you are looking for columns, just transpose both matrices.
[~,Locb]=ismember(V',M','rows');
histc(Locb,1:size(M,2))
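A small sketch of the ismember approach on hand-made data (the 2x3 matrix and the chosen column order are assumptions). Here accumarray is swapped in for histc to do the counting; it assumes every column of V really occurs in M, since a zero in Locb would make accumarray fail.

```matlab
M = [1 4 7;
     2 5 8];                            % three distinct columns
V = M(:, [3 1 3 2 3]);                  % columns drawn from M with replacement

[~, Locb] = ismember(V', M', 'rows');   % Locb(k): which column of M equals V(:,k)
count = accumarray(Locb, 1, [size(M,2) 1]);  % occurrences of each column of M
```

Here Locb is [3;1;3;2;3] and count is [1;1;3], which could then be plotted with bar(count).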