MATLAB: Rearranging feature matrix - matlab

I have a feature matrix of size ~1M x 3 where the columns are doc#, wordID#, wordcount.
What's a fast way in Matlab to rearrange this feature matrix so that it instead has size #docs x #unique words, i.e.
length(unique(featurematrix(:,1))) x length(unique(featurematrix(:,2))),
so that each row instead represents an entire document, each column represents a different word, and the values are the wordcounts from the 3rd column of the original matrix?
I started writing a bunch of loops, but had the feeling there's probably some short idiomatic way to do this already built-in to Matlab.

You can use accumarray to accomplish this:
data = [1, 1, 1;
1, 2, 2;
1, 5, 3;
2, 1, 4;
2, 3, 5];
result = accumarray(data(:,1:2), data(:,3))
% 1 2 0 0 3
% 4 0 5 0 0
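Alternatively, you could use sparse (drop the full call to keep the result sparse, which is usually preferable when most word counts are zero):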
result = full(sparse(data(:,1), data(:,2), data(:,3)))
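If the doc or word IDs are not consecutive integers starting at 1, the calls above size the result by the largest ID rather than by the number of unique IDs. A minimal sketch, assuming that case, first maps each ID to a compact index using the third output of unique. The sample data and the names docIds, wordIds, docIdx and wordIdx are illustrative, not from the original question:

```matlab
% Example data with gaps in both ID columns (made up for illustration)
data = [10, 100, 1;
        10, 205, 2;
        42, 100, 4;
        42, 300, 5];

% The third output of unique maps every entry to its position in the
% sorted list of unique values, i.e. a compact index 1..K
[docIds,  ~, docIdx]  = unique(data(:,1));
[wordIds, ~, wordIdx] = unique(data(:,2));

% Rows follow docIds, columns follow wordIds; (:) forces column vectors
result = accumarray([docIdx(:), wordIdx(:)], data(:,3), ...
                    [numel(docIds), numel(wordIds)]);
```

Row k of result then belongs to document docIds(k), and column k to word wordIds(k).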

Related

assign new matrix values based on row and column index vectors

New to MatLab here (R2015a, Mac OS 10.10.5), and hoping to find a solution to this indexing problem.
I want to change the values of a large 2D matrix, based on one vector of row indices and one of column indices. For a very simple example, if I have a 3 x 2 matrix of zeros:
A = zeros(3, 2)
0 0
0 0
0 0
I want to change A(1, 1) = 1, and A(2, 2) = 1, and A(3, 1) = 1, such that A is now
1 0
0 1
1 0
And I want to do this using vectors to indicate the row and column indices:
rows = [1 2 3];
cols = [1 2 1];
Is there a way to do this without looping? Remember, this is a toy example that needs to work on a very large 2D matrix. For extra credit, can I also include a vector that indicates which value to insert, instead of fixing it at 1?
My looping approach is easy, but slow:
for i = 1:length(rows)
A(rows(i), cols(i)) = 1;
end
sub2ind can help here:
A = zeros(3,2)
rows = [1 2 3];
cols = [1 2 1];
A(sub2ind(size(A),rows,cols))=1
A =
1 0
0 1
1 0
With a vector of values to insert:
b = [1,2,3];
A(sub2ind(size(A),rows,cols))=b
A =
1 0
0 2
3 0
I found this approach online when checking on the speed of sub2ind:
idx = rows + (cols - 1) * size(A, 1);
and therefore
A(idx) = 1 % or b
Five tests on a big matrix (operations taking about 5 seconds) showed it to be about 20% faster than sub2ind.
There is code for an n-dimensional problem here too.
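The manual formula works because MATLAB stores matrices column by column. A small sketch, reusing the toy values from the question, checks that it yields the same linear indices as sub2ind (the variable sameAsSub2ind is only there for illustration):

```matlab
A = zeros(3, 2);
rows = [1 2 3];
cols = [1 2 1];
b = [1 2 3];

% Element (r, c) of an nRows-by-nCols matrix sits at linear
% position r + (c - 1) * nRows in column-major storage
idx = rows + (cols - 1) * size(A, 1);

% Compare against the indices computed by sub2ind
sameAsSub2ind = isequal(idx, sub2ind(size(A), rows, cols));

% Assign one value per (row, col) pair
A(idx) = b;
```

One trade-off to keep in mind: the manual formula skips the range validation that sub2ind performs, so an out-of-range subscript can silently address the wrong element.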
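What you have is basically a sparse definition of a matrix. Thus, an alternative to sub2ind is sparse. It creates a sparse matrix; use full to convert it to a full matrix. Note that sparse sums the values of repeated (row, column) pairs, whereas indexed assignment keeps only the last one.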
A=full(sparse(rows,cols,1,3,2))

confused about the output of hist

I am confused by the output of
[m,n]=hist(y,x)
such as
M = [1, 2, 3;
4, 5, 6;
1, 2, 3];
[m,n] = hist(M,1:3)
Which results in
m = 2 0 0
0 2 0
1 1 3
Can someone please explain how m is calculated?
hist expects vectors as input. You passed a matrix, so it treats each column as a separate input vector. The output m holds the number of elements that fall into each bin (here the bins are centered on 1:3, the second argument), and n returns the bin centers.
[m,n] = hist([1,2,3;4,5,6;1,2,3],1:3)
treats each column as one input. You put in 3 inputs (# of columns) and you get 3 outputs, one column of m per input.
[2 0 1]'
means that, for the input [1;4;1] and the bin centers 1:3, two elements fall in bin 1 and one element falls in bin 3. The value 4 lies beyond the last center, so it is counted in the last bin.
Look at the last column of m: here all three values are in the third bin, which makes sense, since the corresponding vector is [3;6;3], and all of those numbers have to go into bin 3.
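The column-by-column behaviour described above can be reproduced by hand. This is only a sketch to illustrate the explanation; the loop and the names centers and counts are not part of the original answer:

```matlab
M = [1, 2, 3;
     4, 5, 6;
     1, 2, 3];
centers = 1:3;

% One column of counts per column of M
m = zeros(numel(centers), size(M, 2));
for k = 1:size(M, 2)
    % Histogram of a single column; values beyond the last
    % center are counted in the last bin
    counts = hist(M(:, k), centers);
    m(:, k) = counts(:);
end
```

The resulting m matches the matrix shown in the question.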

Eliminating zeros in a matrix - Matlab

Hi I have the following matrix:
A= 1 2 3;
0 4 0;
1 0 9
I want matrix A to be:
A= 1 2 3;
1 4 9
PS - a semicolon marks the end of each row; a new row starts after it.
How can I do that in Matlab 2014a? Any help?
Thanks
The problem you run into with your problem statement is the fact that you don't know the shape of the "squeezed" matrix ahead of time - and in particular, you cannot know whether the number of nonzero elements is a multiple of either the rows or columns of the original matrix.
As was pointed out, there is a simple function, nonzeros, that returns the nonzero elements of the input, ordered by columns. In your case,
A = [1 2 3;
0 4 0;
1 0 9];
B = nonzeros(A)
produces
1
1
2
4
3
9
What you wanted was
1 2 3
1 4 9
which happens to be what you get when you "squeeze out" the zeros by column. This would be obtained (when the number of zeros in each column is the same) with
reshape(B, 2, 3);
I think it would be better to assume that the number of elements may not be the same in each column - then you need to create a sparse array. That is actually very easy:
S = sparse(A);
The resulting object S is a sparse array - that is, it contains only the non-zero elements. It is very efficient (both for storage and computation) when lots of elements are zero: once more than 1/3 of the elements are nonzero it quickly becomes slower / bigger. But it has the advantage of maintaining the shape of your matrix regardless of the distribution of zeros.
A more robust solution would have to check the number of nonzero elements in each column and decide what the shape of the final matrix will be:
cc = sum(A~=0);
will count the number of nonzero elements in each column of the matrix.
nmin = min(cc);
nmax = max(cc);
finds the smallest and largest number of nonzero elements in any column
[i j s] = find(A); % the i, j coordinates and value of nonzero elements of A
nc = size(A, 2); % number of columns
B = zeros(nmax, nc);
for k = 1:nc
B(1:cc(k), k) = s(j == k);
end
Now B has all the nonzero elements: for columns with fewer nonzero elements, there will be zero padding at the end. Finally you can decide if / how much you want to trim your matrix B - if you want to have no zeros at all, you will need to trim some values from the longer columns. For example:
B = B(1:nmin, :);
Simple solution:
A = [1 2 3;0 4 0;1 0 9]
A =
1 2 3
0 4 0
1 0 9
A(A==0) = [];
A =
1 1 2 4 3 9
reshape(A,2,3)
ans =
1 2 3
1 4 9
It's very simple though and might be slow. Do you need to perform this operation on very large/many matrices?
From your question it's not clear what you want (how to arrange the non-zero values, especially if the number of zeros in each column is not the same). Maybe this:
A = reshape(nonzeros(A),[],size(A,2));
Matlab's logical indexing is extremely powerful. One way to do this is to create a logical array:
>> lZeros = A==0
then use this logical array to index into A and delete these zeros
>> A(lZeros) = []
Finally, reshape the array to your desired size using the built in reshape command
>> A = reshape(A, 2, 3)

How can I create systematic matrices in MATLAB?

I'm setting up a script and I want it to systematically go through ALL POSSIBLE 2x2, 3x3, and 4x4 matrices modulo 2, 3, 4, 5, 6, and 7. For example, for modulus 2 in a 2x2, there would be 16 possibilities (2^4, because there are 4 positions with 2 possibilities each). I'm having trouble getting MATLAB to not only form all these possibilities but to put them through my script one at a time. Any thoughts?
Thanks!
This solution uses allcomb from matlab file exchange.
%size
n=2
%modulus: entries take the values 0..m-1
m=2
%generate input for allcomb
e=cell(1,n^2)
e(1:end)={[0:m-1]}
%generate all combinations.
F=reshape(allcomb(e{:}),[],n,n)
F is a 3-D array; to get the first possibility, use:
squeeze(F(1,:,:))
A slight generalization of this Q&A does the job in one line:
r = 2; %// number of rows
c = 2; %// number of columns
n = 2; %// considered values: 0, 1, ..., n-1
M = reshape(dec2base(0:n^(r*c)-1, n).' - '0', r,c,[]);
Result for r, c, n as above:
M(:,:,1) =
0 0
0 0
M(:,:,2) =
0 0
0 1
...
M(:,:,16) =
1 1
1 1
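To feed the matrices through a script one at a time without storing them all, the same base-n idea can be applied inside a loop. This is a minimal sketch under that assumption; the counter and the placeholder comment stand in for your own processing:

```matlab
r = 2;   % number of rows
c = 2;   % number of columns
n = 2;   % modulus: entries take the values 0, 1, ..., n-1

count = 0;
for k = 0:n^(r*c) - 1
    % dec2base(k, n, r*c) writes k in base n, zero-padded to r*c digits;
    % subtracting '0' turns the digit characters into numbers
    M = reshape(dec2base(k, n, r*c) - '0', r, c);

    % ... call your own processing on M here ...
    count = count + 1;
end
```

Only one matrix is held in memory at a time, in the same order as the slices M(:,:,k+1) above. Note that for the largest case in the question (4x4 modulo 7) there are 7^16 matrices, roughly 3.3e13, so a full enumeration is unlikely to be practical.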

how to Conditionally add values to a matrix without using a for loop?

I have written a for loop and I want to rewrite it more succinctly, without the loop, using matrix indexing instead.
I am teaching myself matlab and I would appreciate any feedback.
I want to create a new matrix, the first column is y, and the second column is filled with zero except for the y's whose indices are contained in the indices matrix. And in the latter case, add 1 instead of 0.
Thanks.
y=[1;2;3;4;5;6;7];
indices=[1;3;5];
[m,n]=size(y);
tem=zeros(m,1);
data=[y,tem];
[r,c]=size(indices);
for i=1:r
a=indices(i);
data(a,2 )=1;
end
Output:
data =
1 1
2 0
3 1
4 0
5 1
6 0
7 0
A shorter alternative:
data = [y(:), full(sparse(indices, 1, 1, numel(y), 1))];
The resulting matrix data is composed of two column vectors: y(:) and an indicator column, built with sparse and converted with full, that has "1"s at the positions given by indices.
Using proper initialization and sparse matrices can be really useful in MATLAB.
How about
data = zeros( m, 2 );
data(:,1) = y;
data( indices, 2 ) = 1;