Let label be a matrix of size N x 1 (type double) and data be a matrix of size N x M (type double). The entries in the Label matrix looks like [ 1; 23; 135; ....; 6] which conveys that the
First row in the data matrix belongs to label 1
Second row in the data matrix belongs to label 2 and label 3
Third row in the data matrix belongs to label 1, label 3 and label 5 and so on
I would like to create a cell array say Individual{i} which stores all those rows from the data matrix which belongs to label i as given by the label matrix.
The resultant Individual{i} matrix will be size N_i x M.
Is there any efficient way to do the thing rather than looping row by row of data and label matrix?
I would turn your matrix label into a Boolean matrix L:
L = [ 1 0 0 0 0 0 ;
0 1 1 0 0 0 ;
1 0 1 0 1 0 ;
...
0 0 0 0 0 1 ];
for your example. You can use a sparse matrix if N or the number of labels is very large.
Then I think what you call N_i is sum(L(:, i)) and L' * data would compute the sum of all the rows in data with label L.
What do you want to do with the data once it's reached the Individual cell array? There's almost certainly a better way to do it...
Given the correct variables: N, M, data, label as you described, here's a sample code that creates the desired cell array Individual:
%# convert labels to binary-encoded format (as suggested by #Tom)
maxLabels = 9; %# maximum label number possible
L = false(N,maxLabels);
for i=1:N
%# extract digits of label
digits = sscanf(num2str(label(i)),'%1d');
%# all digits should be valid label indices
%assert( all(digits>=1) && all(digits<=maxLabels) );
%# mark this row as belong to designated labels
L(i,digits) = true;
end
%# distribute data rows according to labels
individual = cell(maxLabels,1);
for i=1:maxLabels
individual{i} = data(L(:,i),:);
end
Related
I have a matrix 'Z' sized 100000x2 and imported as an Excel file using readmatrix. I have a created array 'Time' (Time = [-200:0.1:300]'). I would like to compare all values in column 1 of 'Z' to 'Time' and eliminate all values of column 1 of 'Z' that do not equal a value of 'Time', thus shortening my 'Z' matrix to match my desired time values. Column 2 are pressure traces, so this would give me my desired time values and the corresponding pressure trace.
This sort of thing can be done without loops:
x = [1,2,3,4,1,1,2,3,4];
x = [x', (x+1)'] % this is your 'Z' data from the excel file (toy example here)
x =
1 2
2 3
3 4
4 5
1 2
1 2
2 3
3 4
4 5
y = [1,2]; % this is your row of times you want eliminated
z = x(:,1)==y % create a matrix logical arrays indicating the matches in the first column
z =
9×2 logical array
1 0
0 1
0 0
0 0
1 0
1 0
0 1
0 0
0 0
z = z(:,1)+z(:,2); % there is probably another summing technique that is better for your case
b = [x(z~=1,1), x(z~=1,2)] % use matrix operations to extract the desired rows
b =
3 4
4 5
3 4
4 5
All the entries of x where the first column did not equal 1 or 2 are now gone.
x = ismember(Z(:,1),Time); % logical indexes of the rows you want to keep
Z(~x,:) = []; % get rid of the other rows
Or instead of shortening Z you could create a new array to use downstream in your code:
x = ismember(Z(:,1),Time); % logical indexes of the rows you want to keep
Znew = Z(x,:); % the subset you want
You have to loop over all rows, use a nested if statement to check the item, and delete the row if it doesn't match.
Syntax for loops:
for n = 1:100000:
//(operation)//
end
Syntax for if statements:
if x == y
//(operation)//
Syntax for deleting a row: Z(rownum,:) = [];
I have a matrix a in Matlab that looks like the following:
a = zeros(10,3);
a(3:6,1)=2; a(5:9,3)=1; a(5:7,2)=3; a(8:10,1)=2;
a =
0 0 0
0 0 0
2 0 0
2 0 0
2 3 1
2 3 1
0 3 1
2 0 1
2 0 1
2 0 0
I would like to obtain a cell array with the number of times that each number appears in a column. Also, it should be ordered depending on the element value, regardless of the column number. In the example above I would like to obtain the cell:
b = {[5],[4,3],[3]}
Because the number 1 appears once for 5 times, the number 2 twice in blocks of 4 and 3, and the number 3 once for 3 times. As you can see the recurrences are ordered according to the element value and not to the number of the column where the elements appear.
Since you're not concerned with the column, you can string all the columns into a single column vector, padding with zeroes on either end to prevent spans at the start and end of columns from running together:
v = reshape(padarray(a, [1 0]), [], 1);
% Or if you don't have the Image Processing Toolbox function padarray...
v = reshape([zeros(1, size(a, 2)); a; zeros(1, size(a, 2))], [], 1);
Now, assuming spans are always separated by 1 or more zeroes, you can find the length of each span as follows:
endPoints = find(diff(v) ~= 0); % Find where transitions to or from 0 occur
spans = endPoints(2:2:end)-endPoints(1:2:end); % Index of transitions to 0 minus
% index of transitions from 0
And finally you can accumulate the spans based on the value present in those spans:
b = accumarray(v(endPoints(1:2:end)+1), spans, [], #(v) {v(:).'}).';
And for your example:
b =
1×3 cell array
[5] [1×2 double] [3]
Note:
The ordering of values in the resulting cell array is not guaranteed to match the order in spans (i.e. b{2} above is [3 4] instead of [4 3]). If order matters, you'll need to sort your subscripts as per this section of the documentation. Here's how you would change the computation of b:
[vals, index] = sort(v(endPoints(1:2:end)+1));
b = accumarray(vals, spans(index), [], #(v) {v(:).'}).';
The hard part is finding and separating the blocks. diff will find the starting point of any run of numbers, which is the starting point for this solution:
b = [zeros(1,size(a,2)); a; zeros(1,size(a,2))];
idx = diff(b)~=0;
block_values = b(idx);
block_lengths = diff([0; find(idx)]);
Now we have two vectors of the values of each block, and how long they are, and they just need to be captured in the cell array, ignoring the zero blocks
c = accumarray(block_values(block_values~=0), block_lengths(block_values~=0), [], #(x) {x}).';
b = {}
for i = 1:ncolumns
for n = 1:nnumbers
b{i}(n) = sum(a(:,i) == n)
end
end
(Note that this places zeros for numbers at which the count is 0, but otherwise I don't see how else you would be able to recognize which value is being counted)
i have [sentences*words] matrix as shown below
out = 0 1 1 0 1
1 1 0 0 1
1 0 1 1 0
0 0 0 1 0
i want to process this matrix in a way that should tell W1 & W2 in "sentence number 2" and "sentence number 4" occurs with same value i.e 1 1 and 0 0.the output should be as follows:
output{1,2}= 2 4
output{1,2} tells word number 1 and 2 occurs in sentence number 2 and 4 with same values.
after comparing W1 & W2 next candidate should be W1 & W3 which occurs with same value in sentence 3 & sentence 4
output{1,3}= 3 4
and so on till every nth word is compared with every other words and saved.
This would be one vectorized approach -
%// Get number of columns in input array for later usage
N = size(out,2);
%// Get indices for pairwise combinations between columns of input array
[idx2,idx1] = find(bsxfun(#gt,[1:N]',[1:N])); %//'
%// Get indices for matches between out1 and out2. The row indices would
%// represent the occurance values for the final output and columns for the
%// indices of the final output.
[R,C] = find(out(:,idx1) == out(:,idx2))
%// Form cells off each unique C (these will be final output values)
output_vals = accumarray(C(:),R(:),[],#(x) {x})
%// Setup output cell array
output = cell(N,N)
%// Indices for places in output cell array where occurance values are to be put
all_idx = sub2ind(size(output),idx1,idx2)
%// Finally store the output values at appropriate indices
output(all_idx(1:max(C))) = output_vals
You can get a logical matrix of size #words-by-#words-by-#sentences easily using bsxfun:
coc = bsxfun( #eq, permute( out, [3 2 1]), permute( out, [2 3 1] ) );
this logical array is occ( wi, wj, si ) is true iff word wi and word wj occur in sentence si with the same value.
To get the output cell array from coc you need
nw = size( out, 2 ); %// number of words
output = cell(nw,nw);
for wi = 1:(nw-1)
for wj = (wi+1):nw
output{wi,wj} = find( coc(wi,wj,:) );
output{wj,wi} = output{wi,wj}; %// you can force it to be symmetric if you want
end
end
I'm working in Matlab and I have the next problem:
I have a B matrix of nx2 elements, which contains indexes for the assignment of a big sparse matrix A (almost 500,000x80,000). For each row of B, the first column is the column index of A that has to contain a 1, and the second column is the column index of A that has to contain -1.
For example:
B= 1 3
2 5
1 5
4 1
5 2
For this B matrix, The Corresponding A matrix has to be like this:
A= 1 0 -1 0 0
0 1 0 0 -1
1 0 0 0 -1
-1 0 0 1 0
0 -1 0 0 1
So, for the row i of B, the corresponding row i of A must be full of zeros except on A(i,B(i,1))=1 and A(i,B(i,2))=-1
This is very easy with a for loop over all the rows of B, but it's extremely slow. I also tried the next formulation:
A(:,B(:,1))=1
A(:,B(:,2))=-1
But matlab gave me an "Out of Memory Error". If anybody knows a more efficient way to achieve this, please let me know.
Thanks in advance!
You can use the sparse function:
m = size(B,1); %// number of rows of A. Or choose larger if needed
n = max(B(:)); %// number of columns of A. Or choose larger if needed
s = size(B,1);
A = sparse(1:s, B(:,1), 1, m, n) + sparse(1:s, B(:,2), -1, m, n);
I think you should be able to do this using the sub2ind function. This function converts matrix subscripts to linear indices. You should be able to do it like so:
pind = sub2ind(size(A),1:n,B(:,1)); % positive indices
nind = sub2ind(size(A),1:n,B(:,2)); % negative indices
A(pind) = 1;
A(nind) = -1;
EDIT: I (wrongly, I think) assumed the sparse matrix A already existed. If it doesn't exist, then this method wouldn't be the best option.
I have a matrix A with integer elements from 0 to N-1.
What I need to get is vector V of length N which for each position "i" will contain number of elements equal to "i" in matrix A.
For example:
N = 6
A:
0 0 1
1 2 3
3 5 0
V:
3 2 1 2 0 1 0
What is the efficient way to do this?
My real matrix is about 10K x 10K elements and N is about 100.
Use v = histc(A(:), 0:(N-1)). To get exactly your result, perform v = v'.
You want to use histc (after reshape to convert to a vector)
n = histc(x,edges) counts the number of values in vector x that fall
between the elements in the edges vector (which must contain
monotonically nondecreasing values). n is a length(edges) vector
containing these counts.
V = histc(reshape(A,1,[]), 0:(N-1) );