How to find number of occurrences of a subset of elements in a vector without using loops in MATLAB? - matlab

Say X is the given vector:
X=[1
2
4
2
3
1
4
5
2
4
5];
And Y is the given subset of elements from X:
Y=[3
4
5];
The required output is the number of times the elements in Y occur in X:
out=[1
3
2];
My solution to do this would be to use for loop:
for i=1:size(X,1)
temp = X(X(:,1)==Y(i,1),:);
out(i,1) = size(temp,1);
end
But when X and Y are large, this is inefficient. So, how to do it faster making use of vectorization? I know about hist and histc, but I can't think of how to use them in this case to get the desired output.

A Fast Option
You could use bsxfun combined with sum to compute this
sum(bsxfun(#eq, Y, X.'), 2)
Explanation
In this example, bsxfun performs a given operation on every combination of elements in X and Y. The operation we're gonig to use is eq (a check for equality). The result is a matrix that has a row for each element in Y and a column for each element in X. It will have a 1 value if the element in X equals the element in Y that corresponds to a given row.
bsxfun(#eq, Y, X.')
% 0 0 0 0 1 0 0 0 0 0 0
% 0 0 1 0 0 0 1 0 0 1 0
% 0 0 0 0 0 0 0 1 0 0 1
We can then sum across the columns to count the number of elements in X that were equal to a given value in Y.
sum(bsxfun(#eq, Y, X.'), 2)
% 1
% 3
% 2
On newer versions of MATLAB (since R2016b), you can omit the bsxfun since the equality operation will automatically broadcast.
sum(Y - X.', 2)
A Memory-Efficient Option
The first option isn't the most efficient since it requires creating a matrix that is [numel(Y), numel(X)] elements large. Another way which may be more memory efficient may be to use the second output of ismember combined with accumarray
[tf, ind] = ismember(X, Y);
counts = accumarray(ind(tf), ones(sum(tf), 1), [numel(Y), 1], #numel);
Explanation
ismember is used to determine if the values in one array are in another. The first input tells us if each element of the first input is in the second input and the second output tells you where in the second input each element of the first input was found.
[tf, ind] = ismember(X, Y);
% 0 0 1 0 1 0 1 1 0 1 1
% 0 0 2 0 1 0 2 3 0 2 3
We can use the second input to "group" the same values together. The accumarray function does exactly this, it uses the ind variable above to determine groups and then applies a given operation to each group. In our case, we want to simply determine the number of element within each group. So to do that we can pass a second input the size of the ind input (minus the ones that didn't match) of ones, and then use numel as the operation (counts the number in each group)
counts = accumarray(ind(tf), ones(sum(tf), 1), [numel(Y), 1], #numel);
% 1
% 3
% 2

Related

How does Y = eye(K)(y, :); replace a "for" loop? Coursera

Working on an assignment from Coursera Machine Learning. I'm curious how this works... From an example, this much simpler code:
% K is the number of classes.
K = num_labels;
Y = eye(K)(y, :);
seems to be a substitute for the following:
I = eye(num_labels);
Y = zeros(m, num_labels);
for i=1:m
Y(i, :)= I(y(i), :);
end
and I have no idea how. I'm having some difficulty Googling this info as well.
Thanks!
Your variable y in this case must be an m-element vector containing integers in the range of 1 to num_labels. The goal of the code is to create a matrix Y that is m-by-num_labels where each row k will contain all zeros except for a 1 in column y(k).
A way to generate Y is to first create an identity matrix using the function eye. This is a square matrix of all zeroes except for ones along the main diagonal. Row k of the identity matrix will therefore have one non-zero element in column k. We can therefore build matrix Y out of rows indexed from the identity matrix, using y as the row index. We could do this with a for loop (as in your second code sample), but that's not as simple and efficient as using a single indexing operation (as in your first code sample).
Let's look at an example (in MATLAB):
>> num_labels = 5;
>> y = [2 3 3 1 5 4 4 4]; % The columns where the ones will be for each row
>> I = eye(num_labels)
I =
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
>> Y = I(y, :)
Y =
% 1 in column ...
0 1 0 0 0 % 2
0 0 1 0 0 % 3
0 0 1 0 0 % 3
1 0 0 0 0 % 1
0 0 0 0 1 % 5
0 0 0 1 0 % 4
0 0 0 1 0 % 4
0 0 0 1 0 % 4
NOTE: Octave allows you to index function return arguments without first placing them in a variable, but MATLAB does not (at least, not very easily). Therefore, the syntax:
Y = eye(num_labels)(y, :);
only works in Octave. In MATLAB, you have to do it as in my example above, or use one of the other options here.
The first set of code is Octave, which has some additional indexing functionality that MATLAB does not have. The second set of code is how the operation would be performed in MATLAB.
In both cases Y is a matrix generated by re-arranging the rows of an identity matrix. In both cases it may also be posible to calculate Y = T*y for a suitable linear transformation matrix T.
(The above assumes that y is a vector of integers that are being used as an indexing variables for the rows. If that's not the case then the code most likely throws an error.)

Inserting columns to a matrix in Matlab

I'd like to insert columns to a matrix, but the insertion column positions within the matrix differ by row. How can I do this without using for-loop?
Following is a simplified example in MATLAB;
From A,X,P, I want to get APX without using for-loop.
>> A = zeros(4,5) % inclusive matrix
A =
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
>> X = [9,8;5,7;8,3;6,7] % data to insert
X =
9 8
5 7
8 3
6 7
>> P = [3;2;4;1] % insertion position within the matrix
P =
3
2
4
1
>> APX = [0,0,9,8,0;0,5,7,0,0;0,0,0,8,3;6,7,0,0,0] % what I want
APX =
0 0 9 8 0
0 5 7 0 0
0 0 0 8 3
6 7 0 0 0
It's simply determining the right column-major indices to access the matrix so you can populate it with your desired values. This first requires generating the right row and column values to access the right positions in APX so you can use X to populate those positions.
Using P, each element tells you which column you should start populating for each row of X. You will need to generate column indices in increasing order up to as many columns as there are in X. To generate the row indices, simply create a matrix that is the same size as X where each column spans from 0 up to as many rows as there are in X minus 1 (i.e. 0:size(X,2)-1). This matrix gives you the correct offsets so that you can take P and add it with this matrix. Once you do that you will have a column index matrix that tells you specifically where each element should go with regards to the columns of the output matrix per row of P. Finally, use sub2ind to generate the column-major indices using the rows and columns generated above to place X in APX.
In other words:
P = [3;2;4;1];
X = [9,8;5,7;8,3;6,7];
rowInd = repmat((1:size(X,1)).', 1, size(X,2)); %'
colInd = bsxfun(#plus, P, 0:size(X,2)-1);
APX = zeros(size(X,1), max(colInd(:)));
APX(sub2ind(size(APX), rowInd, colInd)) = X;
To generate the row locations, we use repmat to create a matrix that is the same size as X where each column spans from 1 up to as many rows as X. To generate the column locations, we use bsxfun to create a matrix where each column is the vector P but increasing by 1 per column. We then create APX to be of compatible size then use sub2ind to finally populate the matrix.
With your above test inputs, we get:
APX =
0 0 9 8 0
0 5 7 0 0
0 0 0 8 3
6 7 0 0 0
Minor Note
You really should actually try using loops before trying it vectorized. Though using loops was slow in previous versions of MATLAB, MATLAB R2015b has an improved JIT engine where loops are now competitive. You should time your code using loops and ensuring that it is justifiable before switching to vectorized implementations.

Finding the column index for the 1 in each row of a matrix

I have the following matrix in Matlab:
M = [0 0 1
1 0 0
0 1 0
1 0 0
0 0 1];
Each row has exactly one 1. How can I (without looping) determine a column vector so that the first element is a 2 if there is a 1 in the second column, the second element is a 3 for a one in the third column etc.? The above example should turn into:
M = [ 3
1
2
1
3];
You can actually solve this with simple matrix multiplication.
result = M * (1:size(M, 2)).';
3
1
2
1
3
This works by multiplying your M x 3 matrix with a 3 x 1 array where the elements of the 3x1 are simply [1; 2; 3]. Briefly, for each row of M, element-wise multiplication is performed with the 3 x 1 array. Only the 1's in the row of M will yield anything in the result. Then the result of this element-wise multiplication is summed. Because you only have one "1" per row, the result is going to be the column index where that 1 is located.
So for example for the first row of M.
element_wise_multiplication = [0 0 1] .* [1 2 3]
[0, 0, 3]
sum(element_wise_multiplication)
3
Update
Based on the solutions provided by #reyryeng and #Luis below, I decided to run a comparison to see how the performance of the various methods compared.
To setup the test matrix (M) I created a matrix of the form specified in the original question and varied the number of rows. Which column had the 1 was chosen randomly using randi([1 nCols], size(M, 1)). Execution times were analyzed using timeit.
When run using M of type double (MATLAB's default) you get the following execution times.
If M is a logical, then the matrix multiplication takes a hit due to the fact that it has to be converted to a numerical type prior to matrix multiplication, whereas the other two have a bit of a performance improvement.
Here is the test code that I used.
sizes = round(linspace(100, 100000, 100));
times = zeros(numel(sizes), 3);
for k = 1:numel(sizes)
M = generateM(sizes(k));
times(k,1) = timeit(#()M * (1:size(M, 2)).');
M = generateM(sizes(k));
times(k,2) = timeit(#()max(M, [], 2), 2);
M = generateM(sizes(k));
times(k,3) = timeit(#()find(M.'), 2);
end
figure
plot(range, times / 1000);
legend({'Multiplication', 'Max', 'Find'})
xlabel('Number of rows in M')
ylabel('Execution Time (ms)')
function M = generateM(nRows)
M = zeros(nRows, 3);
col = randi([1 size(M, 2)], 1, size(M, 1));
M(sub2ind(size(M), 1:numel(col), col)) = 1;
end
You can also abuse find and observe the row positions of the transpose of M. You have to transpose the matrix first as find operates in column major order:
M = [0 0 1
1 0 0
0 1 0
1 0 0
0 0 1];
[out,~] = find(M.');
Not sure if this is faster than matrix multiplication though.
Yet another approach: use the second output of max:
[~, result] = max(M.', [], 1);
Or, as suggested by #rayryeng, use max along the second dimension instead of transposing M:
[~, result] = max(M, [], 2);
For
M = [0 0 1
1 0 0
0 1 0
1 0 0
0 0 1];
this gives
result =
3 1 2 1 3
If M contains more than one 1 in a given row, this will give the index of the first such 1.

how to convert edge list to adjacency matrix

I have this code
function adj=edgeL2adjj(e)
Av = [e; fliplr(e)];
nodes = unique(Av(:, 1:2)); % get all nodes, sorted
adj = zeros(numel(nodes)); % initialize adjacency matrix
% across all edges
for i=1:size(Av,1)
adj(nodes==Av(i,1),(nodes==Av(i,2))) = 1;
end
end
to convert an edge list to an adjacency matrix but if I input u=[8 5;1 4;3 5;6 7]
and then I divide u into two set [8 5;1 4], [3 5,6 7] and apply previous code on [3 5;6 7] I will get a 7 x 7 matrix.
but I want a 8 x 8 matrix to any input.
You have a 7x7 matrix because numel(nodes)=7. Indeed node #2 is not present in u. I suggest you to give this function a second input, that is the maximum number of nodes in your network (in this case, 8) and preallocate the adjacency matrix with such input parameter instead of numel(nodes). Or, alternatively, you can preallocate a zeros() square matrix by giving in input not numel(nodes), but the maximum value in input u. The first option will make your code more robust, the second option will make your code robust as long as the 8-th node is in input u.
Also, there is no need for the fliplr(): if your graph is undirected (that is, the adjacency matrix is symmetric) you can rely on such structure inside the for-loop, without concatenation in Av.
Such function can indeed be simplified as follows:
function adj=edgeL2adjj(e)
adj=zeros(max(max(e))); % initialize adjacency matrix
% across all edges
for i=1:size(e,1)
adj(e(i,1),e(i,2))=1;
adj(e(i,2),e(i,1))=1;
end
end
If the input e is your matrix:
e= [8 5;1 4;3 5;6 7];
such code returns
adj =
0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0
1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 1
0 0 0 0 0 0 1 0
0 0 0 0 0 1 0 0
0 0 0 0 1 0 0 0
As you can see you simply exploit the input e. The maximum value is 8, so you build a square 8x8 matrix. Also inside the for-loop by swapping the column indices 1 and 2 in e, you automatically take care of the symmetric structure of the adjacency matrix. Finally, this code automatically sort you out in case of missing nodes: indeed, as you can see, the second row and the second column in adj are all-zeros because node #2 is not present in the edge list e.
Note: I do not recommend to split your input edge list. In this manner, you will loose all the global informations regarding your network. Indeed you will have (let's say) two "sub-networks" which you later have to concatenate (in terms of adjacency matrix). That is, if you split e in two sub-matrices, you'll have two adjacency matrices. By taking into account the code as it is now in my answer, the first matrix will be 8x8 whereas the second will be 7x7 because max(max([3 5;6 7]))=7.
If I may, I'd like to offer an altogether different solution to your problem, which avoids looping over all edges (always a good idea to avoid loops in MATLAB!).
Since you have all your row and column indices, your edge list basically already specifies a (albeit sparse) matrix. You can simply allocate the matrix like this:
function adj = edgeL2adjj(e)
r = e(:,1);
c = e(:,2);
vals = ones(size(r));
% maximum node index determines matrix dimensions
n_nodes = max(e(:));
% create sparse matrix
adj = sparse(r, c, vals, n_nodes, n_nodes);
% if really necessary, you can just convert it to a full matrix
adj = full(adj);
end
If you know you always want the graph to have the same dimensions, even if you only use part of your edges as an input to the function, you could just pass n_nodes as an input instead of determining it from e:
function adj = edgeL2adjj(e, n_nodes)

Finding all possible “lists” of possible pairs in Matlab

I have been thinking about a problem for the last few days but as I am a beginner in MATLAB, I have no clue how to solve it. Here is the background. Suppose that you have a symmetric N×N matrix where each element is either 0 or 1, and N = (1,2,...,n).
For example:
A =
0 1 1 0
1 0 0 1
1 0 0 0
0 1 0 0
If A(i,j) == 1, then it is possible to form the pair (i,j) and if A(i,j)==0 then it is NOT possible to form the pair (i,j). For example, (1,2) is a possible pair, as A(1,2)==A(2,1)==1 but (3,4) is NOT a possible pair as A(3,4)==A(4,3)==0.
Here is the problem. Suppose that a member of the set N only can for a pair with at most one other distinct member of the set N (i.e., if 1 forms a pair with 2, then 1 cannot form a pair with 3). How can I find all possible “lists” of possible pairs? In the above example, one “list” would only consist of the pair (1,2). If this pair is formed, then it is not possible to form any other pairs. Another “list” would be: ((1,3),(2,4)). I have searched the forum and found that the latter “list” is the maximal matching that can be found, e.g., by using a bipartite graph approach. However, I am not necessarily only interested to find the maximal matching; I am interested in finding ALL possible “lists” of possible pairs.
Another example:
A =
0 1 1 1
1 0 0 1
1 0 0 0
1 1 0 0
In this example, there are three possible lists:
(1,2)
((1,3),(2,4))
(1,4)
I hope that you can understand my question, and I apologize if am unclear. I appreciate all help I can get. Many thanks!
This might be a fast approach.
Code
%// Given data, A
A =[ 0 1 1 1;
1 0 0 1;
1 0 0 0;
1 1 0 0];
%%// The lists will be stored in 'out' as a cell array and can be accessed as out{1}, out{2}, etc.
out = cell(size(A,1)-1,1);
%%// Code that detects the lists using "selective" diagonals
for k = 1:size(A,1)-1
[x,y] = find(triu(A,k).*(~triu(ones(size(A)),k+1)));
out(k) = {[x y]};
end
out(cellfun('isempty',out))=[]; %%// Remove empty lists
%%// Verification - Print out the lists
for k = 1:numel(out)
disp(out{k})
end
Output
1 2
1 3
2 4
1 4
EDIT 1
Basically I will calculate all the the pairwise indices of the matrix to satisfy the criteria set in the question and then simply map them over the given matrix. The part of finding the "valid" indices is obviously the tedious part in it and in this code with some aggressive approach is expensive too when dealing with input matrices of sizes more than 10.
Code
%// Given data, A
A = [0 1 1 1; 1 0 1 1; 1 1 0 1; 1 1 1 0]
%%// Get all pairwise combinations starting with 1
all_combs = sortrows(perms(1:size(A,1)));
all_combs = all_combs(all_combs(:,1)==1,:);
%%// Get the "valid" indices
all_combs_diff = diff(all_combs,1,2);
valid_ind_mat = all_combs(all(all_combs_diff(:,1:2:end)>0,2),:);
valid_ind_mat = valid_ind_mat(all(diff(valid_ind_mat(:,1:2:end),1,2)>0,2),:);
%%// Map the ones of A onto the valid indices to get the lists in a matrix and then cell array
out_cell = mat2cell(valid_ind_mat,repmat(1,[1 size(valid_ind_mat,1)]),repmat(2,[1 size(valid_ind_mat,2)/2]));
A_masked = A(sub2ind(size(A),valid_ind_mat(:,1:2:end),valid_ind_mat(:,2:2:end)));
out_cell(~A_masked)={[]};
%%// Remove empty lists
out_cell(all(cellfun('isempty',out_cell),2),:)=[];
%%// Verification - Print out the lists
disp('Lists =');
for k1 = 1:size(out_cell,1)
disp(strcat(' List',num2str(k1),':'));
for k2 = 1:size(out_cell,2)
if ~isempty(out_cell{k1,k2})
disp(out_cell{k1,k2})
end
end
end
Output
A =
0 1 1 1
1 0 1 1
1 1 0 1
1 1 1 0
Lists =
List1:
1 2
3 4
List2:
1 3
2 4
List3:
1 4
2 3
I'm sure there's a faster way to do it, but here's the obvious solution:
%// Set top half to 0, and find indices of all remaining 1's
A(triu(A)==1) = 0;
[ii,jj] = find(A);
%// Put these in a matrix for further processing
P = [ii jj];
%// Sort indices into 'lists' of the kind you defined
X = repmat({}, size(P,1),1);
for ii = 1:size(P,1)-1
X{ii}{1} = P(ii,:);
for jj = ii+1:size(P,1)
if ~any(ismember(P(ii,:), P(jj,:)))
X{ii}{end+1} = P(jj,:); end
end
end