I'm trying to find the first 1 in each column of a matrix without using a for or a while. Say I have
-->A
A =
1. 0. 0. 0.
0. 0. 1. 1.
1. 0. 1. 1.
1. 1. 0. 0.
then I would like to obtain [1,4,2,2] (I can assume there is always a 1 somewhere in each column). The thing is when I use find(A), it gives me [1,3,4,8,10,11,14,15].
I was told not to use loops but matrix operations because scilab handles the last ones better.
Thank you in advance!
With such a small matrix performance could be fast enough with a for loop and probably more readible. But one solution avoiding the use of for loops could be as following.
//Find all rows and cols of the ones in A
[row,col] = find(A);
//Get all row positions that are in a new column index
disp( row( find([1,diff(col)]) ));
I think a more readible solution would be something like the following:
//For each column
for col=1:4
//Find only the first occurence
disp(find(A(:,col),1));
end
As mentioned, with such a small matrix readibility should be a higher priority. You could measure performance of both (or other) solutions by profiling.
If you like to read more about some performance enhancing techniques have a look here.
Related
What is the best way to reduce the number of rows of a matrix by half in matlab?
What is the following command doing?
mymatrix = mymatrix(1:2:end,:);
Is there any better way available?
Short answer this is taking every second row of the matrix mymatrix starting with the first one (all odd-rows) and yes that is probably the easiest way. Added clarification based on comment from #Sardar_Usama
Longer version
end is matlab internal command that refers to the end of the array in the given dimension. roughly equivalent to size(var,dim).
so actually what mymatrix(1:2:end,:) can be re-written to mymatrix(1:2:size(mymatrix,1),:). Now if you actually look at 1:2:size(mymatrix,1) these are the rows that you are selecting.1, 3, 5, etc. You can actually specify whichever rows you want there, here are some examples.
1:floor(end/2); % first 'half'
floor(end/2)+1:end; % second 'half'
1:3:end; % every third element
1:2:floor(end/2); % every second element in the first 'half'
Added floor() to avoid problems for odd number lengths. In that case 'half' is not exactly half but rather roughly half. Alternatively ceil() depending on how you would like to define half for odd lengths.
I'm currently working on implementing a gradient check function in which it requires to get certain index values from the result matrix. Could someone tell me how to get a group of values from the matrix?
To be specific, for a result matrx res with size M x N, I'll need to get element res(3,1), res(4,2), res(1,3), res(2,4)...
In my case, M is dimension and N is batch size and there's a label array whose size is 1xbatch_size, [3 4 1 2...]. So the desired values are res(label(:),1:batch_size). Since I'm trying to practice vectorization programming and it's better not using loop. Could someone tell me how to get a group of value without a iteration?
Cheers.
--------------------------UPDATE----------------------------------------------
The only idea I found is firstly building a 'mask matrix' then use the original result matrix to do element wise multiplication (technically called 'Hadamard product', see in wiki). After that just get non-zero element out and do the sum operation, the code in matlab should look like:
temp=Mask.*res;
desired_res=temp(temp~=0); %Note: the temp(temp~=0) extract non-zero elements in a 'column' fashion: it searches temp matrix column by column then put the non-zero number into container 'desired_res'.
In my case, what I wanna do next is simply sum(desired_res) so I don't need to consider the order of those non-zero elements in 'desired_res'.
Based on this idea above, creating mask matrix is the key aim. There are two methods to do this job.
Codes are shown below. In my case, use accumarray function to add '1' in certain location (which are stored in matrix 'subs') and add '0' to other space. This will give you a mask matrix size [rwo column]. The usage of full(sparse()) is similar. I made some comparisons on those two methods (repeat around 10 times), turns out full(sparse) is faster and their time costs magnitude is 10^-4. So small difference but in a large scale experiments, this matters. One benefit of using accumarray is that it could define the matrix size while full(sparse()) cannot. The full(sparse(subs, 1)) would create matrix with size [max(subs(:,1)), max(subs(:,2))]. Since in my case, this is sufficient for my requirement and I only know few of their usage. If you find out more, please share with us. Thanks.
The detailed description of those two functions could be found on matlab's official website. accumarray and full, sparse.
% assume we have a label vector
test_labels=ones(10000,1);
% method one, accumarray(subs,1,[row column])
tic
subs=zeros(10000,2);
subs(:,1)=test_labels;
subs(:,2)=1:10000;
k1=accumarray(subs,1,[10, 10000]);
t1=toc % to compare with method two to check which one is faster
%method two: full(sparse(),1)
tic
k2=full(sparse(test_labels,1:10000,1));
t2=toc
this is a very basic problem but I didn't find any hints on it. Let's say I have a 2x4 matrix and I want to reduce the dimension of the matrix to only these columns that are in the sum larger than 1:
A=rand(2,4)
ind = sum(A,1).>1
That gives me an indicator of the columns I want to retain. Naively one would assume that I can do that:
A[:,ind]
which doesn't work as ind is a BitArray and only for Bool Arrays this is allowed, i.e., the following works
A[:,[true,true,false,true]]
in return, the following does work:
A[A.>0.5]
But it returns a vector of filtered elements.
What is the logic behind this and how do I solve my problem?
As noted in the comments, this is fixed by using a version of Julia which is >=v0.4.
I have a 5-by-200 matrix where the i:50:200, i=1:50 are related to each other, so for example the matrix columns 1,51,101,151 are related to each other, and columns 49,99,149,199 are also related to each other.
I want to use a for-loop to create another matrix that re-sorts the previous matrix based on this relationship.
My code is
values=zeros(5,200);
for j=1:50
for m=1:4:200
a=factor_mat(:,j:50:200)
values(:,m)=a
end
end
However, the code does not work.
Here's what's happening. Let's say we're on the first iteration of the outer loop, so j == 1. This effectively gives you:
j = 1;
for m=1:4:200
a=factor_mat(:,j:50:200)
values(:,m)=a;
end
So you're creating the same submatrix for a (j doesn't change) 50 times and storing it at different places in the values matrix. This isn't really what you want to do.
To create each 4-column submatrix once and store them in 50 different places, you need to use j to tell you which of the 50 you're currently processing:
for j=1:50
a=factor_mat(:,j:50:200);
m=j*4; %// This gives us the **end** of the current range
values(:,m-3:m)=a;
end
I've used a little trick here, because the indices of Matlab arrays start at 1 rather than 0. I've calculated the index of the last column we want to insert. For the first group, this is column 4. Since j == 1, j * 4 == 4. Then I subtract 3 to find the first column index.
That will fix the problem you have with your loops. But loops aren't very Matlab-ish. They used to be very slow; now they're adequate. But they're still not the cool way to do things.
To do this without loops, you can use reshape and permute:
a=reshape(factor_mat,[],50,4);
b=permute(a,[1,3,2]);
values=reshape(b,[],200);
So I'm working with sparse matrix and I have to find out different info about a very big one (10^6 size) and I need to find out the mean of the outlinks. Just to be sure I think mean is what you get from 3+4+5/3=4, 4 is the mean.
I thought of something like this:
[row,col] = find(A(:,2),1,'first')
and then I would do 1/numberInThatIndex or something similar, since it's a S-matrix (pretty sure it's called that).
And I would iterate column by column but for some reason it's not giving me the first number in each column, if I do find(A(:,1),1,'first') it does give me the first in the first column, but not in the second if I change it to A(:,2).
I'd also need something to store that index to access the value, I thought of a 2xN vector but I guess it's not the best idea. I mean, find is going to give me index, but I need the value in that index, and then store that or show it. Not sure if I'm explaining myself properly but I'm trying, sorry about that.
Just to be clear both when I input A(:,1) and A(:,2) it gives me index from the first column, and I do not want that, I want first element found from each column, so I can calculate the mean out of the number in that index.
edit: allright it seems like that indeed does work, but when I was checking the results I was putting 3817 instead of 3871 that was the given answer and so I found a 0 when I wanted something that's not a zero. Not sure if I should delete all of this.
To solve your problem, you can do the following:
numberNonZerosPerColumn = sum(S~=0,1);
meanValue = nanmean(1./numberNonZerosPerColumn);
Count the number of nonzero elements in every column n(i)
Compute the values v(i) that are stored there, which are defined by v(i) := 1/n(i)
Take the mean of those values where n(i) is not zero (i.e. summing all those values, where v(i) is not NaN and divide by the number of columns that contain at least one zero)
If you want to treat columns without any nonzero entry as v(i):= 0, but still use them in your mean, you can use:
numberNonZerosPerColumn = sum(S~=0,1);
meanValue = nansum(1./numberNonZerosPerColumn)/size(S,2);