Calculate mean above certain threshold for each column - matlab

My data is in a nxm array named x. What I want to do is calculate the average of the values in each of the columns above a certain threshold. So the output should be a 1xm vector.
mean(x) obviously does this without specifying a threshold.
mean(x>70) performs a truth check and basically returns the percentage of values above the threshold for each column
You could define a new variable as such
y = x > 70
and then
mean(x(y))
but this returns the average over all the columns of x.
There's a very cumbersome way of doing it by having a line of code for every column.
mean(x(y(1:end,1)))
And so on, but this is obviously ugly.
I feel like I'm missing something simple here. Hopefully someone will be able to help out.

You've forgotten that you can multiply the mask through element-wise
threshold = []; %define threshold here; it can be a 1xm vector or a scalar.
output = sum(x.*(x>threshold))./sum(x>threshold);

Related

How to change the value of a random subset of elements in a matrix without using a loop?

I'm currently attempting to select a subset of 0's in a very large matrix (about 400x300 elements) and change their value to 1. I am able to do this, but it requires using a loop where each instance it selects the next value in a randperm vector. In other words, 50% of the 0's in the matrix are randomly selected, one-at-a-time, and changed to 1:
z=1;
for z=1:(.5*numberofzeroes)
A(zeroposition(rpnumberofzeroes(z),1),zeroposition(rpnumberofzeroes(z),2))=1;
z=z+1;
end
Where 'A' is the matrix, 'zeroposition' is a 2-column-wide matrix with the positions of the 0's in the matrix (the "coordinates" if you like), and 'rpnumberofzeros' is a randperm vector from 1 to the number of zeroes in the matrix.
So for example, for z=20, the code might be something like this:
A(3557,2684)=1;
...so that the 0 which appears in this location within A will now be a 1.
It performs this loop thousands of times, because .5*numberofzeroes is a very big number. This inevitably takes a long time, so my question is can this be done without using a loop? Or at least, in some way that takes less processing resources/time?
As I said, the only thing that needs to be done is an entirely random selection of 50% (or whatever proportion) of the 0's changed to 1.
Thanks in advance for the help, and let me know if I can clear anything up! I'm new here, so apologies in advance if I've made any faux pa's.
That's very easy. I'd like to introduce you to my friend sub2ind. sub2ind allows you to take row and column coordinates of a matrix and convert them into linear column-major indices so that you can access multiple values in a matrix simultaneously in a single call. As such, the equivalent code you want is:
%// First access the values in rpnumberofzeroes
vals = rpnumberofzeroes(1:0.5*numberofzeroes, :);
%// Now, use the columns of these to determine which rows and columns we want
%// to access A
rows = zeroposition(vals(:,1), 1);
cols = zeroposition(vals(:,2), 2);
%// Get linear indices via sub2ind
ind1 = sub2ind(size(A), rows, cols);
%// Now set these locations to 1
A(ind1) = 1;
The first statement gets the first half of your matrix of coordinates stored in rpnumberofzeroes. The first column is the row coordinates, the second column is the column coordinates. Notice that in your code, you wish to use the values in zeroposition to access the locations in A. As such, extract out the corresponding rows and columns from rpnumberofzeroes to figure out the right rows and columns from zeroposition. Once that's done, we wish to use these new rows and columns from zeroposition and index into A. sub2ind requires three inputs - the size of the matrix you are trying to access... so in our case, that's A, the row coordinates and the column coordinates. The output is a set of column major indices that are computed for each row and column pair.
The last piece of the puzzle is to use these to index into A and set the locations to 1.
This can be accomplished with linear indexing as well:
% find linear position of all zeros in matrix
ix=find(abs(A)<eps);
% set one half of those, selected at random, to one.
A(ix(randperm(round(numel(ix)*.5)))=1;

Remove duplicates in correlations in matlab

Please see the following issue:
P=rand(4,4);
for i=1:size(P,2)
for j=1:size(P,2)
[r,p]=corr(P(:,i),P(:,j))
end
end
Clearly, the loop will cause the number of correlations to be doubled (i.e., corr(P(:,1),P(:,4)) and corr(P(:,4),P(:,1)). Does anyone have a suggestion on how to avoid this? Perhaps not using a loop?
Thanks!
I have four suggestions for you, depending on what exactly you are doing to compute your matrices. I'm assuming the example you gave is a simplified version of what needs to be done.
First Method - Adjusting the inner loop index
One thing you can do is change your j loop index so that it only goes from 1 up to i. This way, you get a lower triangular matrix and just concentrate on the values within the lower triangular half of your matrix. The upper half would essentially be all set to zero. In other words:
for i = 1 : size(P,2)
for j = 1 : i
%// Your code here
end
end
Second Method - Leave it unchanged, but then use unique
You can go ahead and use the same matrix like you did before with the full two for loops, but you can then filter the duplicates by using unique. In other words, you can do this:
[Y,indices] = unique(P);
Y will give you a list of unique values within the matrix P and indices will give you the locations of where these occurred within P. Note that these are column major indices, and so if you wanted to find the row and column locations of where these locations occur, you can do:
[rows,cols] = ind2sub(size(P), indices);
Third Method - Use pdist and squareform
Since you're looking for a solution that requires no loops, take a look at the pdist function. Given a M x N matrix, pdist will find distances between each pair of rows in a matrix. squareform will then transform these distances into a matrix like what you have seen above. In other words, do this:
dists = pdist(P.', 'correlation');
distMatrix = squareform(dists);
Fourth Method - Use the corr method straight out of the box
You can just use corr in the following way:
[rho, pvals] = corr(P);
corr in this case will produce a m x m matrix that contains the correlation coefficient between each pair of columns an n x m matrix stored in P.
Hopefully one of these will work!
this works ?
for i=1:size(P,2)
for j=1:i
Since you are just correlating each column with the other, then why not just use (straight from the documentation)
[Rho,Pval] = corr(P);
I don't have the Statistics Toolbox, but according to http://www.mathworks.com/help/stats/corr.html,
corr(X) returns a p-by-p matrix containing the pairwise linear correlation coefficient between each pair of columns in the n-by-p matrix X.

Selecting values plotted on a scatter3 plot

I have a 3d matrix of 100x100x100. Each point of that matrix has assigned a value that corresponds to a certain signal strength. If I plot all the points the result is incomprehensible and requires horsepower to compute, due to the large amount of points that are painted.
The next picture examplify the problem (in that case the matrix was 50x50x50 for reducing the computation time):
[x,y,z] = meshgrid(1:50,1:50,1:50);
scatter3(x(:),y(:),z(:),5,strength(:),'filled')
I would like to plot only the highest values (for example, the top 10). How can I do it?
One simple solution that came up in my mind is to asign "nan" to the values higher than the treshold.
Even the results are nice I think that it must be a most elegant solution to fix it.
Reshape it into an nx1 vector. Sort that vector and take the first ten values.
num_of_rows = size(M,1)
V = reshape(M,num_of_rows,1);
sorted_V = sort(V,'descend');
ind = sorted_V(1:10)
I am assuming that M is your 3D matrix. This will give you your top ten values in your matrix and the respective index. The you can use ind2sub() to get the x,y,z.

Matlab function for creating random number satisfying constraints

As an input I have two number x and y. x>y.
I want to create exactly y non-zero random number which their sum will be equal to x. I know randi([min max]) function . Can you help me?
If I got it right, you want something like this:
data = rand(1,y);
data = data * x / sum(data);
data will contain exactly y positive uniformly distributed numbers which sum equals to x.
Check out the file random vectors generator with fixed sum in Matlab FEX. I believe this will answer your question.
Leonid's approach will certainly generate a set of random numbers that have the correct sum, but it won't select uniformly over the allowed space. If this is important, an approach that will work is the following:
(with x = 1):
Generate Y-1 random numbers uniformly over [0,1].
Sort the Y-1 numbers from smallest to largest. Call these {y1,...,y_{N-1}}
Take as the Y random numbers the set {y_1-0 ,y_2-y1,...,1-y_{N-1}} == {n_1,... n_Y}.
These n_i clearly sum to one. It is easy to prove uniformity by considering the probability for a given realization of the n_i.

How to have normalize data around the average for the column in MATLAB?

I am trying to take a matrix and normalize the values in each cell around the average for that column. By normalize I mean subtract the value in each cell from the mean value in that column i.e. subtract the mean for Column1 from the values in Column1...subtract mean for ColumnN from the values in ColumnN. I am looking for script in Matlab. Thanks!
You could use the function mean to get the mean of each column, then the function bsxfun to subtract that from each column:
M = bsxfun(#minus, M, mean(M, 1));
Additionally, starting in version R2016b, you can take advantage of the fact that MATLAB will perform implicit expansion of operands to the correct size for the arithmetic operation. This means you can simply do this:
M = M-mean(M, 1);
Try the mean function for starters. Passing a matrix to it will result in all the columns being averaged and returns a row vector.
Next, you need to subtract off the mean. To do that, the matrices must be the same size, so use repmat on your mean row vector.
a=rand(10);
abar=mean(a);
abar=repmat(abar,size(a,1),1);
anorm=a-abar;
or the one-liner:
anorm=a-repmat(mean(a),size(a,1),1);
% Assuming your matrix is in A
m = mean(A);
A_norm = A - repmat(m,size(A,1),1)
As has been pointed out, you'll want the mean function, which when called without any additional arguments gives the mean of each column in the input. A slight complication then comes up because you can't simply subtract the mean -- its dimensions are different from the original matrix.
So try this:
a = magic(4)
b = a - repmat(mean(a),[size(a,1) 1]) % subtract columnwise mean from elements in a
repmat replicates the mean to match the data dimensions.