Column-wise average over unequal number of values in matrix

I am looking for an easy way to obtain the column-wise average of a subset of values in a matrix (indexed by a logical matrix), preferably without having to use a loop. The problem that I am experiencing is that because the number of values in each column is different, matlab collapses the values-matrix and its column-wise mean becomes the total mean (of the entire matrix). Is there a specific function or easy workaround for this problem? See below for an example.
%% define value matrix and logical indexing matrix
values=[1 2 3 4; 5 6 7 8; 9 10 11 12];
indices=[1 0 0 1; 0 1 0 1; 1 1 0 0];
indices=indices==1; %convert to logical
%% calculate column-wise average
mean(values(indices),1)

accumarray-based approach
Use the column index as a grouping variable for accumarray:
[~, col] = find(indices);
result = accumarray(col(:), values(indices), [size(values,2) 1], @mean, NaN).';
Note that:
The second line uses (:) to force the first input to be a column vector. This is needed because the first line may produce col as a row or column vector, depending on the size of indices.
NaN is used as fill value. This is specified as the fifth input to accumarray.
The third input to accumarray defines output size. It is necessary to specify it explicitly (as opposed to letting accumarray figure it out) in case the last columns of indices only contain false.
Hacky approach
Multiply and divide element-wise by indices. That will turn unwanted entries into NaN, which can then be ignored by mean using the 'omitnan' option:
result = mean(values.*indices./indices, 1, 'omitnan');
Manual approach
Multiply element-wise by indices, sum each column, and divide element-wise by the sum of each column of indices:
result = sum(values.*indices, 1) ./ sum(indices, 1);
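As a quick check, the three approaches above can be run side by side on the example from the question. This is only a minimal sketch: the names r_accum, r_hacky and r_manual are made up for illustration, and it assumes a MATLAB release whose mean accepts the 'omitnan' flag. Column 3 has no selected entries, so each approach should return NaN there.

```matlab
values  = [1 2 3 4; 5 6 7 8; 9 10 11 12];
indices = logical([1 0 0 1; 0 1 0 1; 1 1 0 0]);

% accumarray: group the selected values by their column index
[~, col] = find(indices);
r_accum = accumarray(col(:), values(indices), [size(values,2) 1], @mean, NaN).';

% hacky: 0/0 turns unselected entries into NaN, which 'omitnan' skips
r_hacky = mean(values.*indices./indices, 1, 'omitnan');

% manual: column sum of selected values over column count of selected values
r_manual = sum(values.*indices, 1) ./ sum(indices, 1);

disp([r_accum; r_hacky; r_manual])   % each row: 5 8 NaN 6
```

Note that the hacky and manual approaches multiply by zero, so an Inf or NaN in an unselected position of values would still propagate into the result. The accumarray approach never touches unselected entries.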

Related

Reshape a 3D matrix to 4D matrix in MATLAB [duplicate]

I want to reshape the pixel intensities into an imsize-by-1 column vector.
Imvect = reshape(I,imsize,1);
But why does this error occur?
Error using reshape
To RESHAPE the number of elements must not change.

Let's start with the syntax used in the documentation:
B = reshape(A,sz1,...,szN)
What reshape does is take the matrix A, straighten it out, and give it a new size determined by the 2nd, 3rd, ..., Nth arguments. For this to be possible, the input matrix must have the same number of elements as the output matrix. You can't turn a 1x5 vector into a 2x3 matrix, as one element would be missing. The number of elements in the output matrix is equal to the product of sz1, sz2, ..., szN. Now, if you know you want N rows, but don't know exactly how many columns you need, you can use the [] syntax, which tells MATLAB to use as many columns as necessary to make the number of elements match.
So reshape(A, 2, [], 3) will become a 2xNx3 matrix, where, for a matrix with 24 elements, N will be 4.
In your case the element counts do not match: numel(I) ~= imsize. If mod(numel(I), imsize) ~= 0, then your imsize is definitely incorrect. However, if mod(numel(I), imsize) == 0, then what you may actually want is imsize rows and however many columns make that possible. If it's the latter, then this should work:
Imvect = reshape(I,imsize, []);
If you simply want to turn your matrix I into a vector of size (numel(I), 1), then you should use the colon operator :, as such:
Imvect = I(:);
An alternative, if you really want to use reshape, is to specify that you want a single column, and let MATLAB select the number of rows, as such:
Imvect = reshape(I, [], 1);
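To make the options above concrete, here is a small sketch. It assumes a hypothetical 4x5 RGB image (the question does not say what I actually is), so numel(I) is 60 while imsize, taken here as rows times columns, is only 20. The names v1, v2, P and B are made up for illustration.

```matlab
I = rand(4, 5, 3);                   % hypothetical image: 4 rows, 5 columns, 3 channels
imsize = size(I,1) * size(I,2);      % 20, but numel(I) is 60

% reshape(I, imsize, 1) would raise the error from the question, since 20 ~= 60

v1 = I(:);                           % 60x1 column vector
v2 = reshape(I, [], 1);              % same result, row count inferred by MATLAB
P  = reshape(I, imsize, []);         % 20x3: one column per colour channel

B  = reshape(1:24, 2, [], 3);        % 24 elements, so the middle dimension becomes 4

disp(size(v1)); disp(size(P)); disp(size(B));
```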

Get the non-zero minimum of a column and its index

I want to find the minimum value of a column of a matrix of non-negative integers, excluding 0. I know the matrix is square and only has zeros on every element of its main diagonal (i.e. a(i,i)=0 for all i).
I have tried this:
[best_cost,index] = min(star_costs([1:i-1,i+1:nbr],i));
Where nbr is the size of my matrix.
However, the index that is returned is the index excluding the zero, not taking into account the ith element. For example, my first column is:
[0 9 11 5 18 13 14]'
so the code returns best_cost=5 and index=3, because the 0 element is excluded. However, I would like to get index=4, the position in the full column.
Of course, just adding 1 does not make sense: this can happen for any column and, except for the first column, the minimum could be above or below the diagonal.

Replace zeros with inf and then use min.
A(1:size(A,1)+1:end) = inf; %If the diagonal is to be excluded
%if all zeros are to be excluded including non-diagonal elements, use this instead:
%A(A==0) = inf; %Use tolerance if you have floating point numbers
[best_cost, index] = min(A);
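A minimal sketch of this on a small matrix follows. The 4x4 matrix is made up for illustration (its first column reuses the first values from the question), and it works on a copy B so that the original A keeps its zeros.

```matlab
A = [ 0  9 11  5;
      9  0  4  7;
     11  4  0  2;
      5  7  2  0];

B = A;                          % copy, so A itself is left unchanged
B(1:size(B,1)+1:end) = inf;     % linear indices of the main diagonal
[best_cost, index] = min(B);    % column-wise minimum and its row in the full column
% best_cost = [5 4 2 2], index = [4 3 4 3]
```

Because no rows are removed before calling min, the returned indices refer to the full column, which is what the question asks for.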

As suggested in the comment I would try a work around changing the diagonal to the maximum value of the matrix, assuming that only the zeros on the diagonal are to be omitted.
%create random matrix
A = magic(4)
%change diagonal to the maximum
A(logical(eye(size(A)))) = max(A(:));
And now you can apply your search for the minimum
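Continuing the snippet above, the search itself could look like the sketch below. One caveat worth stating: if every off-diagonal entry of a column happens to equal the matrix maximum, the diagonal ties with them and min returns the first occurrence, which may be the diagonal entry. Using inf instead of the maximum avoids that.

```matlab
A = magic(4);
A(logical(eye(size(A)))) = max(A(:));   % diagonal entries become 16
[best_cost, index] = min(A);            % index refers to rows of the full matrix
% best_cost = [4 2 3 8], index = [4 1 1 2]
```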

Checking equality of row elements in Matlab?

I have a matrix A in Matlab of dimension mxn. I want to construct a vector B of dimension mx1 such that B(i)=1 if all elements of A(i,:) are equal and 0 otherwise. Any suggestion? E.g.
A=[1 2 3; 9 9 9; 2 2 2; 1 1 4]
B=[0;1;1;0]

One way with diff -
B = all(diff(A,[],2)==0,2)
Or With bsxfun -
B = all(bsxfun(@eq,A,A(:,1)),2)

Here's another example that's a bit more obfuscated, but also does the job:
B = sum(histc(A,unique(A),2) ~= 0, 2) == 1;
So how does this work? histc counts the frequency or occurrence of numbers in a dataset. What's cool about histc is that we can compute the frequency along a dimension independently, so what we can do is calculate the frequency of values along each row of the matrix A separately. The first parameter to histc is the matrix you want to compute the frequency of values of. The second parameter denotes the edges, or which values you are looking at in your matrix that you want to compute the frequencies of. We can specify all possible values by using unique on the entire matrix. The next parameter is the dimension we want to operate on, and I want to work along all of the columns so 2 is specified.
The result from histc is an M x N matrix, where M is the number of rows in A and N is the number of unique values in A. If a row of A contains all equal values, all of its values fall into a single bin, so the corresponding row of the result has exactly one non-zero entry while the rest are zero. As such, we determine which entries of this matrix are non-zero, sum across the columns of each row, and check whether each row has a sum of 1. If it does, then that row of A qualifies as having all of the same values.
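To see the intermediate result, here is the same computation split into steps on the example matrix from the question. The names edges and counts are introduced only for illustration. Note that recent MATLAB documentation lists histc as not recommended (histcounts is suggested instead), so treat this as a sketch of the idea rather than a recommended implementation.

```matlab
A = [1 2 3; 9 9 9; 2 2 2; 1 1 4];

edges  = unique(A);              % [1; 2; 3; 4; 9]
counts = histc(A, edges, 2);     % one row of bin counts per row of A
% counts =
%    1 1 1 0 0
%    0 0 0 0 3
%    0 3 0 0 0
%    2 0 0 1 0

B = sum(counts ~= 0, 2) == 1;    % [0; 1; 1; 0]
```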
Certainly not as efficient as Divakar's diff and bsxfun method, but an alternative since he took the two methods I would have used :P

Some more alternatives:
B = var(A,[],2)==0;
B = max(A,[],2)==min(A,[],2)
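A short sketch comparing the diff, bsxfun, var and max/min variants on the example matrix from the question. The B_* names are made up for illustration.

```matlab
A = [1 2 3; 9 9 9; 2 2 2; 1 1 4];

B_diff   = all(diff(A,[],2) == 0, 2);          % neighbouring elements all equal
B_bsxfun = all(bsxfun(@eq, A, A(:,1)), 2);     % every element equals the first one
B_var    = var(A,[],2) == 0;                   % zero variance along the row
B_minmax = max(A,[],2) == min(A,[],2);         % largest equals smallest

disp([B_diff B_bsxfun B_var B_minmax])         % every column is [0; 1; 1; 0]
```

For non-integer data the var variant relies on the variance being exactly zero, which floating-point rounding may not guarantee. The comparison-based variants do not have that problem.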

Find mean of non-zero elements

I am assuming that the mean function takes a matrix and calculates its mean by summing all elements of the array and dividing by the total number of elements.
I am using this function to calculate the mean of my matrix, but I have reached a point where I don't want mean to consider the 0 elements. Specifically, my matrix is a 1x100000 array, and maybe 1/3 to 1/2 of its elements are 0. In that case, can I replace the 0 elements with NULL so that MATLAB doesn't consider them when calculating the mean? What else can I do?

Short version:
Use nonzeros:
mean( nonzeros(M) );
A longer answer:
If you are working with an array of 100K entries, a significant share of which are 0, you might consider a sparse representation. It might also be worth storing it as a column vector rather than a row vector.
sM = sparse(M(:)); %// sparse column
mean( nonzeros(sM) ); %// mean of only non-zeros
mean( sM ); %// mean including zeros

As you were asking "What else can I do?", here comes another approach, which does not depend on the statistics Toolbox or any other Toolbox.
You can compute the mean yourself by summing up the values and dividing by the number of nonzero elements (nnz()). Since summing up zeros does not affect the sum, this will give the desired result. For a 1-dimensional case, as you seem to have it, this can be done as follows:
% // 1 dimensional case
M = [1, 1, 0 4];
sum(M)/nnz(M) % // 6/3 = 2
For a 2-dimensional case (or n-dimensional case) you have to specify the dimension along which the summation should happen
% // 2-dimensional case (or n-dimensional)
M = [1, 1, 0, 4
2, 2, 4, 0
0, 0, 0, 1];
% // column means of nonzero elements
mean_col = sum(M, 1)./sum(M~=0, 1) % // [1.5, 1.5, 4, 2.5]
% // row means of nonzero elements
mean_row = sum(M, 2)./sum(M~=0, 2) % // [2; 2.667; 1.0]

To find the mean of only the non-zero elements, use logical indexing to extract the non-zero elements and then call mean on those:
mean(M(M~=0))
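The approaches from the answers above should agree. Here is a quick sketch on the small vector used earlier, with the names m1, m2 and m3 made up for illustration:

```matlab
M = [1 1 0 4];

m1 = mean(nonzeros(M));     % nonzeros returns the non-zero entries as a column
m2 = sum(M) / nnz(M);       % zeros add nothing to the sum
m3 = mean(M(M ~= 0));       % logical indexing

disp([m1 m2 m3])            % 2 2 2
```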

How to calculate the number of occurrences of each element in the 100000 vectors using Matlab? [duplicate]

Given 100000 vectors, each containing 40 different numbers between 1 and 100, how can I calculate the number of occurrences of each element across all 100000 vectors?
example:
A = [2 5 6 9]; B = [3 6 9 1]
If the numbers are between 1 and 10, the result should be (values first, then counts): [1 2 3 4 5 6 7 8 9 10, 1 1 1 0 1 2 0 0 2 0]

It seems like you want to compute the histogram of all values.
Use hist command for that
n = hist( A(:), 1:100 );

Assuming that you have a variable A that stores all of these vectors (like in Shai's assumption), another alternative to hist is to use accumarray. This should automatically figure out the right amount of bins you have without specifying them like in hist. Try:
n = accumarray(A(:), 1);

You can also use the sparse function to do the counting:
% 100000x1 vector of integers in the range [1,100]
A = randi([1 100], [100000 1]);
% 100x1 array of counts
n = full(sparse(A, 1, 1, 100, 1));
As others have shown, this should give the same result as:
n = histc(A, 1:100);
or:
n = accumarray(A, 1, [100 1]);
(Note that I explicitly specify the size in the sparse and accumarray calls. Otherwise, if the values in a particular vector A didn't go all the way up to 100, the counts array n would be shorter than 100 in length.)
All three methods are in fact mentioned in the tips section of the accumarray doc page; accumarray is the most flexible of the three.
The behavior of accumarray is similar to that of the histc function.
Both functions group data into bins.
histc groups continuous values into a 1-D range using bin edges.
accumarray groups data using n-dimensional subscripts.
histc returns the bin counts using @sum.
accumarray can apply any function to the bins.
You can mimic the behavior of histc using accumarray with val = 1.
The sparse function also has accumulation behavior similar to that of accumarray.
sparse groups data into bins using 2-D subscripts, whereas accumarray groups data into bins using n-dimensional subscripts.
sparse adds elements that have identical subscripts into the output. accumarray adds elements that have identical subscripts into the output by default, but can optionally apply any function to the bins.
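As a small sanity check, the three counting methods can be applied to the two example vectors from the question. This is a sketch that assumes all vectors are first pooled into one column x (a name introduced here); the n_* names are likewise made up.

```matlab
A = [2 5 6 9];  B = [3 6 9 1];
x = [A B].';                                 % pool all values into one column

n_hist   = histc(x, 1:10).';
n_accum  = accumarray(x, 1, [10 1]).';
n_sparse = full(sparse(x, 1, 1, 10, 1)).';
% each is [1 1 1 0 1 2 0 0 2 0]

n_short = accumarray(x, 1).';                % no size given: length is max(x) = 9
disp(numel(n_short))
```

The last line illustrates why the output size is passed explicitly: without it, accumarray stops at the largest value present, and the count for 10 is missing.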
