Count the number of non-NaN values in each row of a 2D array - matlab

I have a matrix like this:
A = [1, 2, 3, 4, 5, NaN, NaN, NaN, NaN, NaN;
1, 2, 3, 4, 5, 6, 7, NaN, NaN, NaN;
1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
I would like to know how I can count the number of values in each row excluding any NaNs.
So I would get an output like:
output = [5;
7;
10;]

If A is a 2D array, e.g.
A = [1, 2, 3, 4, 5, NaN, NaN, NaN, NaN, NaN;
1, 2, 3, 4, 5, 6, 7, NaN, NaN, NaN;
1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
and you want to count the number of non-NaN entries on each row of A, you can simply use
>> sum(~isnan(A), 2)
ans =
5
7
10
Breakdown
isnan(A) returns a logical array of the same size as A, in which (logical) 1 indicates a NaN and 0 a non-NaN.
Note that you have to use the isnan function here. In particular, the expression A == NaN is useless: it would simply return a logical array of the same size as A but full of (logical) 0's. Why? Because, according to floating-point arithmetic, NaN == NaN always returns "false" (i.e. logical 0, in MATLAB).
Then, by applying MATLAB's not operator (~) to that, you get a logical array of the same size as A, in which 1 indicates a non-NaN and 0 a NaN.
Finally, sum(~isnan(A), 2) returns a column vector in which the i-th entry corresponds to the number of logical 1's on the i-th row of ~isnan(A).
The resulting column vector is exactly what you want: a count, row by row, of the non-NaN entries in A.
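For readers who want to see the same logic outside MATLAB, here is a minimal sketch in plain Python (standard library only). It is only an illustration of the breakdown above, not a replacement for the MATLAB one-liner; the names equal_to_nan and counts are made up for this example.

```python
import math

nan = math.nan

# Same data as the matrix A above, stored as a list of rows.
A = [
    [1, 2, 3, 4, 5, nan, nan, nan, nan, nan],
    [1, 2, 3, 4, 5, 6, 7, nan, nan, nan],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
]

# Comparing with == never detects a NaN, because NaN is not equal
# to anything, itself included. Every entry here is False.
equal_to_nan = [[x == nan for x in row] for row in A]

# Counting the entries for which isnan is False gives the per-row
# number of non-NaN values (the analogue of sum(~isnan(A), 2)).
counts = [sum(not math.isnan(x) for x in row) for row in A]

print(any(any(row) for row in equal_to_nan))  # False
print(counts)                                 # [5, 7, 10]
```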

Related

Why can SVD predict the score?

Why can SVD predict the score? I have a matrix A, and I already know the actual value in the second row, fourth column of A:
A = array([[5, 5, 3, 0, 5, 5],
[5, 0, 4, 0, 4, 4],
[0, 3, 0, 5, 4, 5],
[5, 4, 3, 3, 5, 5]]
)
The result of the matrix decomposition looks like this, but the entry in its second row, fourth column is -0.6417. Can this value also be used as the prediction result?
[[ 5.28849366 3.27680993 3.53241833 1.14752376 5.07268712 5.10856603]
[ 5.16272816 1.90208542 3.54790449 -0.64171367 3.6639954 3.40187912]
[ 0.21491233 3.74001967 -0.13316888 4.94723591 3.78868964 4.61660489]
[ 4.45908022 3.80580974 2.8984041 2.38455041 5.31300379 5.58222367]]

Find the largest index of the minimum in Matlab

I have an array of positive numbers and there are some duplicates. I want to find the largest index of the minimum value.
For example, if a=[2, 3, 1, 1, 4, 1, 3, 2, 1, 5, 5] then [v, i] = min(a) returns i=3, however I want i=9.
Using find and min.
A = [2, 3, 1, 1, 4, 1, 3, 2, 1, 5, 5];
minA = min(A);
maxIndex = max(find(A==minA));
min gets the minimum value, and find returns the indices of the values that meet the condition A==minA. max then returns the largest of those indices.
Here's a different idea, which only requires one function, sort:
[~,y] = sort(a,'descend');
i = y(end)
i =
9
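You can also use imregionalmin (Image Processing Toolbox), which runs in O(n) time. Note that it marks regional (local) minima rather than the global minimum, so it gives the right index only when the last local minimum is also the global one, as it is in this example: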
largestMinIndex = find(imregionalmin(A),1,'last');
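The answers above all amount to "find the minimum, then take its last position". As a rough illustration of that idea in a single pass, here is a plain Python sketch; last_index_of_min is a hypothetical helper name, and it returns a 1-based index to match MATLAB's convention.

```python
def last_index_of_min(values):
    """Return the 1-based index of the last occurrence of the minimum.

    Returns 0 for an empty input (an arbitrary choice for this sketch).
    """
    best_index = 0
    best_value = None
    for index, value in enumerate(values, start=1):
        # Using <= (not <) makes later ties overwrite earlier ones,
        # so the last position of the minimum wins.
        if best_value is None or value <= best_value:
            best_value = value
            best_index = index
    return best_index


a = [2, 3, 1, 1, 4, 1, 3, 2, 1, 5, 5]
print(last_index_of_min(a))  # 9
```

Unlike a regional-minimum search, this compares against the global running minimum, so an input such as [1, 3, 2] yields 1 rather than 3.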

Accessing sparse matrix elements

I have a very large sparse matrix of the type 'scipy.sparse.coo.coo_matrix'. I can convert to csr with .tocsr(), however .todense() will not work since the array is too large. I want to be able to extract elements from the matrix as I would do with a regular array, so that I may pass row elements to a function.
For reference, when printed, the matrix looks as follows:
(7, 0) 0.531519363001
(48, 24) 0.400946334437
(70, 6) 0.684460955022
...
Make a matrix with 3 elements:
In [550]: M = sparse.coo_matrix(([.5,.4,.6],([0,1,2],[0,5,3])), shape=(5,7))
Its default display (repr(M)):
In [551]: M
Out[551]:
<5x7 sparse matrix of type '<class 'numpy.float64'>'
with 3 stored elements in COOrdinate format>
and its print display (str(M)), which looks like the input:
In [552]: print(M)
(0, 0) 0.5
(1, 5) 0.4
(2, 3) 0.6
Convert to csr format:
In [553]: Mc=M.tocsr()
In [554]: Mc[1,:] # row 1 is another matrix (1 row):
Out[554]:
<1x7 sparse matrix of type '<class 'numpy.float64'>'
with 1 stored elements in Compressed Sparse Row format>
In [555]: Mc[1,:].A # that row as 2d array
Out[555]: array([[ 0. , 0. , 0. , 0. , 0. , 0.4, 0. ]])
In [556]: print(Mc[1,:]) # like 2nd element of M except for row number
(0, 5) 0.4
Individual element:
In [560]: Mc[1,5]
Out[560]: 0.40000000000000002
The data attributes of these formats (if you want to dig further):
In [562]: Mc.data
Out[562]: array([ 0.5, 0.4, 0.6])
In [563]: Mc.indices
Out[563]: array([0, 5, 3], dtype=int32)
In [564]: Mc.indptr
Out[564]: array([0, 1, 2, 3, 3, 3], dtype=int32)
In [565]: M.data
Out[565]: array([ 0.5, 0.4, 0.6])
In [566]: M.col
Out[566]: array([0, 5, 3], dtype=int32)
In [567]: M.row
Out[567]: array([0, 1, 2], dtype=int32)
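To make the csr attributes above less opaque, here is a small sketch in plain Python (no scipy needed) of how a row can be read from data, indices and indptr. The three lists are copied from the Mc outputs above; csr_row and csr_row_dense are hypothetical helper names for this illustration, not scipy functions.

```python
# Values copied from the Mc.data, Mc.indices and Mc.indptr outputs above.
data = [0.5, 0.4, 0.6]
indices = [0, 5, 3]
indptr = [0, 1, 2, 3, 3, 3]
n_cols = 7  # from shape=(5, 7)


def csr_row(i):
    """Return the stored entries of row i as (column, value) pairs."""
    # indptr[i]:indptr[i + 1] is the slice of data/indices owned by row i.
    start, stop = indptr[i], indptr[i + 1]
    return list(zip(indices[start:stop], data[start:stop]))


def csr_row_dense(i):
    """Return row i as a dense list, filling missing entries with 0.0."""
    row = [0.0] * n_cols
    for col, value in csr_row(i):
        row[col] = value
    return row


print(csr_row(1))        # [(5, 0.4)]
print(csr_row_dense(1))  # [0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.0]
print(csr_row(4))        # [] (rows 3 and 4 store nothing)
```

This is why row access is cheap in csr but not in coo: each row's entries sit in one contiguous slice.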

How to access elements of a matrix based on values of a vector

So say I have the below matrix
[1, 2, 3,
4, 5, 6,
7, 8, 9]
And I have a vector [1,3]
I want to access the 1st and 3rd row which would return
[1,2,3
7,8,9]
I need to be able to scale this up to about 1000 rows being grabbed based on values in the vector.
If A is your matrix and v is your vector of row indices, you just have to do A(v,:). This scales to 1000 indices without any change.
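For comparison, the same row selection can be sketched in plain Python with nested lists. This is only an illustration; since Python lists are 0-based, the 1-based indices in v are shifted by one, and selected is a name made up for this example.

```python
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
v = [1, 3]  # 1-based row numbers, as in the question

# Pick the requested rows in the order given by v.
selected = [A[i - 1] for i in v]
print(selected)  # [[1, 2, 3], [7, 8, 9]]
```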

Aggregate 3rd dimension of a 3d array for the subscripts of the first dimension

I have a 3-dimensional array Val of dimension 4 x m x 2 (m can vary), stored as a cell array:
Val{1} = [1, 280; 2, 281; 3, 282; 4, 283; 5, 285];
Val{2} = [2, 179; 3, 180; 4, 181; 5, 182];
Val{3} = [2, 315; 4, 322; 5, 325];
Val{4} = [1, 95; 3, 97; 4, 99; 5, 101];
I have a subscript vector:
subs = {1,3,4};
What I want to get as output is the average of column 2 in the above 2D arrays (only 1, 3 and 4), restricted to the rows whose 1st-column value is >=2 and <=4.
The output will be:
{282, 318.5, 98}
This can probably be done by using a few loops, but just wondering if there is a more efficient way?
Here's a one-liner:
output = cellfun(@(x)mean(x(x(:,1)>=2 & x(:,1)<=4,2)),Val(cat(1,subs{:})),'UniformOutput',false);
If subs is a numerical array (not a cell array) instead, i.e. subs=[1,3,4], and if output doesn't have to be a cell array, but can be a numerical array instead, i.e. output = [282,318.5,98], then the above simplifies to
output = cellfun(@(x)mean(x(x(:,1)>=2 & x(:,1)<=4,2)),Val(subs));
cellfun applies a function to each element of a cell array, and the indexing makes sure only the good rows are being averaged.
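As a cross-check of the expected numbers, the same filter-then-average step can be sketched in plain Python. This is an illustrative translation, not the MATLAB answer itself: Val is written as a list of lists of (first, second) pairs, and subs holds 1-based positions as in the question.

```python
from statistics import mean

# Same data as Val{1}..Val{4} above, one (col1, col2) pair per row.
Val = [
    [(1, 280), (2, 281), (3, 282), (4, 283), (5, 285)],
    [(2, 179), (3, 180), (4, 181), (5, 182)],
    [(2, 315), (4, 322), (5, 325)],
    [(1, 95), (3, 97), (4, 99), (5, 101)],
]
subs = [1, 3, 4]  # 1-based positions into Val

# For each selected table, keep rows with 2 <= col1 <= 4 and
# average their second column.
output = [
    mean(second for first, second in Val[k - 1] if 2 <= first <= 4)
    for k in subs
]
print(output)  # values 282, 318.5 and 98, as in the question
```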