Replace NaN with zeros in cell array in MATLAB - matlab

I have a 50x25 cell array in the variable raw_data. Each cell contains a 200x150 matrix. I have a few NaN values scattered between all those values and I want to set them to zeros to make sure they do not interfere at later stages.
I have tried the following:
raw_data(cellfun(@(x) any(isnan(x), raw_data, 'UniformOutput', false)) = 0
When running the script, I get "Function 'subindex' is not defined for values of class 'cell'". Can anyone help me, please?
Thanks in advance!

How about this:
cellfun(@(x) nansum(x,ndims(x)+1), raw_data, 'UniformOutput', false)
Note if you're certain you'll only have 2D matrices in raw_data you can replace the ndims(x)+1 with 3.
The idea is to use nansum to sum along the 3rd dimension as this will preserve the shape of the first 2 dimensions and luckily nansum seems to convert NaN to 0 when all the elements being summed are NaN
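A more direct alternative is a plain loop that overwrites the NaN entries of each matrix using logical indexing. This is only a sketch: the small raw_data below is a made-up stand-in for the real 50x25 cell array. As far as I can tell, the original attempt fails because cellfun with 'UniformOutput', false returns a cell array, and a cell array cannot be used as an index.

```matlab
% Made-up stand-in for raw_data (the real one is 50x25 with 200x150 matrices)
raw_data = {[1 NaN; 3 4], [NaN NaN 5]};

for k = 1:numel(raw_data)
    m = raw_data{k};      % pull the matrix out of its cell
    m(isnan(m)) = 0;      % logical indexing: overwrite only the NaN entries
    raw_data{k} = m;      % store the cleaned matrix back
end
```

The loop does not depend on the shape of the cell array or of the matrices inside it, since numel and logical indexing work on arrays of any dimension.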

Related

Generating an empty matrix of known dimensions

I want to generate a matrix of known dimensions in Octave. The problem is that I do not want to initialize the matrix with zeros. The matrix will only contain 0 or 1 but the elements (cells) which do not get allocated any value, must remain blank. I plan to use such a matrix in a 'Collaborative Filtering' algorithm.
I am new to both Octave and 'Collaborative Filtering' algorithms. I have tried to look for a solution on the net, but to no avail. Searching for "empty matrix" only turns up arrays with zero dimensions or char matrices with " " as values.
A numeric array cannot hold empty values. Typically in this case, people will use NaN as a placeholder value.
%// Initialize a 3D matrix of NaN values
data = nan(2, 3, 4);
size(data)
%// 2 3 4
It is then easy to differentiate a place holder value from real data. You can detect them using isnan.
The only way to create an array of empty values (and it is highly discouraged due to the performance hit) is to use cell arrays.
data = cell(2, 3, 4);
The problem is that I do not want to initialize the matrix with zeros. The matrix will only contain 0 or 1 but the elements (cells) which do not get allocated any value, must remain blank.
You are wrong. You may think the matrix only contains values 0 or 1, but actually each element has a value of 0, 1, or unset. You can't have a blank value; you always need some value. Or at least not blank the way you are thinking. Taking it to a very low level, all bits need to have a value (0 or 1); they can't be blank. Therefore, if you want a blank value you need to interpret some value as blank.
Your data will then have 3 states: true, false, and blank. You will therefore need at least 2 bits per point (note that even logical/bool data types, which only need 1 bit, actually take up 8 bits (1 byte)).
Using NaN
This may look like the simplest solution but it's actually pretty bad. It will be a huge waste of memory (and you will have very large matrices if you're doing collaborative filtering).
The reason is that if you use NaN, your data needs to be of type single or double. That's at least 32 or 64 bits respectively. Remember that you only actually need 2 bits. Of course, you could make your own data type that does have a NaN value.
octave> vals = NaN (3, 3) # 3x3 matrix of type double (default)
vals =
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
octave> vals = NaN (3, 3, "single") # 3x3 matrix of type single
vals =
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
Using a cell matrix
A cell array is a data type where each cell can be any Octave value. This includes another cell array, a matrix of any dimensions, or even an empty array. You could use an empty array as blank, but this will be terribly inefficient, both for memory and speed, and you won't be able to use most functions since they will work on numeric arrays, not cell arrays.
octave> vals = cell (3, 3); # create 3x3 cell matrix
octave> vals{2,3} = true; # set value
octave> vals{2,3} = false; # set value
octave> vals{2,3} = []; # unset value
octave> cumsum (vals)
error: cumsum: wrong type argument 'cell'
octave> nnz (vals)
error: nnz: wrong type argument 'cell array'
octave> find (vals)
error: find: wrong type argument 'cell'
Using 8 bit integer
This is what I see being used most often. Using signed 8 bit, you can use 0 for blank, -1 for false, and 1 for true (or whatever makes the most sense for you).
octave> vals = zeros (3, 3, "int8");
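As a rough sketch of how this encoding might be used (the particular assignments and variable names below are made up for illustration), the usual numeric functions keep working because the data is still an ordinary numeric array:

```matlab
% Encoding assumed here: 0 = blank, -1 = false, 1 = true
vals = zeros(3, 3, 'int8');   % every element starts out blank

vals(2, 3) = 1;               % mark one element as true
vals(1, 1) = -1;              % mark one element as false

n_set   = nnz(vals);          % number of elements that have been set
n_true  = nnz(vals == 1);     % number of true elements
n_blank = nnz(vals == 0);     % number of elements still blank
```

Assigning a double such as 1 or -1 into an int8 array keeps the array's int8 class, so the storage cost stays at one byte per element.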
Using a separate matrix to track blank values
If you really want to have a matrix of 0 and 1, then you need a separate matrix to keep track of which values have been set. In that case, both matrices can be of type logical, each taking up 8 bits per data point, for a total of 16 bits per data point. It also has the problem that you need to keep the two matrices in sync.
octave> vals = false (3, 3);
octave> set_vals = false (size (vals));
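A minimal sketch of the two-matrix approach, assuming every write updates both matrices together (the assignments below are made up for illustration):

```matlab
vals     = false(3, 3);          % the 0/1 data
set_vals = false(size(vals));    % true where a value has been assigned

% Setting a value means touching both matrices
vals(2, 3) = true;   set_vals(2, 3) = true;
vals(1, 1) = false;  set_vals(1, 1) = true;

n_true  = nnz(vals & set_vals);   % elements explicitly set to true
n_false = nnz(~vals & set_vals);  % elements explicitly set to false
n_blank = nnz(~set_vals);         % elements never set
```

Forgetting to update set_vals on a write is exactly the kind of sync bug mentioned above, which is one argument for hiding both matrices behind a class.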
Making your own class
Using either the new classdef (requires Octave 4.0.0) or the old @class type, you can encapsulate any of the strategies above (I would personally use an 8 bit integer) in its own class. If you use a signed 8 bit integer, this moves the logic of knowing which value (-1 or 0) means blank into the class. Or if you prefer to use a separate matrix for blank values, then move the logic of keeping the values in sync to a setter method.

How can I get the values of non-NAN elements in a Matrix?

I have a huge matrix for which I need the row, column and values of non-NAN elements.
This works when I have zero (instead of NAN) and non-zero elements:
[rwpRow, rwpCol, rwpVal] = find( zerotest )
But when I do this for NAN matrix, I get all 1 values.
[rwpRow, rwpCol, rwpVal] = find(~isnan(nantest))
How can I do this?
The input to find is a logical array which is 1 for all non-nan elements. That 1 is what you get and find does not "see" the actual values. You have to split that up into separate calls:
select=~isnan(nantest)
[rwpRow, rwpCol] = find(select)
rwpVal=nantest(select)
You can only get the rows and columns from the call to find since it finds the 1s in ~isnan(nantest). Get all the non NaN values in the matrix in another step:
[rwpRow, rwpCol] = find(~isnan(nantest));
vals = nantest(~isnan(nantest));
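As a small worked example (the 2x2 matrix is made up for illustration), find and logical indexing both walk the matrix in column-major order, so the rows, columns and values line up element by element:

```matlab
nantest = [NaN 5; 7 NaN];

select = ~isnan(nantest);         % logical mask of the non-NaN entries
[rwpRow, rwpCol] = find(select);  % positions, in column-major order
rwpVal = nantest(select);         % values, in the same column-major order

% rwpRow is [2; 1], rwpCol is [1; 2], rwpVal is [7; 5]
```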

Trouble with outputting a mean for each element in a cell array (MATLAB)

I have a 1x28 cell array called magV with each element containing a 246x247 matrix containing mostly NaNs.
I am trying to set up a for loop to go through each of these matrices and calculate a mean. The attempt so far:
mean_speeds = cell(1,28);
for x = 1 : 28
mean_speeds{x} = mean(magV{x});
end
This doesn't work; it just outputs another 1x28 cell array, with each element containing a 1x28 row of NaNs
What am I doing wrong?
Mean of anything containing NaN is a NaN. Remove . . .
m = magV{x};
mean(m(~isnan(m)));
The mean function propagates NaN: if any input value is NaN, the result is NaN. You can add a logic step to remove the invalid numbers and then calculate the mean of the resulting array.
Or, you can use nanmean: see the nanmean Help Page
You can use cellfun to get rid of the loops.
If you want to ignore NaNs
noNaN = cellfun(@(x) mean(x(~isnan(x))), magV, 'uni', 0);
If you want to treat them as zeros
zeroNaN = cellfun(@(x) sum(x(~isnan(x)))/numel(x), magV, 'uni', 0);
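Since each matrix reduces to a single number, the 'uni', 0 option can be dropped so that cellfun returns an ordinary numeric row vector instead of a cell array. A small sketch with a made-up two-element magV in place of the real 1x28 one:

```matlab
% Made-up stand-in for magV (the real one holds 28 matrices of size 246x247)
magV = {[1 NaN; 3 NaN], [NaN 4; NaN 8]};

% One mean per matrix, ignoring NaNs
noNaN = cellfun(@(m) mean(m(~isnan(m))), magV);

% One mean per matrix, counting NaNs as zeros
zeroNaN = cellfun(@(m) sum(m(~isnan(m))) / numel(m), magV);

% noNaN is [2 6], zeroNaN is [1 3]
```

Indexing with ~isnan(m) flattens the matrix into a vector, which is why this gives one overall mean per matrix rather than one mean per column as in the original loop.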

Transform a matrix 18x6692 to a matrix 1x120456 matlab

I have a char matrix (in MATLAB) of size 18x6692 and I want to turn it into a matrix with 1 row and 6692x18 = 120456 columns.
I'm not able to do this, can you help me?
I also tried with a smaller matrix: from 2x4 to 1x8 with no results.
thank you
Simply use the colon operator and transpose the vector:
A = A(:).'
You can use the reshape function:
B = reshape(A,1,[]);
where A is the input matrix, 1 is the number of rows and [] is to indicate that the number of columns is to be calculated from the number of elements in A.
Note that this stacks all columns of A. If you want to concatenate along the rows, you can do this by transposing A first
B = reshape(A.',1,[]);
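The difference between the two orderings is easy to see on a small char matrix, like the 2x4 case mentioned in the question (the contents of A are made up for illustration):

```matlab
A = ['abcd'; 'efgh'];            % 2x4 char matrix

byCols = A(:).';                 % stacks the columns: 'aebfcgdh'
alsoByCols = reshape(A, 1, []);  % same result as A(:).'
byRows = reshape(A.', 1, []);    % concatenates the rows: 'abcdefgh'
```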

MATLAB Populate matrix with elements from a cell array

I have a matrix and I want to put into the third column of the matrix, elements from a cell array. How can I do this?
Here is an example of what I mean.
This is the matrix (E):
43.4350000000000 -88.5277780000000 NaN 733144
43.4350000000000 -88.5277780000000 NaN 733146
43.4350000000000 -88.5277780000000 NaN 733148
43.4350000000000 -88.5277780000000 NaN 733150
I want to take the NaN column (column 3) and put into it, the elements of a cell array (uID)
The cell array looks like this:
'027-0007'
'079-0026'
'119-8001'
'133-0027'
I used this code:
E(:,3) = reshape(repmat(uID',length(all_dates),1),[],1)
to replicate each line of uID a certain number of times and then reshape it into a column so that it's the same size as a column of E.
However, when I run it now, the fact that E is a matrix and uID is a cell causes MATLAB to tell me that "Conversion to double from cell is not possible". The part to the right of the = works fine. It's placing the cell elements into E that's causing the problem.
Instead of inserting the data into a normal matrix, you can insert it into another cell
Ecell=num2cell(E);
Ecell(:,3)=uID;
The contents of your cell array are not numeric and therefore cannot be inserted into a numeric matrix. You can use str2double to convert cell arrays of strings to numeric arrays, as in the following
>> str2double({'3','17.5'})
ans =
3.0000 17.5000
but that's only when the string contents of the cell represent actual numbers, which doesn't seem to be true in your case.