Create table from two column arrays - matlab

I have two column arrays with the same number of rows:
>> size(values)
ans =
12915 1
>> size(positions)
ans =
12915 1
values contains some NaN entries:
>> sum(isnan(values))
ans =
2500
while positions is filled with integer values:
>> sum(isnan(positions))
ans =
0
Some values in the two arrays:
values(randi(length(values), 10, 1))
ans =
0.0290
0.1000
0.0430
NaN
0.0310
0.9700
0.3170
0.1750
NaN
0.1410
positions(randi(length(positions), 10, 1))
ans =
5
8
12
11
10
6
10
3
9
4
If I try to create a table with those two columns, I get an error message that is incomprehensible to me:
>> table(values, positions)
Subscript indices must either be real positive integers or logicals.
I tried removing the NaN values, without success: I keep getting the same error message, and I cannot make sense of it.
What's the problem?

You have very likely created a variable called table. If you type whos table you will probably get a result such as:
whos table
Name Size Bytes Class Attributes
table 1x1 8 double
You can solve this by simply clearing the table variable: clear table. This will leave the function but delete the variable.
Note that you created the table variable somewhere, so it is likely that you also use it somewhere (especially if you have a large project made up mostly of scripts rather than functions). Just deleting the variable may break that code. Therefore, I suggest you search for the variable name in your scripts and make sure you don't break anything.
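To check whether a variable is hiding the function before clearing it, something like the following sketch may help. It assumes the shadowing variable lives in the current workspace; the value 5 and the names isShadowed and stillShadowed are made up for illustration.

```matlab
% Hypothetical session state: earlier code created a variable named "table".
table = 5;

% exist(..., 'var') returns 1 when the name refers to a workspace variable.
isShadowed = (exist('table', 'var') == 1);

if isShadowed
    clear table   % removes only the variable; the table function is untouched
end

% After clearing, the name no longer resolves to a variable.
stillShadowed = (exist('table', 'var') == 1);
```

Once stillShadowed is false, a call such as table(values, positions) is dispatched to the function again instead of being treated as indexing.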

The table(a,b) notation indexes the matrix table with a and b. Since your values are non-integer, you get this error message.
I suppose what you intend to do is merge the two column vectors. For this you can use [ ], as in
table = [values positions]
This will still contain NaN values, but I guess this will not bother you.
CORRECTION
If, however, you would like to store the values in table at their positions, you may use
table(positions)=values

Related

sparse matrix matlab unexpected behavior

I am creating a sparse matrix
sp = sparse(I,J,Val,X,Y)
My Val matrix is a ones matrix. Much to my surprise the sp matrix does not contain only zeros and ones. I suppose that this happens because in some cases there are duplicates in I,J. I mean the sp(1,1) is set to 1 2 times, and this makes it 2.
Question 1: Is my assumption true? Instead of overwriting the value, does MATLAB really add it?
Question 2: How can we get around this, given that it would be very troublesome to manipulate I and J. Something I can think of, is to use find (thus guaranteeing uniqueness) and then recreate the matrix using ones once more. Any better suggestion?
Question 1: Is my assumption true? Instead of overwriting the value, does Matlab really add it?
Correct. If you have duplicate row and column values each with their own values, MATLAB will aggregate them all into the same row and column location by adding them.
This is clearly seen in the documentation but as a reproducible example, suppose I have the following row and column locations and their associated values at these locations:
i = [6 6 6 5 10 10 9 9].';
j = [1 1 1 2 3 3 10 10].';
v = [100 202 173 305 410 550 323 121].';
Note that these are column vectors as this shape is the expected input. In a neater presentation:
>> [i j v]
ans =
6 1 100
6 1 202
6 1 173
5 2 305
10 3 410
10 3 550
9 10 323
9 10 121
We can see that there are three values that get mapped to location (6, 1), two values that get mapped to location (10, 3) and finally two that get mapped to location (9, 10).
By creating the sparse matrix and displaying it, we thus get:
>> S = sparse(i,j,v)
S =
(6,1) 475
(5,2) 305
(10,3) 960
(9,10) 444
As you can see, the three values mapped to (6, 1) are summed: 100 + 202 + 173 = 475. You can verify this with the other duplicate row and column locations.
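As a cross-check of the summing behaviour, the same triplets can be fed to accumarray, which also adds values that share a subscript. This is only a sketch using the example vectors from above; the name A is introduced here for illustration.

```matlab
i = [6 6 6 5 10 10 9 9].';
j = [1 1 1 2 3 3 10 10].';
v = [100 202 173 305 410 550 323 121].';

% sparse adds values that share the same (row, column) location.
S = sparse(i, j, v);

% accumarray builds a dense 10x10 matrix and also sums duplicates.
A = accumarray([i j], v);

% Both constructions agree element by element.
sameResult = isequal(full(S), A);
```

Here sameResult is true, and both S(6,1) and A(6,1) hold 475.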
Question 2: How can we get around this, given that it would be very troublesome to manipulate I and J. Something I can think of, is to use find (thus guaranteeing uniqueness) and then recreate the matrix using ones once more. Any better suggestion?
There are two possible ways to mitigate this if it is truly your desire to only have a binary matrix.
The first way, which you may prefer since you mentioned that manipulating the row and column locations is troublesome, is to create the matrix as you do now and then convert it to logical, so that every non-zero value becomes 1:
>> S = S ~= 0
S =
10×10 sparse logical array
(6,1) 1
(5,2) 1
(10,3) 1
(9,10) 1
If you need the matrix back in its original double type, cast the result after you convert to logical:
>> S = double(S ~= 0)
S =
(6,1) 1
(5,2) 1
(10,3) 1
(9,10) 1
The second way is to work on your row and column locations so that duplicate index pairs are filtered out, then create a vector of ones for the values that is as long as the list of unique locations. You can use the unique function to help you do that. Concatenate the row and column locations into a two-column matrix and specify that you want to operate on 'rows'. This means that each row is treated as one entry rather than as individual elements of a matrix. Once you have the unique row and column locations, use them as input for creating the sparse matrix:
>> unique_vals = unique([i j], 'rows')
unique_vals =
5 2
6 1
9 10
10 3
>> vals = ones(size(unique_vals, 1), 1);
>> S = sparse(unique_vals(:, 1), unique_vals(:, 2), vals)
S =
(6,1) 1
(5,2) 1
(10,3) 1
(9,10) 1

Generating an empty matrix of known dimensions

I want to generate a matrix of known dimension in Octave. The problem is that I do not want to initialize the matrix with zeros. The matrix will only contain 0 or 1 but the elements (cells) which do not get allocated any value, must remain blank. I plan to use such a matrix in a 'Collaborative Filtering' algorithm.
I am new to both Octave and 'Collaborative Filtering' algorithms. I have tried to look for a solution on the net, but to no avail. Searching for "empty matrix" only turns up arrays with zero dimensions or char matrices with " " as values.
A numeric array cannot hold empty values. Typically in this case, people will use NaN as a placeholder value.
%// Initialize a 3D matrix of NaN values
data = nan(2, 3, 4);
size(data)
%// 2 3 4
It is then easy to differentiate a placeholder value from real data. You can detect the placeholders using isnan.
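A small sketch of that idea, assuming a 2x3 matrix in which only two entries have been assigned (the names ratings, isUnset, numUnset and numOnes are made up for illustration):

```matlab
% Start with every element marked as "not set yet".
ratings = nan(2, 3);

% Assign the entries that are actually known.
ratings(1, 2) = 1;
ratings(2, 3) = 0;

% isnan gives a logical mask of the elements that are still placeholders.
isUnset  = isnan(ratings);
numUnset = nnz(isUnset);         % 4 of the 6 elements are still NaN

% Comparisons with NaN are false, so placeholders never count as 0 or 1.
numOnes  = nnz(ratings == 1);    % 1
```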
The only way to create an array of empty values (and it is highly discouraged due to the performance hit) is to use cell arrays.
data = cell(2, 3, 4);
The problem is that I do not want to initialize the matrix with zeros. The matrix will only contain 0 or 1 but the elements (cells) which do not get allocated any value, must remain blank.
You are wrong. You may think the matrix only contains the values 0 or 1, but each element actually has one of three values: 0, 1, or unset. You can't have a blank value; you always need some value, or at least it can't be blank in the way you are thinking. Taking it to a very low level, all bits need to have a value (0 or 1); they can't be blank. Therefore, if you want a blank value, you need to interpret some value as blank.
Your data will then have 3 states: true, false, and blank. You will therefore need at least 2 bits per point (note that even logical/bool data types, which only need 1 bit, actually take up 8 bits (1 byte)).
Using NaN
This may look like the simplest solution but it's actually pretty bad. It will be a huge waste of memory (and you will have very large matrices if you're doing collaborative filtering).
The reason is that if you use NaN, your data needs to be of type single or double. That's at least 32 or 64 bits respectively. Remember that you only actually need 2 bits. Of course, you could make your own data type that does have a NaN value.
octave> vals = NaN (3, 3) # 3x3 matrix of type double (default)
vals =
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
octave> vals = NaN (3, 3, "single") # 3x3 matrix of type single
vals =
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
Using a cell matrix
A cell array is a data type where each cell can be any Octave value. This includes another cell array, a matrix of any dimensions, or even an empty array. You could use an empty array as blank, but this will be terribly inefficient, both for memory and speed, and you won't be able to use most functions since they will work on numeric arrays, not cell arrays.
octave> vals = cell (3, 3); # create 3x3 cell matrix
octave> vals{2,3} = true; # set value
octave> vals{2,3} = false; # set value
octave> vals{2,3} = []; # unset value
octave> cumsum (vals)
error: cumsum: wrong type argument 'cell'
octave> nnz (vals)
error: nnz: wrong type argument 'cell array'
octave> find (vals)
error: find: wrong type argument 'cell'
Using 8 bit integer
This is what I see being used most often. Using signed 8 bit, you can use 0 for blank, -1 for false, and 1 for true (or whatever makes the most sense for you).
octave> vals = zeros (3, 3, "int8");
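As a rough sketch of that encoding (0 for blank, -1 for false, 1 for true, as suggested above; the counter names are made up for illustration), setting and querying values could look like this:

```matlab
% Every element starts as 0, which is interpreted as "blank".
vals = zeros(3, 3, 'int8');

vals(2, 3) = 1;     % mark (2,3) as true
vals(1, 1) = -1;    % mark (1,1) as false

% Each state is recovered with a plain comparison.
numBlank = nnz(vals == 0);    % 7
numTrue  = nnz(vals == 1);    % 1
numFalse = nnz(vals == -1);   % 1
```

Since the data stay in an ordinary numeric array, functions such as find and nnz keep working, unlike with the cell array approach.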
Using a separate matrix to track blank values
If you really, really want to have a matrix of 0 and 1, then you need a separate matrix to keep track of which values have been set. In that case, both matrices can be of type logical, each taking up 8 bits per data point, which totals 16 bits per data point. It also has the problem that you need to keep the two matrices in sync.
octave> vals = false (3, 3);
octave> set_vals = false (size (vals));
Making your own class
Using either the new classdef (which requires Octave 4.0.0) or the old @class type, you can encapsulate any of the strategies above (I would personally use an 8 bit integer) in its own class. If you use a signed 8 bit integer, this moves the logic of knowing which value (-1 or 0) means blank into the class. If you prefer to use a separate matrix for blank values, it moves the logic of keeping the two matrices in sync to a setter method.

Subscript indices must either be real positive integers or logicals

I wrote a function to sum each row of a matrix that has three rows.
It then divides the result element-wise by a matrix that has one row and three columns.
But I keep getting that error. I know a subscript must not be a decimal or negative number, but I still cannot find the culprit. Please help, thanks.
% mean_access_time(ipinfo_dist, [306, 32, 192])
% 'ipinfo_dist' is a matrix with three rows and a variable number of columns.
function result = mean_access_time(hash_mat, element_num)
access_time_sum = sum(rot90(hash_mat));
result = bsxfun (@rdivide, access_time_sum, element_num);
For example:
A=
1 2
3 4
5 6
B= 7 8 9
Then I want to get
[(1+2)/7, (3+4)/8, (5+6)/9]
Update:
>> which rot90
/lou/matlab/toolbox/matlab/elmat/rot90.m
>> which sum
built-in (/lou/matlab/toolbox/matlab/datafun/@uint8/sum) % uint8 method
Culprit:
I used mean_access_time as a variable in the previous command line.
It seems like you have overridden a built-in function (rot90 or sum) with a variable name.
Type
>> dbstop if error
And run your code.
When the error occurs type
K>> which rot90
K>> which sum
See if you get a built-in function or a variable name.
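For reference, the computation described in the question can also be sketched without rot90 or bsxfun, by summing along the second dimension. This uses the small A and B from the question; the names rowSums and result are only illustrative.

```matlab
A = [1 2; 3 4; 5 6];
B = [7 8 9];

% Sum each row (dimension 2), then transpose to a 1x3 row vector.
rowSums = sum(A, 2).';     % [3 7 11]

% Element-wise division by the 1x3 divisor.
result = rowSums ./ B;     % [3/7, 7/8, 11/9]
```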

matrix get min values of a matrix before max values occurred

I was trying to get the min values of a matrix before the max values of the matrix occurred. I have two matrices: matrix data and matrix a. Matrix a is a subset of matrix data and is composed of the max values of matrix data. I have the following code, but I am obviously doing something wrong.
edit:
Matrix a are the max values of matrix data. I derived it from:
for x=1:size(data,1)
a(x)=max(data(x,:));
end
a=a'
clear x
matrix b code:
for x=1:size(data,1)
b(x)=min(data(x,(x<data==a)));
end
b=b'
clear x
matrix data      matrix a   matrix b
1  2  3  4       4          1
6  5  4  7       7          4
9  6 12  5       12         6
I need, for each row, the min value that occurred before the value in matrix a occurred in matrix data.
Short and simple:
[a,idxmax] = max(data,[],2);
b = arrayfun(@(ii) min(data(ii,1:idxmax(ii))), 1:size(data,1));
which is the same as
b=NaN(1,size(data,1)); % preallocation!
for ii=1:size(data,1)
b(ii) = min(data(ii,1:idxmax(ii)));
end
Ignore maximum itself
If you want the minimum of everything strictly before the maximum (not including the maximum itself), the maximum may be the first number in a row, in which case you end up taking the minimum of an empty matrix. The solution then is to use cell output, which can hold empty results:
b = arrayfun(@(ii) min(data(ii,1:idxmax(ii)-1)), 1:size(data,1),'uni',false);
Replace empty cells with NaN
If you want to replace the empty cells with NaN and then convert back to a matrix, use this:
b(cellfun(@isempty,b))={NaN};
b=cell2mat(b);
or simply use the earlier version and replace b(ii) with NaN where it is equal to a(ii), which gives the same outcome:
b = arrayfun(@(ii) min(data(ii,1:idxmax(ii))), 1:size(data,1));
b(b'==a) = NaN
Example:
data=magic(4)
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1
outputs:
a' = 16 11 12 15
b =
16 5 6 4
and
b =[1x0 double] [5] [6] [4]
for the 2nd solution using cell output and ignoring the maximum itself also.
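Putting the cell-output variant and the NaN replacement together on the same magic(4) example gives the following sketch. It spells out 'UniformOutput' in full, which is an editorial choice rather than part of the original answer.

```matlab
data = magic(4);

% Row-wise maximum and the column index at which it occurs.
[a, idxmax] = max(data, [], 2);

% Minimum of everything strictly before the maximum; a row whose maximum
% comes first yields an empty result, so the output has to be a cell array.
b = arrayfun(@(ii) min(data(ii, 1:idxmax(ii)-1)), 1:size(data, 1), ...
             'UniformOutput', false);

% Replace the empty cells with NaN and convert back to a numeric row vector.
b(cellfun(@isempty, b)) = {NaN};
b = cell2mat(b);     % [NaN 5 6 4]
```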
And btw:
for x=1:size(data,1)
a(x)=max(data(x,:));
end
a=a'
clear x
can be replaced with
a=max(data,[],2);
It's not pretty but this is the only way I found so far of doing this kind of thing without a loop.
If loops are ok, I would recommend Gunther Struyf's answer as the most compact use of MATLAB's built-in array looping function, arrayfun.
Some of the transposition etc may be superfluous if you're wanting column mins instead of row...
[mx, imx] = max(data');
inds = repmat(1:size(data,2), [size(data,1),1]);
imx2 = repmat(imx', [1, size(data,2)]);
data2 = data;
data2(inds >= imx2) = inf;
min(data2');
NOTE: if data is not needed we can remove the additional data2 variable, and reduce the line count.
So to demonstrate what this does, (and see if I understood the question correctly):
for input
>> data = [1,3,-1; 5,2,1]
I get minima:
>> min(data2')
ans = [1, inf]
I.e. it only found the min values before the max values for each row, and anything else was set to inf.
In words:
For each row get index of maximum
Generate matrix of column indices
Use repmat to generate a matrix, same size as data where each row is index of maximum
Set data to infinity where column index >= max_index matrix (that is, at and after the maximum)
find min as usual.
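The steps above can be sketched end to end on the small example input. This version works along dimension 2 instead of transposing, which is a stylistic variation and not the exact code from the answer; b is an illustrative name.

```matlab
data = [1, 3, -1; 5, 2, 1];

% 1. For each row, get the column index of the maximum.
[~, imx] = max(data, [], 2);

% 2. Matrix of column indices, same size as data.
inds = repmat(1:size(data, 2), size(data, 1), 1);

% 3. Matrix in which every row holds that row's index of the maximum.
imx2 = repmat(imx, 1, size(data, 2));

% 4. Mask out the maximum and everything after it.
data2 = data;
data2(inds >= imx2) = inf;

% 5. Row-wise minimum of what is left.
b = min(data2, [], 2);     % [1; Inf]
```

The Inf in the second row signals that nothing precedes the maximum there.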

How to use matrix from answer further?

If I write a random matrix (A) and get results:
ans = 1 2 3 4 %next row 5 6 7 8
how can I get it written in this form:
A = [1,2,3,4;5,6,7,8]; ?
(Of course I want to avoid retyping or copy-pasting it)
If I understand your question correctly, mat2str is what you are looking for. Note that it separates elements with spaces rather than commas, and that calling it without an assignment overwrites ans (i.e. ans will be of type char afterwards).
Example (the second argument limits the number of digits):
>> rand(2,3); mat2str(ans,2)
ans =
[0.42 0.79 0.66;0.92 0.96 0.036]
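If the comma-separated form from the question is wanted, one possible sketch is to post-process the mat2str output with strrep. This assumes a real-valued matrix, where the only spaces in the string are the element separators; the names s, sCommas and B are illustrative.

```matlab
A = [1 2 3 4; 5 6 7 8];

% mat2str produces a string that evaluates back to the matrix.
s = mat2str(A);                    % '[1 2 3 4;5 6 7 8]'

% Swap the separating spaces for commas.
sCommas = strrep(s, ' ', ',');     % '[1,2,3,4;5,6,7,8]'

% Evaluating the string reproduces the original matrix.
B = eval(sCommas);
```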
The last answer that you calculated is saved in a special variable named ans. Simply assign that value to A.
% some calculations
[1,2,3,4;5,6,7,8]
% assign to A
A = ans;