Matlab function NNZ, numerical zero - matlab

I am working on a code in Least Square Non Negative solution recovery context on Matlab, and I need (with no more details because it's not that important for this question) to know the number of non zero elements in my matrices and arrays.
The function NNZ on matlab does exactly what I want, but it happens that I need more information about what Matlab thinks of a "zero element", it could be 0 itself, or the numerical zero like 1e-16 or less.
Does anybody has this information about the NNZ function, cause I couldn't get the original script
Thanks.
PS : I am not an expert on Matlab, so accept my apologies if it's a really simple task.
I tried "open nnz", on Matlab but I only get a small script of commented code lines...

Since nnz counts everything that isn't an exact zero (i.e. 1e-100 is non-zero), you just have to apply a relational operator to your data first to find how many values exceed some tolerance around zero. For a matrix A:
n = nnz(abs(A) > 1e-16);
Also, this discussion of floating-point comparison might be of interest to you.

You can add in a tolerance by doing something like:
nnz(abs(myarray)>tol);
This will create a binary array that is 1 when abs(myarray)>tol and 0 otherwise and then count the number of non-zero entries.

Related

problem with the error function (erf(x)) in matlab [duplicate]

The following error occurs quite frequently:
Subscript indices must either be real positive integers or logicals
I have found many questions about this but not one with a really generic answer. Hence I would like to have the general solution for dealing with this problem.
Subscript indices must either be real positive integers or logicals
In nearly all cases this error is caused by one of two reasons. Fortunately there is an easy check for this.
First of all make sure you are at the line where the error occurs, this can usually be achieved by using dbstop if error before you run your function or script. Now we can check for the first problem:
1. Somewhere an invalid index is used to access a variable
Find every variable, and see how they are being indexed. A variable being indexed is typically in one of these forms:
variableName(index,index)
variableName{index,index}
variableName{indices}(indices)
Now simply look at the stuff between the brackets, and select every index. Then hit f9 to evaluate the result and check whether it is a real positive integer or logical. Visual inspection is usually sufficient (remember that acceptable values are in true,false or 1,2,3,... BUT NOT 0) , but for a large matrix you can use things like isequal(index, round(index)), or more exactly isequal(x, max(1,round(abs(x)))) to check for real positive integers. To check the class you can use class(index) which should return 'logical' if the values are all 'true' or 'false'.
Make sure to check evaluate every index, even those that look unusual as per the example below. If all indices check out, you are probably facing the second problem:
2. A function name has been overshadowed by a user defined variable
MATLAB functions often have very intuitive names. This is convenient, but sometimes results in accidentally overloading (builtin) functions, i.e. creating a variable with the same name as a function for example you could go max = 9 and for the rest of you script/function Matlab will consider max to be a variable instead of the function max so you will get this error if you try something like max([1 8 0 3 7]) because instead of return the maximum value of that vector, Matlab now assumes you are trying to index the variable max and 0 is an invalid index.
In order to check which variables you have you can look at the workspace. However if you are looking for a systematic approach here is one:
For every letter or word that is followed by brackets () and has not been confirmed to have proper indices in step 1. Check whether it is actually a variable. This can easily be done by using which.
Examples
Simple occurrence of invalid index
a = 1;
b = 2;
c = 3;
a(b/c)
Here we will evaluate b/c and find that it is not a nicely rounded number.
Complicated occurrence of invalid index
a = 1;
b = 2;
c = 3;
d = 1:10;
a(b+mean(d(cell2mat({b}):c)))
I recommend working inside out. So first evaluate the most inner variable being indexed: d. It turns out that cell2mat({b}):c, nicely evaluates to integers. Then evaluate b+mean(d(cell2mat({b}):c)) and find that we don't have an integer or logical as index to a.
Here we will evaluate b/c and find that it is not a nicely rounded number.
Overloaded a function
which mean
% some directory\filename.m
You should see something like this to actually confirm that something is a function.
a = 1:4;
b=0:0.1:1;
mean(a) = 2.5;
mean(b);
Here we see that mean has accidentally been assigned to. Now we get:
which mean
% mean is a variable.
In Matlab (and most other programming languages) the multiplication sign must always be written. While in your math class you probably learned that you can write write a(a+a) instead of a*(a+a), this is not the same in matlab. The first is an indexing or function call, while the second is a multiplication.
>> a=0
a =
0
>> a*(a+a)
ans =
0
>> a(a+a)
Subscript indices must either be real
positive integers or logicals.
Answers to this question so far focused on the sources of this error, which is great. But it is important to understand the powerful yet very intuitive feature of matrix indexing in Matlab. Hence how indexing works and what is a valid index would help avoid this error in the first place by using valid indices.
At its core, given an array A of length n, there are two ways of indexing it.
Linear indexing: with subset of integers from 1 : n (duplicates allowed). 0 is not allowed, as Matlab arrays are 1-based, unless you use the method below. For higher-dimensional arrays, multiple subscripts are internally converted into a linear index, although in an efficient and transparent manner.
Logical indexing:wherein you use a n-length array of 0s and 1s, to pick those elements where indexing is true. In this case, unique(index) must have only 0 and 1.
So a valid indexing array into another array with n number of elements ca be:
entirely logical of the same size, or
linear with subsets of integers from 1:n
Keeping this in mind, invalid indexing error occurs when you mix the two types of indexing: one or more zeros occur in your linearly indexing array, or you mix 0s and 1s with anything other than 0s and 1s :)
There is tons of material online to learn this including this one:
http://www.mathworks.com/company/newsletters/articles/matrix-indexing-in-matlab.html

Image processing, 2D convolution [duplicate]

The following error occurs quite frequently:
Subscript indices must either be real positive integers or logicals
I have found many questions about this but not one with a really generic answer. Hence I would like to have the general solution for dealing with this problem.
Subscript indices must either be real positive integers or logicals
In nearly all cases this error is caused by one of two reasons. Fortunately there is an easy check for this.
First of all make sure you are at the line where the error occurs, this can usually be achieved by using dbstop if error before you run your function or script. Now we can check for the first problem:
1. Somewhere an invalid index is used to access a variable
Find every variable, and see how they are being indexed. A variable being indexed is typically in one of these forms:
variableName(index,index)
variableName{index,index}
variableName{indices}(indices)
Now simply look at the stuff between the brackets, and select every index. Then hit f9 to evaluate the result and check whether it is a real positive integer or logical. Visual inspection is usually sufficient (remember that acceptable values are in true,false or 1,2,3,... BUT NOT 0) , but for a large matrix you can use things like isequal(index, round(index)), or more exactly isequal(x, max(1,round(abs(x)))) to check for real positive integers. To check the class you can use class(index) which should return 'logical' if the values are all 'true' or 'false'.
Make sure to check evaluate every index, even those that look unusual as per the example below. If all indices check out, you are probably facing the second problem:
2. A function name has been overshadowed by a user defined variable
MATLAB functions often have very intuitive names. This is convenient, but sometimes results in accidentally overloading (builtin) functions, i.e. creating a variable with the same name as a function for example you could go max = 9 and for the rest of you script/function Matlab will consider max to be a variable instead of the function max so you will get this error if you try something like max([1 8 0 3 7]) because instead of return the maximum value of that vector, Matlab now assumes you are trying to index the variable max and 0 is an invalid index.
In order to check which variables you have you can look at the workspace. However if you are looking for a systematic approach here is one:
For every letter or word that is followed by brackets () and has not been confirmed to have proper indices in step 1. Check whether it is actually a variable. This can easily be done by using which.
Examples
Simple occurrence of invalid index
a = 1;
b = 2;
c = 3;
a(b/c)
Here we will evaluate b/c and find that it is not a nicely rounded number.
Complicated occurrence of invalid index
a = 1;
b = 2;
c = 3;
d = 1:10;
a(b+mean(d(cell2mat({b}):c)))
I recommend working inside out. So first evaluate the most inner variable being indexed: d. It turns out that cell2mat({b}):c, nicely evaluates to integers. Then evaluate b+mean(d(cell2mat({b}):c)) and find that we don't have an integer or logical as index to a.
Here we will evaluate b/c and find that it is not a nicely rounded number.
Overloaded a function
which mean
% some directory\filename.m
You should see something like this to actually confirm that something is a function.
a = 1:4;
b=0:0.1:1;
mean(a) = 2.5;
mean(b);
Here we see that mean has accidentally been assigned to. Now we get:
which mean
% mean is a variable.
In Matlab (and most other programming languages) the multiplication sign must always be written. While in your math class you probably learned that you can write write a(a+a) instead of a*(a+a), this is not the same in matlab. The first is an indexing or function call, while the second is a multiplication.
>> a=0
a =
0
>> a*(a+a)
ans =
0
>> a(a+a)
Subscript indices must either be real
positive integers or logicals.
Answers to this question so far focused on the sources of this error, which is great. But it is important to understand the powerful yet very intuitive feature of matrix indexing in Matlab. Hence how indexing works and what is a valid index would help avoid this error in the first place by using valid indices.
At its core, given an array A of length n, there are two ways of indexing it.
Linear indexing: with subset of integers from 1 : n (duplicates allowed). 0 is not allowed, as Matlab arrays are 1-based, unless you use the method below. For higher-dimensional arrays, multiple subscripts are internally converted into a linear index, although in an efficient and transparent manner.
Logical indexing:wherein you use a n-length array of 0s and 1s, to pick those elements where indexing is true. In this case, unique(index) must have only 0 and 1.
So a valid indexing array into another array with n number of elements ca be:
entirely logical of the same size, or
linear with subsets of integers from 1:n
Keeping this in mind, invalid indexing error occurs when you mix the two types of indexing: one or more zeros occur in your linearly indexing array, or you mix 0s and 1s with anything other than 0s and 1s :)
There is tons of material online to learn this including this one:
http://www.mathworks.com/company/newsletters/articles/matrix-indexing-in-matlab.html

Newton interpolation matlab [duplicate]

The following error occurs quite frequently:
Subscript indices must either be real positive integers or logicals
I have found many questions about this but not one with a really generic answer. Hence I would like to have the general solution for dealing with this problem.
Subscript indices must either be real positive integers or logicals
In nearly all cases this error is caused by one of two reasons. Fortunately there is an easy check for this.
First of all make sure you are at the line where the error occurs, this can usually be achieved by using dbstop if error before you run your function or script. Now we can check for the first problem:
1. Somewhere an invalid index is used to access a variable
Find every variable, and see how they are being indexed. A variable being indexed is typically in one of these forms:
variableName(index,index)
variableName{index,index}
variableName{indices}(indices)
Now simply look at the stuff between the brackets, and select every index. Then hit f9 to evaluate the result and check whether it is a real positive integer or logical. Visual inspection is usually sufficient (remember that acceptable values are in true,false or 1,2,3,... BUT NOT 0) , but for a large matrix you can use things like isequal(index, round(index)), or more exactly isequal(x, max(1,round(abs(x)))) to check for real positive integers. To check the class you can use class(index) which should return 'logical' if the values are all 'true' or 'false'.
Make sure to check evaluate every index, even those that look unusual as per the example below. If all indices check out, you are probably facing the second problem:
2. A function name has been overshadowed by a user defined variable
MATLAB functions often have very intuitive names. This is convenient, but sometimes results in accidentally overloading (builtin) functions, i.e. creating a variable with the same name as a function for example you could go max = 9 and for the rest of you script/function Matlab will consider max to be a variable instead of the function max so you will get this error if you try something like max([1 8 0 3 7]) because instead of return the maximum value of that vector, Matlab now assumes you are trying to index the variable max and 0 is an invalid index.
In order to check which variables you have you can look at the workspace. However if you are looking for a systematic approach here is one:
For every letter or word that is followed by brackets () and has not been confirmed to have proper indices in step 1. Check whether it is actually a variable. This can easily be done by using which.
Examples
Simple occurrence of invalid index
a = 1;
b = 2;
c = 3;
a(b/c)
Here we will evaluate b/c and find that it is not a nicely rounded number.
Complicated occurrence of invalid index
a = 1;
b = 2;
c = 3;
d = 1:10;
a(b+mean(d(cell2mat({b}):c)))
I recommend working inside out. So first evaluate the most inner variable being indexed: d. It turns out that cell2mat({b}):c, nicely evaluates to integers. Then evaluate b+mean(d(cell2mat({b}):c)) and find that we don't have an integer or logical as index to a.
Here we will evaluate b/c and find that it is not a nicely rounded number.
Overloaded a function
which mean
% some directory\filename.m
You should see something like this to actually confirm that something is a function.
a = 1:4;
b=0:0.1:1;
mean(a) = 2.5;
mean(b);
Here we see that mean has accidentally been assigned to. Now we get:
which mean
% mean is a variable.
In Matlab (and most other programming languages) the multiplication sign must always be written. While in your math class you probably learned that you can write write a(a+a) instead of a*(a+a), this is not the same in matlab. The first is an indexing or function call, while the second is a multiplication.
>> a=0
a =
0
>> a*(a+a)
ans =
0
>> a(a+a)
Subscript indices must either be real
positive integers or logicals.
Answers to this question so far focused on the sources of this error, which is great. But it is important to understand the powerful yet very intuitive feature of matrix indexing in Matlab. Hence how indexing works and what is a valid index would help avoid this error in the first place by using valid indices.
At its core, given an array A of length n, there are two ways of indexing it.
Linear indexing: with subset of integers from 1 : n (duplicates allowed). 0 is not allowed, as Matlab arrays are 1-based, unless you use the method below. For higher-dimensional arrays, multiple subscripts are internally converted into a linear index, although in an efficient and transparent manner.
Logical indexing:wherein you use a n-length array of 0s and 1s, to pick those elements where indexing is true. In this case, unique(index) must have only 0 and 1.
So a valid indexing array into another array with n number of elements ca be:
entirely logical of the same size, or
linear with subsets of integers from 1:n
Keeping this in mind, invalid indexing error occurs when you mix the two types of indexing: one or more zeros occur in your linearly indexing array, or you mix 0s and 1s with anything other than 0s and 1s :)
There is tons of material online to learn this including this one:
http://www.mathworks.com/company/newsletters/articles/matrix-indexing-in-matlab.html

Partitioning a number into a number of almost equal partitions

I would like to partition a number into an almost equal number of values in each partition. The only criteria is that each partition must be in between 60 to 80.
For example, if I have a value = 300, this means that 75 * 4 = 300.
I would like to know a method to get this 4 and 75 in the above example. In some cases, all partitions don't need to be of equal value, but they should be in between 60 and 80. Any constraints can be used (addition, subtraction, etc..). However, the outputs must not be floating point.
Also it's not that the total must be exactly 300 as in this case, but they can be up to a maximum of +40 of the total, and so for the case of 300, the numbers can sum up to 340 if required.
Assuming only addition, you can formulate this problem into a linear programming problem. You would choose an objective function that would maximize the sum of all of the factors chosen to generate that number for you. Therefore, your objective function would be:
(source: codecogs.com)
.
In this case, n would be the number of factors you are using to try and decompose your number into. Each x_i is a particular factor in the overall sum of the value you want to decompose. I'm also going to assume that none of the factors can be floating point, and can only be integer. As such, you need to use a special case of linear programming called integer programming where the constraints and the actual solution to your problem are all in integers. In general, the integer programming problem is formulated thusly:
You are actually trying to minimize this objective function, such that you produce a parameter vector of x that are subject to all of these constraints. In our case, x would be a vector of numbers where each element forms part of the sum to the value you are trying to decompose (300 in your case).
You have inequalities, equalities and also boundaries of x that each parameter in your solution must respect. You also need to make sure that each parameter of x is an integer. As such, MATLAB has a function called intlinprog that will perform this for you. However, this function assumes that you are minimizing the objective function, and so if you want to maximize, simply minimize on the negative. f is a vector of weights to be applied to each value in your parameter vector, and with our objective function, you just need to set all of these to -1.
Therefore, to formulate your problem in an integer programming framework, you are actually doing:
(source: codecogs.com)
V would be the value you are trying to decompose (so 300 in your example).
The standard way to call intlinprog is in the following way:
x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub);
f is the vector that weights each parameter of the solution you want to solve, intcon denotes which of your parameters need to be integer. In this case, you want all of them to be integer so you would have to supply an increasing vector from 1 to n, where n is the number of factors you want to decompose the number V into (same as before). A and b are matrices and vectors that define your inequality constraints. Because you want equality, you'd set this to empty ([]). Aeq and beq are the same as A and b, but for equality. Because you only have one constraint here, you would simply create a matrix of 1 row, where each value is set to 1. beq would be a single value which denotes the number you are trying to factorize. lb and ub are the lower and upper bounds for each value in the parameter set that you are bounding with, so this would be 60 and 80 respectively, and you'd have to specify a vector to ensure that each value of the parameters are bounded between these two ranges.
Now, because you don't know how many factors will evenly decompose your value, you'll have to loop over a given set of factors (like between 1 to 10, or 1 to 20, etc.), place your results in a cell array, then you have to manually examine yourself whether or not an integer decomposition was successful.
num_factors = 20; %// Number of factors to try and decompose your value
V = 300;
results = cell(1, num_factors);
%// Try to solve the problem for a number of different factors
for n = 1 : num_factors
x = intlinprog(-ones(n,1),1:n,[],[],ones(1,n),V,60*ones(n,1),80*ones(n,1));
results{n} = x;
end
You can then go through results and see which value of n was successful in decomposing your number into that said number of factors.
One small problem here is that we also don't know how many factors we should check up to. That unfortunately I don't have an answer to, and so you'll have to play with this value until you get good results. This is also an unconstrained parameter, and I'll talk about this more later in this post.
However, intlinprog was only released in recent versions of MATLAB. If you want to do the same thing without it, you can use linprog, which is the floating point version of integer programming... actually, it's just the core linear programming framework itself. You would call linprog this way:
x = linprog(f,A,b,Aeq,beq,lb,ub);
All of the variables are the same, except that intcon is not used here... which makes sense as linprog may generate floating point numbers as part of its solution. Due to the fact that linprog can generate floating point solutions, what you can do is if you want to ensure that for a given value of n, you could loop over your results, take the floor of the result and subtract with the final result, and sum over the result. If you get a value of 0, this means that you had a completely integer result. Therefore, you'd have to do something like:
num_factors = 20; %// Number of factors to try and decompose your value
V = 300;
results = cell(1, num_factors);
%// Try to solve the problem for a number of different factors
for n = 1 : num_factors
x = linprog(-ones(n,1),[],[],ones(1,n),V,60*ones(n,1),80*ones(n,1));
results{n} = x;
end
%// Loop through and determine which decompositions were successful integer ones
out = cellfun(#(x) sum(abs(floor(x) - x)), results);
%// Determine which values of n were successful in the integer composition.
final_factors = find(~out);
final_factors will contain which number of factors you specified that was successful in an integer decomposition. Now, if final_factors is empty, this means that it wasn't successful in finding anything that would be able to decompose the value into integer factors. Noting your problem description, you said you can allow for tolerances, so perhaps scan through results and determine which overall sum best matches the value, then choose whatever number of factors that gave you that result as the final answer.
Now, noting from my comments, you'll see that this problem is very unconstrained. You don't know how many factors are required to get an integer decomposition of your value, which is why we had to semi-brute-force it. In fact, this is a more general case of the subset sum problem. This problem is NP-complete. Basically, what this means is that it is not known whether there is a polynomial-time algorithm that can be used to solve this kind of problem and that the only way to get a valid solution is to brute-force each possible solution and check if it works with the specified problem. Usually, brute-forcing solutions requires exponential time, which is very intractable for large problems. Another interesting fact is that modern cryptography algorithms use NP-Complete intractability as part of their ciphertext and encrypting. Basically, they're banking on the fact that the only way for you to determine the right key that was used to encrypt your plain text is to check all possible keys, which is an intractable problem... especially if you use 128-bit encryption! This means you would have to check 2^128 possibilities, and assuming a moderately fast computer, the worst-case time to find the right key will take more than the current age of the universe. Check out this cool Wikipedia post for more details in intractability with regards to key breaking in cryptography.
In fact, NP-complete problems are very popular and there have been many attempts to determine whether there is or there isn't a polynomial-time algorithm to solve such problems. An interesting property is that if you can find a polynomial-time algorithm that will solve one problem, you will have found an algorithm to solve them all.
The Clay Mathematics Institute has what are known as Millennium Problems where if you solve any problem listed on their website, you get a million dollars.
Also, that's for each problem, so one problem solved == 1 million dollars!
(source: quickmeme.com)
The NP problem is amongst one of the seven problems up for solving. If I recall correctly, only one problem has been solved so far, and these problems were first released to the public in the year 2000 (hence millennium...). So... it has been about 14 years and only one problem has been solved. Don't let that discourage you though! If you want to invest some time and try to solve one of the problems, please do!
Hopefully this will be enough to get you started. Good luck!

Problem using the find function in MATLAB

I have two arrays of data that I'm trying to amalgamate. One contains actual latencies from an experiment in the first column (e.g. 0.345, 0.455... never more than 3 decimal places), along with other data from that experiment. The other contains what is effectively a 'look up' list of latencies ranging from 0.001 to 0.500 in 0.001 increments, along with other pieces of data. Both data sets are X-by-Y doubles.
What I'm trying to do is something like...
for i = 1:length(actual_latency)
row = find(predicted_data(:,1) == actual_latency(i))
full_set(i,1:4) = [actual_latency(i) other_info(i) predicted_info(row,2) ...
predicted_info(row,3)];
end
...in order to find the relevant row in predicted_data where the look up latency corresponds to the actual latency. I then use this to created an amalgamated data set, full_set.
I figured this would be really simple, but the find function keeps failing by throwing up an empty matrix when looking for an actual latency that I know is in predicted_data(:,1) (as I've double-checked during debugging).
Moreover, if I replace find with a for loop to do the same job, I get a similar error. It doesn't appear to be systematic - using different participant data sets throws it up in different places.
Furthermore, during debugging mode, if I use find to try and find a hard-coded value of actual_latency, it doesn't always work. Sometimes yes, sometimes no.
I'm really scratching my head over this, so if anyone has any ideas about what might be going on, I'd be really grateful.
You are likely running into a problem with floating point comparisons when you do the following:
predicted_data(:,1) == actual_latency(i)
Even though your numbers appear to only have three decimal places of precision, they may still differ by very small amounts that are not being displayed, thus giving you an empty matrix since FIND can't get an exact match.
One feature of floating point numbers is that certain numbers can't be exactly represented, since they aren't an integer power of 2. This occurs with the numbers 0.1 and 0.001. If you repeatedly add or multiply one of these numbers you can see some unexpected behavior. Amro pointed out one example in his comment: 0.3 is not exactly equal to 3*0.1. This can also be illustrated by creating your look-up list of latencies in two different ways. You can use the normal colon syntax:
vec1 = 0.001:0.001:0.5;
Or you can use LINSPACE:
vec2 = linspace(0.001,0.5,500);
You'd think these two vectors would be equal to one another, but think again!:
>> isequal(vec1,vec2)
ans =
0 %# FALSE!
This is because the two methods create the vectors by performing successive additions or multiplications of 0.001 in different ways, giving ever so slightly different values for some entries in the vector. You can take a look at this technical solution for more details.
When comparing floating point numbers, you should therefore do your comparisons using some tolerance. For example, this finds the indices of entries in the look-up list that are within 0.0001 of your actual latency:
tolerance = 0.0001;
for i = 1:length(actual_latency)
row = find(abs(predicted_data(:,1) - actual_latency(i)) < tolerance);
...
The topic of floating point comparison is also covered in this related question.
You may try to do the following:
row = find(abs(predicted_data(:,1) - actual_latency(i))) < eps)
EPS is accuracy of floating-point operation.
Have you tried using a tolerance rather than == ?