Issue in viewing double values - matlab

I am having an issue with viewing double data in matlab console. Actually, I am importing a matrix from my data file. The value of a particular row and column was 1.543 but in the console when I use disp(x) where x is the matrix imported, it is showing as 1.0e+03 * 0.0002. However, when I try to access that particular element in the matrix using disp(x(25,25)) where 25 and 25 are the row and column numbers it is showing to be 1.543. So I am confused. Any clarifications. It is just that when I print the whole matrix it is showing as 1.0e+03 * 0.0002.

The following command should fix it. It is only a display issue, the precision of the actual values in the matrix are not affected:
format shortG

That happens due to high dynamic range of your data.
Try for example :
x = [10^-10 10^10];
disp(x);
The result is:
1.0e+010 *
0.0000 1.0000
It looks like the first value is zero, but it isn't. It is almost zero compared to the second one. That is not surprising. Try to add to the big value the small one, and subtract, and you get zero. That is due to floating point arithmetic.The following expression is true
isequal( (x(1)+x(2)) - x(2) , 0)
What can be done?
1) A really high dynamic range can cause troubles in any kind of computations. Try to understand where it came from, and solve the problem in a broader context.
2). You can try to set
format long
It can improve the situation visually for some of the cases.

Related

What is this code doing? Machine Learning

I'm just learning matlab and I have a snippet of code which I don't understand the syntax of. The x is an n x 1 vector.
Code is below
p = (min(x):(max(x)/300):max(x))';
The p vector is used a few lines later to plot the function
plot(p,pp*model,'r');
It generates an arithmetic progression.
An arithmetic progression is a sequence of numbers where the next number is equal to the previous number plus a constant. In an arithmetic progression, this constant must stay the same value.
In your code,
min(x) is the initial value of the sequence
max(x) / 300 is the increment amount
max(x) is the stopping criteria. When the result of incrementation exceeds this stopping criteria, no more items are generated for the sequence.
I cannot comment on this particular choice of initial value and increment amount, without seeing the surrounding code where it was used.
However, from a naive perspective, MATLAB has a linspace command which does something similar, but not exactly the same.
Certainly looks to me like an odd thing to be doing. Basically, it's creating a vector of values p that range from the smallest to the largest values of x, which is fine, but it's using steps between successive values of max(x)/300.
If min(x)=300 and max(x)=300.5 then this would only give 1 point for p.
On the other hand, if min(x)=-1000 and max(x)=0.3 then p would have thousands of elements.
In fact, it's even worse. If max(x) is negative, then you would get an error as p would start from min(x), some negative number below max(x), and then each element would be smaller than the last.
I think p must be used to create pp or model somehow as well so that the plot works, and without knowing how I can't suggest how to fix this, but I can't think of a good reason why it would be done like this. using linspace(min(x),max(x),300) or setting the step to (max(x)-min(x))/299 would make more sense to me.
This code examines an array named x, and finds its minimum value min(x) and its maximum value max(x). It takes the maximum value and divides it by the constant 300.
It doesn't explicitly name any variable, setting it equal to max(x)/300, but for the sake of explanation, I'm naming it "incr", short for increment.
And, it creates a vector named p. p looks something like this:
p = [min(x), min(x) + incr, min(x) + 2*incr, ..., min(x) + 299*incr, max(x)];

Subtracting matrices of unequal dimension

When I try to do the following command I get an error.
err = sqrt(mean((xi256-xc256).^2))
I am aware that the matrix sizes are different.
whos xi256 xc256` gives:
Name Size Bytes Class Attributes
xc256 27x1 216 double
xi256 513x1 4104 double
I am supposed to negate find the difference of these two matrices. In fact the command given at the top was in the course notes and the course has been running for several years! I have tried googling ways to resolve this error to perform this subtraction regardless but have found no solution. Maybe one of the matrices can be scaled to match the dimensions of the other? However, I have not been able to find any such functions that would let me do this.
I need to find the RMS error in a given set of data. xc256 was calculated through a numerical method, xi256 gives the true value.
Edit: I was able to use another set of results.
First check that xc256 is correctly computed and that you do not have a matrix transposition gone wrong or something like that. Computing the RMS between two vector of different sizes does not make sense, and padding or replicating will get you rid of the error, but is most probably not what you actually want.
There are just two situations that I can think of, I will list them here:
The line is wrong (not likely as it looks pretty normal, but make sure to check the book!)
The input of the line is wrong
Assuming it is in point 2, again there are two possibilities:
xi256 is of incorrect size (likelyhood of this depends on how you got it)
xc256 is of incorrect size
Assuming it is again point 2, there are yet again 2 likely possibilities:
xc256 should be a scalar
xc256 should be a vector with the same size as xi256
From here on there is insufficient information to continue, but check whether you accidentally made it 27 times as long, or 19 times too short somewhere. Just use some breakpoints throughout the code to see how the size develops.

get vector which's mean is zero from arbitrary vector

as i know to get zero mean vector from given vector,we should substract mean of given vector from each memeber of this vector.for example let us see following example
r=rand(1,6)
we get
0.8687 0.0844 0.3998 0.2599 0.8001 0.4314
let us create another vector s by following operation
s=r-mean(r(:));
after this we get
0.3947 -0.3896 -0.0743 -0.2142 0.3260 -0.0426
if we calculate mean of s by following formula
mean(s)
we get
ans =
-5.5511e-017
actually as i have checked this number is very small
-5.5511*exp(-017)
ans =
-2.2981e-007
so we should think that our vector has mean zero?so it means that that small deviation from 0 is because of round off error?for exmaple when we are creating white noise or such kind off random uncorrelated sequence of data,actually it is already supposed that even for such small data far from 0,it has zero mean and it is supposed in this case that for example for this case
-5.5511e-017 =0 ?
approximately of course
e-017 means 10 to the power of -17 (10^-17) but still the number is very small and hypothetically it is 0. And if you type
format long;
you will see the real precision used by Matlab
Actually you can refer to the eps command. Although matlab uses double that can encode numbers down to 2.2251e-308 the precission is determined size of the number.
Use it in the format eps(number) - it tell you the how large is the influence of the least significant bit.
on my machine eg. eps(0.3) returns 5.5511e-17 - exactly the number you report.

MATLAB: using the find function to get indices of a certain value in an array

I have made an array of doubles and when I want to use the find command to search for the indices of specific values in the array, this yields an empty matrix which is not what I want. I assume the problem lies in the precision of the values and/or decimal places that are not shown in the readout of the array.
command:
peaks=find(y1==0.8236)
array readout:
y1 =
Columns 1 through 11
0.2000 0.5280 0.8224 0.4820 0.8239 0.4787 0.8235 0.4796 0.8236 0.4794 0.8236
Columns 12 through 20
0.4794 0.8236 0.4794 0.8236 0.4794 0.8236 0.4794 0.8236 0.4794
output:
peaks =
Empty matrix: 1-by-0
I tried using the command
format short
but I guess this only truncates the displayed values and not the actual values in the array.
How can I used the find command to give an array of indices?
By default, each element of a numerical matrix in Matlab is stored using floating point double precision. As you surmise in the question format short and format long merely alter the displayed format, rather than the actual format of the numbers.
So if y1 is created using something like y1 = rand(100, 1), and you want to find particular elements in y1 using find, you'll need to know the exact value of the element you're looking for to floating point double precision - which depending on your application is probably non-trivial. Certainly, peaks=find(y1==0.8236) will return the empty matrix if y1 only contains values like 0.823622378...
So, how to get around this problem? It depends on your application. One approach is to truncate all the values in y1 to a given precision that you want to work in. Funnily enough, a SO matlab question on exactly this topic attracted two good answers about 12 hours ago, see here for more.
If you do decide to go down this route, I would recommend something like this:
a = 1e-4 %# Define level of precision
y1Round = round((1/a) * y1); %# Round to appropriate precision, and leave y1 in integer form
Index = find(y1Round == SomeValue); %# Perform the find operation
Note that I use the find command with y1Round in integer form. This is because integers are stored exactly when using floating point double, so you won't need to worry about floating point precision.
An alternative approach to this problem would be to use find with some tolerance for error, for example:
Index = find(abs(y1 - SomeValue) < Tolerance);
Which path you choose is up to you. However, before adopting either of these approaches, I would have a good hard look at your application and see if it can be reformulated in some way such that you don't need to search for specific "real" numbers from among a set of "reals". That would be the most ideal outcome.
EDIT: The code advocated in the other two answers to this question is neater than my second approach - so I've altered it accordingly.
Testing for equality with floating-point numbers is almost always a bad idea. What you probably want to do is test to see which numbers are close enough to the target value:
peaks = find( abs( y - .8236 ) < .0001 );
The problem is indeed with the precision. The array that you see displayed is not the actual array, as the actual array has more digits for each of the numbers. Changing the format just changes the way in which the array is displayed, so it doesn't solve the problem.
You have two options, either modify the array or modify what you are looking for. It is probably better to modify what you are looking for, since then you are not changing the actual values.
So instead of looking for equality, you can look for proximity (so the difference between the number you are searching for and the number in the array is at most some small epsilon):
peaks = find( abs(y1-0.8236) < epsilon )
In general, when you are dealing with floats, always try to avoid exact comparisons and use some error thresholds, since the representation of these numbers is limited so they are often stored with small inaccuracies.

Problem using the find function in MATLAB

I have two arrays of data that I'm trying to amalgamate. One contains actual latencies from an experiment in the first column (e.g. 0.345, 0.455... never more than 3 decimal places), along with other data from that experiment. The other contains what is effectively a 'look up' list of latencies ranging from 0.001 to 0.500 in 0.001 increments, along with other pieces of data. Both data sets are X-by-Y doubles.
What I'm trying to do is something like...
for i = 1:length(actual_latency)
row = find(predicted_data(:,1) == actual_latency(i))
full_set(i,1:4) = [actual_latency(i) other_info(i) predicted_info(row,2) ...
predicted_info(row,3)];
end
...in order to find the relevant row in predicted_data where the look up latency corresponds to the actual latency. I then use this to created an amalgamated data set, full_set.
I figured this would be really simple, but the find function keeps failing by throwing up an empty matrix when looking for an actual latency that I know is in predicted_data(:,1) (as I've double-checked during debugging).
Moreover, if I replace find with a for loop to do the same job, I get a similar error. It doesn't appear to be systematic - using different participant data sets throws it up in different places.
Furthermore, during debugging mode, if I use find to try and find a hard-coded value of actual_latency, it doesn't always work. Sometimes yes, sometimes no.
I'm really scratching my head over this, so if anyone has any ideas about what might be going on, I'd be really grateful.
You are likely running into a problem with floating point comparisons when you do the following:
predicted_data(:,1) == actual_latency(i)
Even though your numbers appear to only have three decimal places of precision, they may still differ by very small amounts that are not being displayed, thus giving you an empty matrix since FIND can't get an exact match.
One feature of floating point numbers is that certain numbers can't be exactly represented, since they aren't an integer power of 2. This occurs with the numbers 0.1 and 0.001. If you repeatedly add or multiply one of these numbers you can see some unexpected behavior. Amro pointed out one example in his comment: 0.3 is not exactly equal to 3*0.1. This can also be illustrated by creating your look-up list of latencies in two different ways. You can use the normal colon syntax:
vec1 = 0.001:0.001:0.5;
Or you can use LINSPACE:
vec2 = linspace(0.001,0.5,500);
You'd think these two vectors would be equal to one another, but think again!:
>> isequal(vec1,vec2)
ans =
0 %# FALSE!
This is because the two methods create the vectors by performing successive additions or multiplications of 0.001 in different ways, giving ever so slightly different values for some entries in the vector. You can take a look at this technical solution for more details.
When comparing floating point numbers, you should therefore do your comparisons using some tolerance. For example, this finds the indices of entries in the look-up list that are within 0.0001 of your actual latency:
tolerance = 0.0001;
for i = 1:length(actual_latency)
row = find(abs(predicted_data(:,1) - actual_latency(i)) < tolerance);
...
The topic of floating point comparison is also covered in this related question.
You may try to do the following:
row = find(abs(predicted_data(:,1) - actual_latency(i))) < eps)
EPS is accuracy of floating-point operation.
Have you tried using a tolerance rather than == ?