I have a program in which
e=0.01;
test3=Ask(1,2)-Bid(1,2);
(Bid and Ask are both matrices generated by a for loop).
I know that test3 can sometimes equal e, but MATLAB gave me this:
>>test3
e
test3==e
test3>e
test3 =
0.0100
e =
0.0100
ans =
0
ans =
1
What's wrong with it? Thank you!
EDIT: I tried format long and then I got
Ask(1,2) =
8.620000000000001
Bid(1,2) =
8.609999999999999
test3 =
0.010000000000002
No wonder I got it wrong. But I've already used price=roundn(r*v,-2), and both Ask(1,2) and Bid(1,2) are equal to some price, so they should have only two decimal places. What can I do now if I want to round them to exactly two decimal places? Thanks again!
In general, roundoff error makes it difficult to compare floating-point numbers for exact equality.
0.1 is a nice decimal number 1/10, but stored on a computer in binary it is a repeating fraction and not stored exactly.
So just for example:
x = 0.2;
y = 0.1 + 0.1;
x == y
will not be true.
[Well, unfortunately, as rwong points out below, this particular example is wrong: 0.1 + 0.1 == 0.2 does evaluate to true. I should have tried it! A case that does fail is 0.1 + 0.2 == 0.3. Still, in general, there will be roundoff!]
Sometimes the error will be big enough to see in the 16th digit, which is why you got the comment to try format long. But sometimes it might not be visible. The bottom line is:
NEVER USE == to compare two floating-point numbers that come out of arithmetic. It is almost always false, and the mismatch usually reflects nothing but roundoff.
What you want is to test whether two numbers are very close to each other, which is:
abs(x-y) < 0.00001
for some small limit.
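As a rough sketch of how that test applies to the numbers in the question: the tolerance `tol = 1e-9` and the variable names below are assumptions chosen for illustration, not part of the original post. The second half shows one possible way to sidestep the problem for prices, by comparing whole cents instead of fractional values.

```matlab
% Values copied from the question's "format long" output.
ask = 8.620000000000001;
bid = 8.609999999999999;
e   = 0.01;
test3 = ask - bid;                 % about 0.010000000000002

% Tolerance-based comparison (tol is an assumed value, far below one cent).
tol = 1e-9;
isEqual   = abs(test3 - e) < tol;  % treat values within tol as equal
isGreater = (test3 - e) > tol;     % "greater" only if clearly above e

% Alternative: work in whole cents, which doubles represent exactly.
centsDiff    = round(ask*100) - round(bid*100);   % 862 - 861
isEqualCents = (centsDiff == round(e*100));

disp([isEqual, isGreater, isEqualCents])
```

With these inputs the tolerance test reports equality rather than "greater than", which is what the question expected.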
If test3 = 0.01000000000000001 or some such and e = 0.01, then you are likely to get this sort of result. Printing test3-e will give you a clue. This is why, in the safety-critical field, float == float is always something that raises an alarm bell with auditors.
Related
I am trying to figure out how to write a math-based app with MATLAB, but I cannot seem to get the Monte Carlo method of integration to work. I feel that I do not have the algorithm thought out correctly either. As of now, I have something like:
// For the function {integral of cos(x^3)*exp(x^(1/2))+x dx
// from x = 0 to x = 10
ans = 0;
for i = 1:100000000
x = 10*rand;
ans = ans + cos(x^3)*exp(x^(1/2))+x
end
I feel that this is completely wrong because my outputs are hardly even close to what is expected. How should I correctly write this? Or, how should the algorithm for setting this up look?
Two issues:
1) If you look at what you're calculating, "ans" is going to grow as i increases. By putting a huge number of samples, you're just increasing your output value. How could you normalize this value so that it stays relatively the same, regardless of number of samples?
2) Think about what you're trying to calculate here. Your current "ans" is giving you the sum of 100000000 independent random measurements of the output to your function. What does this number represent if you divide by the number of samples you've taken? How could you combine that knowledge with the range of integration in order to get the expected area under the curve?
I managed to solve this with the formula I found here. I ended up using:
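n = 0; % running sum of the sampled function values
for i = 1:100000000
x = 10*rand; % uniform sample in [0, 10]
n = n + cos(x^3)*exp(x^(1/2))+x;
end
% (b - a) times the mean of the samples
ans = ((10-0)/100000000)*n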
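For comparison, here is a vectorized sketch of the same estimator, (b - a) times the mean of the sampled function values. The names `f`, `N`, `estimate` and `stdErr`, the smaller sample count, and the fixed seed are all assumptions made for illustration. It also reports a standard error, which gives a rough idea of how far the estimate may be from the true integral.

```matlab
% Integrand from the question, written element-wise so it accepts vectors.
f = @(x) cos(x.^3).*exp(sqrt(x)) + x;

a = 0;  b = 10;      % integration limits
N = 1e6;             % number of samples (assumed; the question used 1e8)
rng(0);              % fixed seed so the run is repeatable

x  = a + (b - a)*rand(N, 1);   % uniform samples in [a, b]
fx = f(x);

estimate = (b - a)*mean(fx);          % Monte Carlo estimate of the integral
stdErr   = (b - a)*std(fx)/sqrt(N);   % standard error of that estimate

fprintf('estimate = %.4f (standard error %.4f)\n', estimate, stdErr);
```

Because the error shrinks like 1/sqrt(N), a hundredfold increase in samples buys only about one more decimal digit.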
I want to calculate the percentage of accuracy. I have the code below, but it gives unexpected output like this: "The accuracy is 2.843137e+01x37".
The expected result is "The accuracy is 28.43%".
y %Amount of correct data
j %Amount of all data
a = 'The accuracy is %dx%d.';
percent = '%.0f%%';
format short
acc = 100 * double(y/j);
sprintf (a,acc)
How to fix it?
Any help would be so much appreciated.
Thank you.
You almost have what you expected, just put it together the right way.
The correct format specifier for 28.43% is %.2f%%. This gives you two digits after the decimal point and adds the %-sign at the end. You have that defined in the variable percent, except that .0 should be .2 for two digits as you have written in the expected result. If you look closely, you'll realize that percent is never used.
Let's come to the conclusion. Change the format specifier to the following:
a = 'The accuracy is %.2f%%';
That's all you need to do. The line defining percent can be omitted as well as format short unless you need this for something later on.
Something important regarding the cast to double: what you currently have only casts the result of the division. If y and j are integer types, y/j has already been rounded to an integer by then, so if a cast is necessary, apply it to y and/or j individually before the division. You probably don't need any casting in your case.
The whole code with an assumption for y and j is:
y = 28.43137; %// Amount of correct data
j = 100; %// Amount of all data
a = 'The accuracy is %.2f%%';
acc = 100 * (y/j); %// no cast
% acc = 100 * (double(y)/double(j)); %// with cast
sprintf(a,acc)
Output:
ans =
The accuracy is 28.43%
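To illustrate the point about where the cast goes, here is a small sketch. The counts 29 and 102 and the `int32` type are assumptions picked so that the ratio matches the 28.43% in the question; the original post does not say what types `y` and `j` actually have.

```matlab
% Hypothetical integer counts (assumed values and type).
y = int32(29);    % amount of correct data
j = int32(102);   % amount of all data

% Casting after the division: y/j is rounded to an int32 first, giving 0.
wrong = 100 * double(y/j);

% Casting before the division keeps the fractional part.
right = 100 * double(y)/double(j);

msg = sprintf('The accuracy is %.2f%%', right);
disp(msg)
```

If `y` and `j` are already doubles, both expressions give the same value and no cast is needed.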
Try,
a = 'The accuracy is %f.';
acc = 100 * double(y/j);
sprintf (a,acc)
First, the data:
orig = reshape([0.0000000000000000 0.3480000000000000 0.7570000000000000 1.3009999999999999 2.8300000000000001 4.7519999999999998 5.2660000000000000 5.8120000000000003 14.3360000000000000 15.3390000000000000 ],[10 1])
change = reshape([0.0000000000000000 0.3480000000000000 0.0000000000000000 0.9530000000000000 1.5290000000000001 1.9219999999999997 0.5140000000000002 0.5460000000000003 0.0000000000000000 9.5270000000000010 ],[10 1])
change = cumsum(change)
orig is a vector of seconds elapsed. change is a vector derived by taking differences between (some) elements of orig. The cumulative sum of change has some elements actually equal to the corresponding element in orig.
However, due to precision issues:
diff = orig - change
gives
diff =
0
0
0.409
0
0
0
0
0
8.524
-1.77635683940025e-15
It seems that if I run the following command:
diff(abs(diff) <= eps(orig)) = 0
then this sets entries which should be zero, but are not due to precision issues, to be zero.
My question is, is this the correct way to do it? Why is the comparison <= instead of <? Should the statement be:
diff(abs(diff) < k*eps(orig)) = 0
for some k > 1 to give some tolerance? If so, how would one pick k?
In case it is necessary to know how change is derived from orig, the following alternate example also shows this behaviour:
orig = reshape([0.0000000000000000 0.3480000000000000 0.7570000000000000 1.3009999999999999 2.8300000000000001 4.7519999999999998 5.2660000000000000 5.8120000000000003 14.3360000000000000 15.3390000000000000 ],[10 1])
change = orig - [0; orig(1:end-1)]
change = cumsum(change)
diff = orig - change
The following statement will be true only if the "almost zero" value is off by no more than 1 bit.
abs(diff) <= eps(orig)
1 bit is a ridiculously high precision to ask for, a precision that you most likely cannot achieve. Generally, you need to define your threshold yourself, such as
abs(diff) <= 1e-12
You also ask how to choose this value. Answer: there is no way we can tell you that. It is algorithm, application, unit, computer, [...] specific.
You are computing distance between particles? Maybe you need a smaller tolerance. You are doing economic profit calculus? 1e-12 is then a decimal you won't get in cash, for sure. Use 1e-4 instead. Or are you using an algorithm that does numerical approximations? Then you need a higher tolerance. How much tolerance you are OK with is, and will always be, a user choice.
Note: you need to be aware of the types you are using to set this minimum threshold correctly. MATLAB uses double as default, but if you are using another type such as single, then this threshold is too strict. As an alternative, you can use
abs(diff) <= 100*eps(class(diff))
if your data type is not fixed/known.
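As a sketch of a scaled tolerance applied to the data from the question: the factor `k = 100` and the choice to scale by the largest magnitude in `orig` are assumptions, and the residual is named `d` here to avoid shadowing MATLAB's built-in `diff` function.

```matlab
% Data from the question (seconds elapsed).
orig = [0; 0.348; 0.757; 1.301; 2.83; 4.752; 5.266; 5.812; 14.336; 15.339];

% Differences followed by a cumulative sum, as in the alternate example.
change = cumsum(orig - [0; orig(1:end-1)]);
d = orig - change;              % mathematically zero, up to roundoff

% Tolerance: k spacings of the largest value involved (k is an assumed choice).
k   = 100;
tol = k * eps(max(abs(orig)));

d(abs(d) <= tol) = 0;           % snap roundoff-sized residuals to zero
disp(d.')
```

Scaling by the largest magnitude allows for error that accumulates across the whole sum, which a per-element `eps(orig)` test does not.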
Let's say I have a random variable a=1.2400, and I want to print it with four significant figures, i.e., 1.240. How would I go about that?
fprintf('%0.4g',a) % drops rightmost zero
fprintf('%0.3f',a) % give too many sig figs if a >= 10
Using '%g' drops the important zeros, and with '%f' I can only specify the number of digits after the decimal point, which results in too many significant figures if, say, a=10.04. I'm not too familiar with formatting, but there has to be a simple method. I haven't found it in my searches.
If the values to be printed are all less than 10000, you can do the following. (Sorry, only tested in octave.)
octave:62> a = 1.24
a = 1.2400
octave:63> sprintf('%.*f\n', 3-floor(log10(abs(a))), a)
ans = 1.240
octave:64> a = 234.56
a = 234.56
octave:65> sprintf('%.*f\n', 3-floor(log10(abs(a))), a)
ans = 234.6
For more about the expression floor(log10(abs(a))), see How can I get the exponent of each number in a np.array?
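The same idea can be sketched for several magnitudes at once. The variable `n` (the number of significant figures), the sample values, and the `max(..., 0)` guard against a negative precision are assumptions added for illustration.

```matlab
n    = 4;                           % significant figures wanted (assumed)
vals = [1.24, 10.04, 234.56, 0.0125];
out  = cell(size(vals));

for i = 1:numel(vals)
    a = vals(i);
    % floor(log10(abs(a))) is the decimal exponent of the leading digit,
    % so n - 1 - exponent digits are needed after the decimal point.
    decimals = max(n - 1 - floor(log10(abs(a))), 0);
    out{i} = sprintf('%.*f', decimals, a);
    disp(out{i})
end
```

The guard means values with more than `n` integer digits are printed without decimals rather than rounded to `n` significant figures.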
If you don't mind exponential notation, another alternative is to use '%.3e' to always get the same number of significant digits:
octave:70> a = 1.24
a = 1.2400
octave:71> sprintf('%.3e\n', a)
ans = 1.240e+00
octave:72> a = 234.56
a = 234.56
octave:73> sprintf('%.3e\n', a)
ans = 2.346e+02
I decided to build on the answer by Warren, and I wrote a function that should work for both small and large numbers alike. Perhaps someone will improve on this, but I am pleased with it.
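function str = sigfigstr(a, sigfigs)
% Format a with the requested number of significant figures.
exponent = floor(log10(abs(a)));      % decimal exponent of the leading digit
numdecimal = sigfigs - 1 - exponent;  % digits needed after the decimal point
if numdecimal < 0
    % More integer digits than significant figures: round, print as an integer.
    str = sprintf('%.0f', round(a, sigfigs, 'significant'));
else
    str = sprintf('%.*f', numdecimal, a);
end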
Here are a few examples of it in action in MATLAB:
>> sigfigstr(.000012431634,3)
ans = '0.0000124'
>> sigfigstr(26666,3)
ans = '26700'
Does anyone know how to make the following Matlab code approximate the exponential function more accurately when dealing with large and negative real numbers?
For example when x = 1, the code works well, when x = -100, it returns an answer of 8.7364e+31 when it should be closer to 3.7201e-44.
The code is as follows:
s=1
a=1;
y=1;
for k=1:40
a=a/k;
y=y*x;
s=s+a*y;
end
s
Any assistance is appreciated, cheers.
EDIT:
Ok so the question is as follows:
Which mathematical function does this code approximate? (I say the exponential function.) Does it work when x = 1? (Yes.) Unfortunately, using this when x = -100 produces the answer s = 8.7364e+31. Your colleague believes that there is a silly bug in the program, and asks for your assistance. Explain the behaviour carefully and give a simple fix which produces a better result. [You must suggest a modification to the above code, or its use. You must also check your simple fix works.]
So I somewhat understand that the problem involves large numbers: when there are 16 (or more) orders of magnitude between terms, precision is lost. But the solution eludes me.
Thanks
EDIT:
So in the end I went with this:
s = 1;
x = -100;
a = 1;
y = 1;
x1 = 1;
for k=1:40
x1 = x/10;
a = a/k;
y = y*x1;
s = s + a*y;
end
s = s^10;
s
Not sure if it's completely correct but it returns some good approximations.
exp(-100) = 3.720075976020836e-044
s = 3.722053303838800e-044
After further analysis (and unfortunately after submitting the assignment), I realised that increasing the number of iterations, and thus the number of terms, further improves accuracy. In fact the following was even more accurate:
s = 1;
x = -100;
a = 1;
y = 1;
x1 = 1;
for k=1:200
x1 = x/200;
a = a/k;
y = y*x1;
s = s + a*y;
end
s = s^200;
s
Which gives:
exp(-100) = 3.720075976020836e-044
s = 3.720075976020701e-044
As John points out in a comment, you have an error inside the loop. The y = y*k line does not do what you need. Look more carefully at the terms in the series for exp(x).
Anyway, I assume this is why you have been given this homework assignment, to learn that series like this don't converge very well for large values. Instead, you should consider how to do range reduction.
For example, can you use the identity
exp(x+y) = exp(x)*exp(y)
to your advantage? Suppose you store the value of exp(1) = 2.7182818284590452353...
Now, if I were to ask you to compute the value of exp(1.3), how would you use the above information?
exp(1.3) = exp(1)*exp(0.3)
But we KNOW the value of exp(1) already. In fact, with a little thought, this will let you reduce the range for an exponential down to needing the series to converge rapidly only for abs(x) <= 0.5.
Edit: There is a second way one can do range reduction using a variation of the same identity.
exp(x) = exp(x/2)*exp(x/2) = exp(x/2)^2
Thus, suppose you wish to compute the exponential of a large number, perhaps 12.8. Getting this to converge acceptably fast will take many terms in the simple series, and there will be a great deal of subtractive cancellation happening, so you won't get good accuracy anyway. However, if we recognize that
12.8 = 2*6.4 = 2*2*3.2 = ... = 16*0.8
then IF you could efficiently compute the exponential of 0.8, then the desired value is easy to recover, perhaps by repeated squaring.
exp(12.8)
ans =
362217.449611248
a = exp(0.8)
a =
2.22554092849247
a = a*a;
a = a*a;
a = a*a;
a = a*a
a =
362217.449611249
exp(0.8)^16
ans =
362217.449611249
Note that WHENEVER you do range reduction using methods like this, while you may incur numerical problems due to the additional computations necessary, you will usually come out way ahead due to the greatly enhanced convergence of your series.
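A minimal sketch of this halving-and-squaring reduction applied to x = -100 might look like the following. The variable names, the rule for choosing the number of halvings `m`, and the fixed count of 20 series terms are assumptions made for illustration, not the assignment's intended solution.

```matlab
x = -100;

% Halve x repeatedly until the reduced argument satisfies |r| <= 1.
m = ceil(log2(max(abs(x), 1)));   % 7 halvings for |x| = 100
r = x / 2^m;                      % -100/128 = -0.78125

% Taylor series for exp(r); with |r| <= 1 it converges quickly.
s = 1;
term = 1;
for k = 1:20
    term = term * r / k;          % r^k / k!
    s = s + term;
end

% Undo the reduction: exp(x) = exp(r)^(2^m), by repeated squaring.
for i = 1:m
    s = s*s;
end

relErr = abs(s - exp(x)) / exp(x);
fprintf('s = %.15e, relative error = %.2e\n', s, relErr);
```

Each squaring roughly doubles the relative error, so a few digits are lost to the reduction, but the series itself no longer suffers from cancellation between huge terms.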
Why do you think that's the wrong answer? Look at the last term of that sequence, and its size, and tell me why you expect you should have an answer that's close to 0.
My original answer stated that roundoff error was the problem. That will be a problem with this basic approach, but why do you think 40 is enough terms for the appropriate mathematical (as opposed to computer floating-point arithmetic) answer?
100^40 / 40! ~= 1.2 * 10^32.
woodchips has the right idea with range reduction. That's the typical approach people use to implement these kinds of functions very quickly. Once you get that all figured out, you deal with the roundoff errors of alternating series by summing adjacent terms within the loop and stepping with k = 1 : 2 : 40 (for instance). That doesn't work here until you use woodchips's idea, because for x = -100 the summands grow for a very long time. You need |x| < 1 to guarantee intermediate terms are shrinking, and thus a rewrite will work.
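To see the size of the terms directly, one can tabulate |x|^k / k! for k = 0 to 40. This is only a sketch, and the variable names are assumptions.

```matlab
x = -100;
k = 0:40;

% Magnitude of each series term, |x|^k / k!.
terms = abs(x).^k ./ factorial(k);

lastTerm = terms(end);               % term for k = 40
stillGrowing = all(diff(terms) > 0); % true if every term exceeds the previous one

fprintf('last term = %.4e, still growing: %d\n', lastTerm, stillGrowing);
```

The terms only start to shrink once k exceeds |x| = 100, so stopping at k = 40 cuts the series off while it is still growing.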