I have some code that works as a nonlinear equation system solver.
I am having trouble with a command that goes like this:
newt[0]:=[-2.,20]:
I don't know what that dot does there!
I thought it might be there to show that it is -2.0, but there would be no reason for that if, by default, -2 = -2.0.
Can anyone help me with this?
The dot forces float calculations
It is not correct that by default -2 = -2.0. There is a very big difference in how Maple calculates: if you use -2 it calculates exactly (arithmetic expressions), while -2.0 tells Maple to calculate with floats (numerical expressions).
The two expressions -2.*sqrt(5) and -2*sqrt(5.) are quite different in how Maple handles them, if you notice the float position! For the first example, the square root is calculated arithmetically, while in the second example it is calculated numerically!
This can be a very big deal for some calculations, with regard to both speed and precision, and should be considered carefully when doing complicated computations.
Speed example: Calculate exp(x) for x = 1,2,...,50000. (Arithmetic > numerical)
CodeTools:-Usage(seq(exp(x),x=1..50000)): # Arithmetic
memory used=19.84MiB, alloc change=0 bytes, cpu time=875.00ms,
real time=812.00ms, gc time=265.62ms
CodeTools:-Usage(seq(exp(1.*x),x=1..50000)): # Numerical
memory used=292.62MiB, alloc change=0 bytes, cpu time=9.67s,
real time=9.45s, gc time=1.09s
Notice especially the huge difference in memory used.
This is an example where using floats gives worse performance. Conversely, if we are just approximating anyway, numerical evaluation is much faster.
Approximate exp(1) (numerical > arithmetic)
CodeTools:-Usage(seq((1+1/x)^x,x=1..20000)): # Arithmetic
memory used=0.64GiB, alloc change=0 bytes, cpu time=39.05s,
real time=40.92s, gc time=593.75ms
CodeTools:-Usage(seq((1+1./x)^x,x=1..20000)): # Numerical
memory used=56.17MiB, alloc change=0 bytes, cpu time=1.06s,
real time=1.13s, gc time=0ns
Precision example: For precision, things can go very wrong if one is not careful.
f:=x->(Pi-x)/sin(x);
limit(f(x),x=Pi); # Arithmetic returns 1 (true value)
limit(f(x),x=Pi*1.); # Numerical returns 0 (wrong!!!)
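As a rough analogue in Python (assuming sympy is available; this is only an illustration of the same exact-versus-float distinction, not a Maple example):
import math
import sympy as sp

# Exact (symbolic) arithmetic: sqrt(2)**2 simplifies to exactly 2.
exact = sp.sqrt(2) ** 2
print(exact == 2)       # True

# Float arithmetic: math.sqrt(2)**2 picks up rounding error.
approx = math.sqrt(2) ** 2
print(approx == 2)      # False (2.0000000000000004)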
After working with it a little, I finally found what it does!
Short answer: it determines how the result of the expression is calculated when those two numbers are used as inputs (exactly or as floats).
Extended answer (example):
Given two functions, we want to calculate the Jacobian matrix for this equation system.
with(linalg);
with(plots);
f := (x, y) -> (1/64)*(x-11)^2-(1/100)*(y-7)^2-1;
g := (x, y) -> (x-3)^2+(y-1)^2-400;
Then we put the functions in a vector:
F:=(x, y) -> vector([f(x,y),g(x,y)]);
F(-2 ,20)
F(-2.,20)
The results will be:
[-79/1600 -14]
[-0.049375000 -14]
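As a quick cross-check in Python (my own aside, using fractions.Fraction for the exact value), the two forms of the first entry agree:
from fractions import Fraction

x, y = Fraction(-2), Fraction(20)
f_exact = Fraction(1, 64)*(x - 11)**2 - Fraction(1, 100)*(y - 7)**2 - 1
print(f_exact)          # -79/1600, the exact result
print(float(f_exact))   # -0.049375, the float result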
I noticed that SciPy has an implementation of the Discrete Sine Transform, and I was comparing it to the one that's in MATLAB. The MATLAB documentation notes that for best performance, the size of the inputs should be 2^p -1, presumably for a divide and conquer strategy. Is this also true for the SciPy implementation?
Although this question is old, I happen to have just run some tests and then stumbled upon this question.
The answer is yes. Internally, scipy seems to convert the array to size M = 2*(N+1).
Ideally, M = 2^i for some integer i; therefore, N should follow N = 2^i - 1. A plot of timing against FFT size (legend below) shows how timings scale. Note that the orange line is much smoother, indicating no unexpected memory overhead.
Green line: N = 2^i
Blue line: N = 2^i + 1
Orange line: N = 2^i - 1
UPDATE
After digging some more into the documentation of scipy.fftpack, I found that the above answer is only partly true. According to the documentation, "SciPy's FFTPACK has efficient functions for radix {2, 3, 4, 5}". This means that instead of efficiently handling only arrays of size M = 2^i, it can handle any M = 2^i * 3^j * 5^k (4 is not a prime). The optimum input length for scipy.fftpack.dst (or dct) is then M - 1, for M of that form. Finding those numbers can be a little awkward, but luckily there's a function for that, too (see the sketch below)!
Please note that the above graph is on a log-log scale, so speedups of 40 or so are not uncommon. Thus, choosing a fast size can make your calculations orders of magnitude faster! (I found this out the hard way.)
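The function alluded to above is, as far as I can tell, next_fast_len; a minimal sketch of picking a fast length for dst (assuming SciPy >= 0.18):
import numpy as np
from scipy import fftpack

# next_fast_len returns the smallest 5-smooth number >= target.
target = 10**6
M = fftpack.next_fast_len(target)
N = M - 1                      # "fast" input length for dst/dct, per the reasoning above

x = np.random.rand(N)
y = fftpack.dst(x, type=1)     # DST-I; internal FFT size 2*(N+1) = 2*M is also 5-smooth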
I am trying to calculate some integrals that use very high power exponents. An example equation is:
(-exp(-(x+sqrt(p)).^2)+exp(-(x-sqrt(p)).^2)).^2 ...
./( exp(-(x+sqrt(p)).^2)+exp(-(x-sqrt(p)).^2)) ...
/ (2*sqrt(pi))
where p is constant (1000 being a typical value), and I need the integral for x=[-inf,inf]. If I use the integral function for numeric integration I get NaN as a result. I can avoid that if I set the limits of the integration to something like [-20,20] and a low p (<100), but ideally I need the full range.
I have also tried setting syms x and using int and vpa, but in this case vpa returns:
1.0 - 1.0*numeric::int((1125899906842624*(exp(-(x - 10*10^(1/2))^2) - exp(-(x + 10*10^(1/2))^2))^2)/(3991211251234741*(exp(-(x - 10*10^(1/2))^2) + exp(-(x + 10*10^(1/2))^2)))
without calculating a value. Again, if I set the limits of the integration to lower values I do get a result (also for low p), but I know that the result that I get is wrong – e.g., if x=[-100,100] and p=1000, the result is >1, which should be wrong as the equation should be asymptotic to 1 (or alternatively the codomain should be [0,1) ).
Am I doing something wrong with vpa or is there another way to calculate high precision values for my integrals?
First, you're doing something that makes solving symbolic problems more difficult and less accurate. The variable pi is a floating-point value, not an exact symbolic representation of the fundamental constant. In Matlab symbolic math code, you should always use sym('pi'). You should do the same for any other special numeric values you use, e.g., sqrt(sym('2')) and exp(sym('1')), or they will get converted to an approximate rational fraction by default (the source of the strange large numbers you see in the code in your question). For further details, I recommend that you read through the documentation for the sym function.
Applying the above, here's a runnable example:
syms x;
p = 1000;
f = (-exp(-(x+sqrt(p)).^2)+exp(-(x-sqrt(p)).^2)).^2./(exp(-(x+sqrt(p)).^2)...
+exp(-(x-sqrt(p)).^2))/(2*sqrt(sym('pi')));
Now vpa(int(f,x,-100,100)) and vpa(int(f,x,-1e3,1e3)) return exactly 1.0 (to 32 digits of precision, see below).
Unfortunately, vpa(int(f,x,-Inf,Inf)) does not return an answer, but rather a call to the underlying MuPAD function numeric::int. As I explain in this answer, this is what can happen when int cannot obtain a result. Normally, it should try to evaluate the integral numerically, but your function appears to be ill-defined at ±∞, resulting in divide-by-zero issues that the variable precision quadrature methods can't handle well. You can evaluate the integral at wider bounds by increasing the variable precision using the digits function (just remember to set digits back to the default of 32 when done). Setting digits(128) allowed me to evaluate vpa(int(f,x,-1e4,1e4)). You can also more efficiently evaluate your integral over a wider range via 2*vpa(int(f,x,0,1e4)) at lower effective digits settings.
If your goal is to see exactly how much less than one p = 1000 corresponds to, you can use something like vpa(1-2*int(f,x,0,1e4)). At digits(128), this returns
0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000086457415971094118490438229708839420392402555445545519907545198837816908450303280444030703989603548138797600750757834260181259102
Applying double to this shows that it is approximately 8.6e-89.
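As an independent sanity check outside MATLAB, here is a rough sketch assuming Python's mpmath is available; arbitrary-precision quadrature gives the same picture, with the full-line integral indistinguishable from 1 at ordinary working precision:
from mpmath import mp, mpf, exp, sqrt, pi, quad, inf

mp.dps = 50
p = mpf(1000)

def f(x):
    a = exp(-(x + sqrt(p))**2)
    b = exp(-(x - sqrt(p))**2)
    return (b - a)**2 / (a + b) / (2*sqrt(pi))

# Split at the two Gaussian bumps near +-sqrt(p) to help the quadrature.
val = quad(f, [-inf, -sqrt(p), 0, sqrt(p), inf])
print(val)   # ~1.0 at 50-digit working precision; the true deficit
             # (cf. the ~8.6e-89 found above) is far below that level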
I need to compute sin(4^x) with x > 1000 in Matlab, which is basically sin(4^x mod 2π). Since the values inside the sin function become very large, Matlab returns Inf for 4^1000. How can I efficiently compute this?
I prefer to avoid large data types.
I think that a transformation to something like sin(n*π+z) could be a possible solution.
You need to be careful, as there will be a loss of precision. The sin function is periodic, but 4^1000 is a big number. So effectively, we subtract off a multiple of 2*pi to move the argument into the interval [0,2*pi).
4^1000 is roughly 1e600, a really big number. So I'll do my computations using my high precision floating point tool in MATLAB. (In fact, one of my explicit goals when I wrote HPF was to be able to compute a number like sin(1e400). Even if you are doing something for the fun of it, doing it right still makes sense.) In this case, since I know that the power we are interested in is roughly 1e600, then I'll do my computations in more than 600 digits of precision, expecting that I'll lose 600 digits by the subtractive cancellation. This is a massive subtractive cancellation issue. Think about it. That modulus operation is effectively a difference between two numbers that will be identical for the first 600 digits or so!
X = hpf(4,1000);
X^1000
ans =
114813069527425452423283320117768198402231770208869520047764273682576626139237031385665948631650626991844596463898746277344711896086305533142593135616665318539129989145312280000688779148240044871428926990063486244781615463646388363947317026040466353970904996558162398808944629605623311649536164221970332681344168908984458505602379484807914058900934776500429002716706625830522008132236281291761267883317206598995396418127021779858404042159853183251540889433902091920554957783589672039160081957216630582755380425583726015528348786419432054508915275783882625175435528800822842770817965453762184851149029376
What is the nearest multiple of 2*pi that does not exceed this number? We can get that by a simple operation.
twopi = 2*hpf('pi',1000);
twopi*floor(X^1000/twopi)
ans = 114813069527425452423283320117768198402231770208869520047764273682576626139237031385665948631650626991844596463898746277344711896086305533142593135616665318539129989145312280000688779148240044871428926990063486244781615463646388363947317026040466353970904996558162398808944629605623311649536164221970332681344168908984458505602379484807914058900934776500429002716706625830522008132236281291761267883317206598995396418127021779858404042159853183251540889433902091920554957783589672039160081957216630582755380425583726015528348786419432054508915275783882625175435528800822842770817965453762184851149029372.6669043995793459614134256945369645075601351114240611660953769955068077703667306957296141306508448454625087552917109594896080531977700026110164492454168360842816021326434091264082935824243423723923797225539436621445702083718252029147608535630355342037150034246754736376698525786226858661984354538762888998045417518871508690623462425811535266975472894356742618714099283198893793280003764002738670747
As you can see, the first 600 digits were the same. Now, when we subtract the two numbers,
X^1000 - twopi*floor(X^1000/twopi)
ans =
3.333095600420654038586574305463035492439864888575938833904623004493192229633269304270385869349155154537491244708289040510391946802229997388983550754583163915718397867356590873591706417575657627607620277446056337855429791628174797085239146436964465796284996575324526362330147421377314133801564546123711100195458248112849130937653757418846473302452710564325738128590071680110620671999623599726132925263826
This is why I referred to it as a massive subtractive cancellation issue. The two numbers were identical for many digits. Even carrying 1000 digits of accuracy, we lost many digits. When you subtract the two numbers, even though we are carrying a result with 1000 digits, only the highest order 400 digits are now meaningful.
HPF is able to compute the trig function of course. But as we showed above, we should only trust roughly the first 400 digits of the result. (On some problems, the local shape of the sin function might cause us to lose more digits than that.)
sin(X^1000)
ans =
-0.1903345812720831838599439606845545570938837404109863917294376841894712513865023424095542391769688083234673471544860353291299342362176199653705319268544933406487071446348974733627946491118519242322925266014312897692338851129959945710407032269306021895848758484213914397204873580776582665985136229328001258364005927758343416222346964077953970335574414341993543060039082045405589175008978144047447822552228622246373827700900275324736372481560928339463344332977892008702220160335415291421081700744044783839286957735438564512465095046421806677102961093487708088908698531980424016458534629166108853012535493022540352439740116731784303190082954669140297192942872076015028260408231321604825270343945928445589223610185565384195863513901089662882903491956506613967241725877276022863187800632706503317201234223359028987534885835397133761207714290279709429427673410881392869598191090443394014959206395112705966050737703851465772573657470968976925223745019446303227806333289071966161759485260639499431164004196825
So am I right, and we cannot trust all of these digits? I'll do the same computation, once in 1000 digits of precision, then a second time in 2000 digits. Compute the absolute difference, then take the log10. The 2000 digit result will be our reference as essentially exact compared to the 1000 digit result.
double(log10(abs(sin(hpf(4,[1000 0])^1000) - sin(hpf(4,[2000 0])^1000))))
ans =
-397.45
Ah. So of those 1000 digits of precision we started out with, we lost 602 digits. The last 602 digits in the result are non-zero, but still complete garbage. This was as I expected. Just because your computer reports high precision, you need to know when not to trust it.
Can we do the computation without recourse to a high precision tool? Be careful. For example, suppose we use a powermod type of computation: compute the desired power while taking the modulus at every step. Done in double precision:
X = 1;
for i = 1:1000
X = mod(X*4,2*pi);
end
sin(X)
ans =
0.955296299215251
Ah, but remember that the true answer was -0.19033458127208318385994396068455455709388...
So there is essentially nothing of significance remaining. We have lost all our information in that computation. As I said, it is important to be careful.
What happened was after each step in that loop, we incurred a tiny loss in the modulus computation. But then we multiplied the answer by 4, which caused the error to grow by a factor of 4, and then another factor of 4, etc. And of course, after each step, the result loses a tiny bit at the end of the number. The final result was complete crapola.
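The same experiment is easy to reproduce outside MATLAB; here is a rough Python sketch (assuming mpmath is installed; it mirrors, rather than reproduces, the HPF computation above):
import math
from mpmath import mp, mpf, floor, sin

# Naive double-precision "powermod": multiply by 4 and reduce mod 2*pi each step.
x = 1.0
for _ in range(1000):
    x = math.fmod(x * 4.0, 2.0 * math.pi)
print(math.sin(x))      # garbage; bears no resemblance to the true value

# High-precision reference: carry ~1100 digits so that the ~600 digits lost to
# subtractive cancellation still leave a meaningful result.
mp.dps = 1100
big = mpf(4) ** 1000                         # exact: 4^1000 is a power of two
r = big - 2*mp.pi*floor(big / (2*mp.pi))     # reduce modulo 2*pi
print(sin(r))                                # approx -0.190334581272083..., as above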
Let's look at the operation for a smaller power, just to convince ourselves of what happened. Here, for example, try the 20th power. Using double precision,
mod(4^20,2*pi)
ans =
3.55938555711037
Now, use a loop in a powermod computation, taking the mod after every step. Essentially, this discards multiples of 2*pi after each step.
X = 1;
for i = 1:20
X = mod(X*4,2*pi);
end
X
X =
3.55938555711037
But is that the correct value? Again, I'll use hpf to compute the correct value, showing the first 20 digits of that number. (Since I've done the computation in 50 total digits, I'll absolutely trust the first 20 of them.)
mod(hpf(4,[20,30])^20,2*hpf('pi',[20,30]))
ans =
3.5593426962577983146
In fact, while the results in double precision agree to the last digit shown, those double results were both actually wrong past the 5th significant digit. As it turns out, we STILL need to carry more than 600 digits of precision for this loop to produce a result of any significance.
Finally, to fully kill this dead horse, we might ask if a better powermod computation can be done. That is, we know that 1000 can be decomposed into a binary form (use dec2bin) as:
512 + 256 + 128 + 64 + 32 + 8
ans =
1000
Can we use a repeated squaring scheme to expand that large power with fewer multiplications, and so cause less accumulated error? Essentially, we might try to compute
4^1000 = 4^8 * 4^32 * 4^64 * 4^128 * 4^256 * 4^512
We would do this by repeatedly squaring 4, taking the mod after each operation. This fails, however, since the modulo operation only removes integer multiples of 2*pi. After all, mod really is designed to work on integers. So look at what happens. We can express 4^2 as:
4^2 = 16 = 3.43362938564083 + 2*(2*pi)
Can we just square the remainder, then take the mod again? NO!
mod(3.43362938564083^2,2*pi)
ans =
5.50662545075664
mod(4^4,2*pi)
ans =
4.67258771281655
We can understand what happened when we expand this form:
4^4 = (4^2)^2 = (3.43362938564083 + 2*(2*pi))^2
What will you get when you remove INTEGER multiples of 2*pi? You need to understand why the direct loop allowed me to remove integer multiples of 2*pi, but the above squaring operation does not. Of course, the direct loop failed too because of numerical issues.
I would first redefine the question as follows: compute 4^1000 modulo 2*pi. So we have split the problem in two.
Use some math trickery:
(a + 2*pi*K)*(b + 2*pi*L) = a*b + 2*pi*(a*L + b*K + 2*pi*K*L) = a*b + 2*pi*(garbage)
Hence, you can just multiply 4 by itself many times, computing mod 2*pi at every stage. The real question to ask, of course, is what the precision of this is. This needs careful mathematical analysis. It may or may not be total crap.
Following Pavel's hint about mod, I found a mod function for high powers on mathworks.com.
bigmod(number,power,modulo) can NOT compute 4^4000 mod 2π, because it only works with integers as the modulo, not with decimals.
So this statement is no longer correct: sin(4^x) is sin(bigmod(4,x,2*pi)).
What is a tight lower bound on the size, N, of the set of irrational numbers, expressed as doubles in Matlab on a 64-bit machine, that I can multiply together while having confidence in k decimal digits of the product? What precision, for example, could I expect after multiplying together ~10^12 doubles encoding different random chunks of pi?
If you ask for a tight bound, the response of @EricPostpischil is the absolute error bound if all operations are performed in IEEE 754 double precision.
If you ask for confidence, I understand it as the statistical distribution of errors. Assuming a uniform distribution of error in [-e/2,e/2], you could try to ask for the theoretical distribution of error after M operations on Math Stack Exchange... I guess the tight bound is somehow very conservative.
Let's illustrate an experimental estimation of those stats with some Smalltalk code (any language having large integer/fraction arithmetic could do):
nOp := 500.
relativeErrorBound := ((1 + (Float epsilon asFraction / 2)) raisedTo: nOp * 2 - 1) - 1.0.
nToss := 1000.
stats := (1 to: nToss)
collect: [:void |
| fractions exactProduct floatProduct relativeError |
fractions := (1 to: nOp) collect: [:e | 10000 atRandom / 3137].
exactProduct := fractions inject: 1 into: [:prod :element | prod * element].
floatProduct := fractions inject: 1.0 into: [:prod :element | prod * element].
relativeError := (floatProduct asFraction - exactProduct) / exactProduct.
relativeError].
s1 := stats detectSum: [:each | each].
s2 := stats detectSum: [:each | each squared].
maxEncounteredError := (stats detectMax: [:each | each abs]) abs asFloat.
estimatedMean := (s1 /nToss) asFloat.
estimatedStd := (s2 / (nToss-1) - (s1/nToss) squared) sqrt.
I get these results for the multiplication of nOp=20 doubles:
relativeErrorBound -> 4.440892098500626e-15
maxEncounteredError -> 1.250926201710214e-15
estimatedMean -> -1.0984634797115124e-18
estimatedStd -> 2.9607828266493842e-16
For nOp=100:
relativeErrorBound -> 2.220446049250313e-14
maxEncounteredError -> 2.1454964094158273e-15
estimatedMean -> -1.8768492273800676e-17
estimatedStd -> 6.529482793500846e-16
And for nOp=500:
relativeErrorBound -> 1.1102230246251565e-13
maxEncounteredError -> 4.550696454362764e-15
estimatedMean -> 9.51007740905571e-17
estimatedStd -> 1.4766176010100097e-15
You can observe that the standard deviation grows much more slowly than the error bound.
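Since any language with large integer/fraction arithmetic will do, here is a rough Python equivalent of the Smalltalk experiment above (my own sketch), using fractions.Fraction for the exact products:
import random
from fractions import Fraction
from statistics import mean, stdev

n_op, n_toss = 20, 1000
errors = []
for _ in range(n_toss):
    values = [Fraction(random.randint(1, 10000), 3137) for _ in range(n_op)]
    exact = Fraction(1)
    approx = 1.0
    for v in values:
        exact *= v               # exact rational product
        approx *= float(v)       # float conversion + float multiply, as in the bound
    errors.append(float((Fraction(approx) - exact) / exact))

print(max(abs(e) for e in errors))   # max encountered relative error
print(mean(errors), stdev(errors))   # estimated mean and standard deviation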
UPDATE: to a first approximation, (1+e)^m = 1 + m*e + O((m*e)^2), so the error distribution is approximately a sum of m uniforms in [-e,e] as long as m*e is small enough, and this sum is very near a normal (Gaussian) distribution of variance m*(2e)^2/12. You can check that std(sum(rand(100,5000))) is near sqrt(100/12) in Matlab.
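A quick check of that last claim in Python/NumPy rather than Matlab:
import numpy as np

# Standard deviation of a sum of 100 uniform[0,1) variables ~ sqrt(100/12).
print(np.std(np.random.rand(100, 5000).sum(axis=0)))   # ~2.89
print(np.sqrt(100/12))                                  # ~2.89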
We can consider that this still holds for m=2*10^12-1, which is approximately m=2^41, m*e=2^-12. In that case, the global error follows a quasi-normal distribution, and the standard deviation of the global error is sigma=2^-52*sqrt(2^41/12), or approximately sigma=10^-10.
See http://en.wikipedia.org/wiki/Normal_distribution to compute P(abs(error)>k*sigma)
In 68% of cases (1 sigma), you'll have 10 digits of precision or more.
erfc(10/sqrt(2)) gives you the probability of having less than 9 digits of precision, about 1 case out of 6*10^22, so I'll let you compute the probability of having only 4 digits of precision (you can't evaluate it in double precision, it underflows)!!!
My experimental standard deviations were a bit smaller than the theoretical ones (4e-16, 9e-16 and 2e-15 for 20, 100 and 500 doubles, respectively), but this must be due to the biased distribution of my input errors, i/3137 with i=1..10000...
That's a good reminder that the result will be dominated by the distribution of errors in your inputs, which might exceed e if they themselves result from floating point operations like M_PI*num/den.
Also, as Eric said, using only * is quite an ideal case; things might degenerate more quickly if you mix in +.
Last note: we could try to craft a list of inputs that reaches the maximum error bound by setting all the elements to (1+e), which will be rounded to 1.0, giving the maximum theoretical error bound with an admittedly quite biased input distribution. Hmm, wrong: since all the multiplications are then exact, we get only (1+e)^n, not (1+e)^(2n-1), so only about half the error...
UPDATE 2: the inverse problem.
Since you want the inverse: what is the length n of the sequence such that I get k digits of precision with a certain level of confidence 10^-c?
I'll answer only for k >= 8, because (m*e) << 1 is required in the above approximations.
Let's take c=7: getting k digits with a confidence of 10^-7 means 5.3*sigma < 10^-k.
sigma = 2*e*sqrt((2*n-1)/12), that is n = 0.5 + 1.5*(sigma/e)^2 with e=2^-53.
Thus n ~ 3*2^105*sigma^2; as sigma^2 < 10^(-2k)/5.3^2, we can write n < 3*2^105*10^(-2k)/5.3^2.
As a numerical application, the probability of having less than k=9 digits is less than 10^-7 for a length n=4.3e12, and around n=4.3e10 for 10 digits.
We would reach n=4 numbers for 15 digits, but here our normal-distribution hypothesis is very rough and does not hold, especially in the distribution tail at 5 sigmas, so use with caution (the Berry-Esseen theorem bounds how far from normal such a distribution is: http://en.wikipedia.org/wiki/Berry-Esseen_theorem).
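Plugging numbers into that last formula, as a small Python sketch under the same assumptions as the derivation above:
def max_length_for_digits(k, n_sigma=5.3):
    # Require n_sigma * sigma < 10^-k with sigma = 2*e*sqrt((2*n-1)/12), e = 2^-53.
    e = 2.0 ** -53
    sigma_max = 10.0 ** (-k) / n_sigma
    return 0.5 + 1.5 * (sigma_max / e) ** 2

print(max_length_for_digits(9))    # ~4.3e12
print(max_length_for_digits(10))   # ~4.3e10
print(max_length_for_digits(15))   # ~4.3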
The relative error in M operations as described is at most (1+2^-53)^M - 1, assuming all input, intermediate, and final values do not underflow or overflow.
Consider converting a real number a0 to double precision. The result is some number a0•(1+e), where -2^-53 ≤ e ≤ 2^-53 (because conversion to double precision should always produce the closest representable value, and the quantum for double precision values is 2^-53 of the highest bit, and the closest value is always within half a quantum). For further analysis, we will consider the worst case value of e, 2^-53.
When we multiply one (previously converted) value by another, the mathematically exact result is a0•(1+e) • a1•(1+e). The result of the calculation has another rounding error, so the calculated result is a0•(1+e) • a1•(1+e) • (1+e) = a0 • a1 • (1+e)^3. Obviously, this is a relative error of (1+e)^3. We can see the error accumulates simply as (1+e)^M for these operations: each operation multiplies all previous error terms by 1+e.
Given N inputs, there will be N conversions and N-1 multiplications, so the worst error will be (1+e)^(2N-1).
Equality for this error is achieved only for N≤1. Otherwise, the error must be less than this bound.
Note that an error bound this simple is possible only in a simple problem, such as this one with homogeneous operations. In typical floating-point arithmetic, with a mixture of addition, subtraction, multiplication, and other operations, computing a bound so simply is generally not possible.
For N=10^12 (M=2•10^12-1), the above bound is less than 2.000222062•10^12 units of 2^-53, and is less than .0002220693. So the calculated result is good to something under four decimal digits. (Remember, though, you need to avoid overflow and underflow.)
(Note on the strictness of the above calculation: I used Maple to calculate 1000 terms of the binomial (1+2^-53)^(2•10^12-1) exactly (having removed the initial 1 term) and to add a value that is provably larger than the sum of all remaining terms. Then I had Maple evaluate that exact result to 1000 decimal digits, and it was less than the bound I report above.)
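The quoted bound is easy to cross-check with arbitrary-precision arithmetic; a minimal sketch in Python, assuming mpmath is available:
from mpmath import mp, mpf

mp.dps = 30
M = 2 * 10**12 - 1
bound = (1 + mpf(2)**-53) ** M - 1
print(bound)   # about 2.220692e-4, i.e. under the .0002220693 quoted above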
64-bit floating point numbers, assuming the IEEE 754 standard, have 52+1 bits of mantissa.
That means relative precision is between 1.0000...0 and 1.0000...1, where the number of binary digits after the decimal point is 52. (You can think of the 1.000...0 as what is stored in binary in the mantissa AKA significand).
The error is 1/2 to the power of 52 divided by 2 (half the resolution). Note I choose the relative precision as close to 1.0 as possible, because it is the worst case (otherwise between 1.111..11 and 1.111..01, it is more precise).
In decimal, the worst case relative precision of a double is 1.11E-16.
If you multiply N doubles with this precision, the new relative precision (assuming no additional error due to intermediate rounding) is:
1 - (1 - 1.11E-16)^N
So if you multiply pi (or any double) 10^12 times, the upper bound on the error is:
1.1102e-004
That means you can have confidence in about 4-5 digits.
You can ignore intermediate rounding error if your CPU has support for extended precision floating point numbers for intermediate results.
If no extended precision FPU (floating point unit) is used, rounding in intermediate steps introduces additional error (of the same size as that due to multiplication). That means that a strict lower bound is calculated as:
1 -
((1 - 1.11E-16) * (1 - 1.11E-16) * (1 - 1.11E-16)
* (1 - 1.11E-16) * (1 - 1.11E-16) % for multiplication, then rounding
... (another N-4 lines here) ...
* (1 - 1.11E-16) * (1 - 1.11E-16))
= 1-(1-1.11E-16)^(N*2-1)
If N is too large, this expression takes too long to evaluate term by term. For example, with N = 10^4, the possible error (with intermediate rounding) is 2.2204e-012, which is double that without intermediate rounding, 1-(1 - 1.11E-16)^N = 1.1102e-012.
Approximately, we can say that intermediate rounding doubles the error.
If you multiplied pi 10^12 times and there was no extended precision FPU in use (this might be because you write intermediate steps to memory, and perhaps do something else, before continuing; just make sure the compiler hasn't reordered your instructions so that there is no FPU result accumulation), then a strict upper bound on your relative error is:
2.22e-004
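For what it's worth, the two bounds above are easy to reproduce with a couple of lines of Python (a sketch, using log1p/expm1 to avoid loss of significance):
import math

half_ulp = 2.0 ** -53     # worst-case relative error per rounding, ~1.11e-16
N = 10**12

no_intermediate = -math.expm1(N * math.log1p(-half_ulp))             # 1-(1-e)^N
with_intermediate = -math.expm1((2*N - 1) * math.log1p(-half_ulp))   # 1-(1-e)^(2N-1)

print(no_intermediate)     # ~1.11e-04
print(with_intermediate)   # ~2.22e-04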
Note that confidence in a certain number of decimal places doesn't mean the answer will be correct to exactly that many decimal places.
For example, if the answer is:
1.999999999999, and the error is 1E-5, the actual answer could be 2.000001234.
In this case, even the first decimal digit was wrong. But that really depends on how lucky you are (whether the answer falls on a boundary such as this).
This solution assumes that the doubles (including the answer) are all normalized. For denormalized results, obviously, the number of binary digits by which the result is denormalized will reduce the accuracy by that many digits.
MATLAB does not seem to satisfy the matrix arithmetic identity for the inverse of a product, that is:
(ABC)^-1 = C^-1 * B^-1 * A^-1
in MATLAB,
if inv(A*B*C) == inv(C)*inv(B)*inv(A)
disp('satisfied')
end
The equality does not hold. When I switched to format long, I realized that the results differ in some of the digits, but the equality still fails even when I use format rat.
Why is that so?
Very likely a floating point error. Note that the format function affects only how numbers display, not how MATLAB computes or saves them. So setting it to rat won't help the inaccuracy.
I haven't tested it, but you may try the Fractions Toolbox for exact rational number arithmetic, which should give equality for the above.
Consider this (MATLAB R2011a):
a = 1e10;
>> b = inv(a)*inv(a)
b =
1.0000e-020
>> c = inv(a*a)
c =
1.0000e-020
>> b==c
ans =
0
>> format hex
>> b
b =
3bc79ca10c924224
>> c
c =
3bc79ca10c924223
When MATLAB calculates the intermediate quantities inv(a), or a*a (whether a is a scalar or a matrix), it by default stores them as the closest double precision floating point number - which is not exact. So when these slightly inaccurate intermediate results are used in subsequent calculations, there will be round off error.
Instead of comparing floating point numbers for direct equality, such as inv(A*B*C) == inv(C)*inv(B)*inv(A), it's often better to compare the absolute difference to a threshold, such as abs(inv(A*B*C) - inv(C)*inv(B)*inv(A)) < thresh. Here thresh can be an arbitrary small number, or some expression involving eps, which gives you the smallest difference between two numbers at the precision at which you're working.
The format command only controls the display of results at the command line, not the way in which results are internally stored. In particular, format rat does not make MATLAB do calculations symbolically. For this, you might take a look at the Symbolic Math Toolbox. format hex is often even more useful than format long for diagnosing floating point precision issues such as the one you've come across.
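The same tolerance-based comparison applies in other environments as well; for example, a rough NumPy sketch of the idea:
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.random((3, 3)) for _ in range(3))

lhs = np.linalg.inv(A @ B @ C)
rhs = np.linalg.inv(C) @ np.linalg.inv(B) @ np.linalg.inv(A)

print(np.array_equal(lhs, rhs))   # typically False: bitwise equality fails
print(np.allclose(lhs, rhs))      # True: equal up to rounding error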