MATLAB: computations involving large numbers - matlab

How can one perform computations in MATLAB that involve large numbers. As a simple example, an arbitrary precision calculator would show that ((1/120)^132)*(370!)/(260!) is approximately 1.56, but MATLAB is not able to perform such a computation (power(120,-132)*factorial(370)/factorial(260) = NaN).
I have also tried the following, which does not work:
syms a b c d;
a=120; b=-132; c=370; d=260;
f=sym('power(a,b)*gamma(c+1)/gamma(d+1)')
double(f); % produces error that instructs use of `vpa`
vpa(f) % produces (gamma(c + 1.0)*power(a, b))/gamma(d + 1.0)

If you just want to calculate the factorial of some large numbers, you can use the Java arbitrary precision tools, like so:
result = java.math.BigDecimal(1);
for ix = 1:300
result = result.multiply(java.math.BigDecimal(ix));
end
disp(result)
306057512216440636035370461297268629388588804173576999416776741259476533176716867465515291422477573349939147888701726368864263907759003154226842927906974559841225476930271954604008012215776252176854255965356903506788725264321896264299365204576448830388909753943489625436053225980776521270822437639449120128678675368305712293681943649956460498166450227716500185176546469340112226034729724066333258583506870150169794168850353752137554910289126407157154830282284937952636580145235233156936482233436799254594095276820608062232812387383880817049600000000000000000000000000000000000000000000000000000000000000000000000000
The value result in this case is a java object. You can see the available methods here: http://docs.oracle.com/javase/6/docs/api/java/math/BigDecimal.html
I'm still not sure that I would trust this method for (1e6)! though. You'll have to experiment and see.

Depending on what you're trying to do, then you may be able to evaluate the expression you're interested in in log-space:
log_factorial = sum(log(1:300));

You can use Stirling's approximation to approximate large factorials and simplify your expression before computing it numerically.

This will work:
vpa('120^-132*370!/260!')
and the result is
1.5625098001612564605522837520443

Related

Matrix Multiplication Simplification not giving same results in code

I want to evaluate the expression xTAx -2yTAx+yTAy in Matlab.
If my calculations are correct, this should be equivalent to solving(x-y)TA(x-y)
My main problem right now is that I am not getting the same results from the above two expressions. I'm wondering if it's a precision error or a conceptual error.
I tried a very simple example with some random values to check my math was right. The mini example I used is as follows
A = [1,2,0;2,1,2;0,2,1];
x = [1;2;3];
y = [4;4;4];
(transpose(x)*A*x)-(2*transpose(y)*A*x)+(transpose(y)*A*(y))
transpose(x-y)*A*(x-y)
Both give 46.
I then tried a more realistic example of values for what I'm doing and it failed. For example
A = [2.66666666666667,-0.333333333333333,0,-0.333333333333333,-0.333333333333333,0,0,0,0;
-0.333333333333333,2.66666666666667,-0.333333333333333,-0.333333333333333,-0.333333333333333,-0.333333333333333,0,0,0;
0,-0.333333333333333,2.66666666666667,0,-0.333333333333333,-0.333333333333333,0,0,0;
-0.333333333333333,-0.333333333333333,0,2.66666666666667,-0.333333333333333,0,-0.333333333333333,-0.333333333333333,0;
-0.333333333333333,-0.333333333333333,-0.333333333333333,-0.333333333333333,2.66666666666667,-0.333333333333333,-0.333333333333333,-0.333333333333333 -0.333333333333333;
0,-0.333333333333333,-0.333333333333333,0,-0.333333333333333,2.66666666666667,0,-0.333333333333333,-0.333333333333333;
0,0,0,-0.333333333333333,-0.333333333333333,0,2.66666666666667,-0.333333333333333,0;
0,0,0,-0.333333333333333,-0.333333333333333,-0.333333333333333,-0.333333333333333,2.66666666666667,-0.333333333333333;
0,0,0,0,-0.333333333333333,-0.333333333333333,0,-0.333333333333333,2.66666666666667];
x =[1.21585420370805;
1.00388159497757e-16;
-0.405284734569351;
1.36809776609658e-16;
-1.04796659533634e-17;
-7.52459042423650e-17;
-0.607927101854027;
-8.49163704356314e-17;
0.303963550927013];
v =[0.0319067068305797,0.00786616506360615,0.0709811622828447,0.0719615328671117;
1.26150800194897e-17,5.77584497721720e-18,7.89740111567879e-18,7.14396333930938e-18;
-0.158358815125228,-0.876275686098803,0.0539216716399594,0.0450616819309899;
7.90937837037453e-18,3.24196519177793e-18,3.99402664932776e-18,4.17486202509670e-18;
5.35533279761622e-18,-8.91723780863019e-19,-1.56128480212603e-18,1.84423677629470e-19;
-2.18478057810330e-17,-6.63779320738873e-18,-3.21099714760257e-18,-3.93612240449303e-18;
-0.0213963226092681,-0.0168256143048771,-0.0175695110350900,-0.0128155908603601;
-4.06029341772399e-18,-5.65705978843172e-18,-1.80182480882056e-18,-1.59281757789645e-18;
0.221482525259710,-0.0576644539359728,0.0163934384910353,0.0197432976432437];
u = [1.37058079022968;
1.79302486419321;
69.4330887239959;
-52.3949662214410];
y = v*u;
Gives 0 for the first expression and 7.1387e-28 for the second. Is this a precision error thing? Which version is better/ more accurate to use and why? Thank you!
I have not checked your simplification, but floating-point math in general is inexact. Even doing the same operations in a different order can give you different results. With the deviations you are seeing I would believe they could reasonably come from inexactness in floating-point math and it is up to you to decide whether the deviations are acceptable for your purpose.
To determine which version is more accurate you would have to compute reference values to compare to, which might prove difficult.

Matlab symbolic

I am trying to compare two simple expressions using Matlab symbolic toolbox. For some reason, the code returns 0. Any idea ?
syms a b c
A = (a/b)^c
B = a^c/b^c
isequal(A,B)
It seems like MATLAB has a hard time telling that two expressions are the same when (potentially) fractional exponents are involved.
So, one solution, as suggested by Mikhail is to restrict the values of c to be only integers although, as discussed in the Math.SE question jodag posted, there is nothing wrong with fractional exponents in this case.
Hence, since this restriction to integers is not necessary for the statement to be true, another solution is to use simplify function on the expression for B but allowing it to run more simplification steps in order to get the most simplified expression.
syms a b c
A = (a/b)^c
B = a^c/b^c
isequal(A,simplify(B,'step',4))
Four steps is actually the smallest number that worked for me, but that could vary across versions of MATLAB I'm assuming. To be sure, I would include more, but for really large expressions, this could become computationally intensive, so some judgment is necessary. Note that, you could also use the 'Seconds' option to limit the amount of time allowed for simplification.
In general what you wrote isn't true, under the right "assumptions" it becomes true: for example, assuming c is an integer you can trick MATLAB into expanding A
clc; clear all;
syms a
syms b
syms c integer
A = (a/b)^c;
B = simplify((a^c)/(b^c));
disp(isequal(A,B));
disp(A);
disp(B);
1
a^c/b^c
a^c/b^c

How to force MATLAB to not compute as Inf

I am trying to work through some code that I wrote and one specific line is giving me problems in MATLAB:
Ts = (1+(DatesMod.*Bs)./VolMod).^(VolMod);
VolMod is an array with values on the order of 10^8, DatesMod has a range of values between 700,000 and 740,000, and Bs has a range of values between 0 and 100. Note that this function is mathematically similar to doing lim(n->Inf) (1+B*Dates/n)^n. I understand that this primarily has to do with the methods of allocating numbers on the computer. Is there a clever way I can force it to compute the actual value instead of returning Inf for every value?
Thanks in advance.
Note that the limit
lim(n->Inf) (1+B*Dates/n)^n = exp(B*Dates)
and that exp will overflow to Inf once the argument is greater than 709.9, so there is no real way to compute Ts exactly without arbitrary precision arithmetic.
The best option is probably work in log-precision, e.g. instead of Ts you work with logTs
logTs = VolMod .* log1p((DatesMod.*Bs)./VolMod)
You would then need to rewrite any subsequent expressions to use this without explicitly computing exp(logTs) (as that will overflow).

Create vector-valued function with arbitrary components

Good evening everyone,
I want to create a function
f(x) = [f1(x), f2(x), ... , fn(x)]
in MatLab, with an arbitrary form and number for the fi. In my current case they are meant to be basis elements for a finite-dimensional function space, so for example a number of multi variable polynomials. I want to able to be able to set form (e.g. hermite/lagrange polynomials, ...) and number via arguments in some sort of "function creating" function, so I would like to solve this for arbitrary functions fi.
Assume for now that the fi are fi:R^d -> R, so vector input to scalar output. This means the result from f should be a n-dim vector containing the output of all n functions. The number of functions n could be fairly large, as there is permutation involved. I also need to evaluate the resulting function very often, so I hope to do it as efficiently as possible.
Currently I see two ways to do this:
Create a cell with each fi using a loop, using something like
funcell{i}=matlabFunction(createpoly(degree, x),'vars',{x})
and one of the functions from the symbolic toolbox and a symbolic x (vector). It is then possible to create the desired function with cellfun, e.g.
f=#(x) cellfun(#(v) v(x), funcell)
This is relatively short, easy and what can be found when doing searches. It even allows extension to vector output using 'UniformOutput',false and cell2mat. On the downside it is very inefficient, first during creation because of matlabFunction and then during evaluation because of cellfun.
The other idea I had is to create a string and use eval. One way to do this would be
stringcell{i}=[char(createpoly(degree, x)),';']
and then use strjoin. In theory this should yield an efficient function. There are two problems however. The first is the use of eval (mostly on principle), the second is inserting the correct arguments. The symbolic toolbox does not allow symbols of the form x(i), so the resulting string will not contain them either. The only remedy I have so far is some sort of string replacement on the xi that are allowed, but this is also far from elegant.
So I do have ways to do what I need right now, but I would appreciate any ideas for a better solution.
From my understanding of the problem, you could do the straightforward:
Initialization step:
my_fns = cell(n, 1); %where n is number of functions
my_fns{1} = #f1; % Assuming f1 is defined in f1.m etc...
my_fns{2} = #f2;
Evaluation at x:
z = zeros(n, 1);
for i=1:n,
z(i) = my_fns{i}(x)
end
For example if you put it in my_evaluate.m:
function z = my_evaluate(my_fns, x)
z = zeros(n, 1);
for i=1:n,
z(i) = my_fns{i}(x)
end
How might this possibly be sped up?
Depends on if you have special structure than can be exploited.
Are there calculations common to some subset of f1 through fn that need not be repeated with each function call? Eg. if the common calculation step is costly, you could do y = f_helper(x) and z(i) = fi(x, y).
Can the functions f1...fn be vector / matrix friendly, allowing evaluation of multiple points with each function call?
The big issue is how fast your function calls f1 through fn are, not how you collect the results from those calls in a vector.

Replacement for repmat in MATLAB

I have a function which does the following loop many, many times:
for cluster=1:max(bins), % bins is a list in the same format as kmeans() IDX output
select=bins==cluster; % find group of values
means(select,:)=repmat_fast_spec(meanOneIn(x(select,:)),sum(select),1);
% (*, above) for each point, write the mean of all points in x that
% share its label in bins to the equivalent row of means
delta_x(select,:)=x(select,:)-(means(select,:));
%subtract out the mean from each point
end
Noting that repmat_fast_spec and meanOneIn are stripped-down versions of repmat() and mean(), respectively, I'm wondering if there's a way to do the assignment in the line labeled (*) that avoids repmat entirely.
Any other thoughts on how to squeeze performance out of this thing would also be welcome.
Here is a possible improvement to avoid REPMAT:
x = rand(20,4);
bins = randi(3,[20 1]);
d = zeros(size(x));
for i=1:max(bins)
idx = (bins==i);
d(idx,:) = bsxfun(#minus, x(idx,:), mean(x(idx,:)));
end
Another possibility:
x = rand(20,4);
bins = randi(3,[20 1]);
m = zeros(max(bins),size(x,2));
for i=1:max(bins)
m(i,:) = mean( x(bins==i,:) );
end
dd = x - m(bins,:);
One obvious way to speed up calculation in MATLAB is to make a MEX file. You can compile C code and perform any operations you want. If you're searching for the fastest-possible performance, turning the operation into a custom MEX file would likely be the way to go.
You may be able to get some improvement by using ACCUMARRAY.
%# gather array sizes
[nPts,nDims] = size(x);
nBins = max(bins);
%# calculate means. Not sure whether it might be faster to loop over nDims
meansCell = accumarray(bins,1:nPts,[nBins,1],#(idx){mean(x(idx,:),1)},{NaN(1,nDims)});
means = cell2mat(meansCell);
%# subtract cluster means from x - this is how you can avoid repmat in your code, btw.
%# all you need is the array with cluster means.
delta_x = x - means(bins,:);
First of all: format your code properly, surround any operator or assignment by whitespace. I find your code very hard to comprehend as it looks like a big blob of characters.
Next of all, you could follow the other responses and convert the code to C (mex) or Java, automatically or manually, but in my humble opinion this is a last resort. You should only do such things when your performance is not there yet by a small margin. On the other hand, your algorithm doesn't show obvious flaws.
But the first thing you should do when trying to improve performance: profile. Use the MATLAB profiler to determine which part of your code is causing your problems. How much would you need to improve this to meet your expectations? If you don't know: first determine this boundary, otherwise you will be looking for a needle in a hay stack which might not even be in there in the first place. MATLAB will never be the fastest kid on the block with respect to runtime, but it might be the fastest with respect to development time for certain kinds of operations. In that respect, it might prove useful to sacrifice the clarity of MATLAB over the execution speed of other languages (C or even Java). But in the same respect, you might as well code everything in assembler to squeeze all of the performance out of the code.
Another obvious way to speed up calculation in MATLAB is to make a Java library (similar to #aardvarkk's answer) since MATLAB is built on Java and has very good integration with user Java libraries.
Java's easier to interface and compile than C. It might be slower than C in some cases, but the just-in-time (JIT) compiler in the Java virtual machine generally speeds things up very well.