I am trying to figure out how to compute large powers of huge numbers in matlab to do
RSA encryption.
For example: A 50+ digit integer raised to the power of 999999.
You can use exponentiation by squaring:
https://en.wikipedia.org/wiki/Exponentiation_by_squaring
So the end result will be around (1e49)^(1e6) = 1e49000000. This is far too large for any built-in Matlab data type to hold. One solution is to use the vpi toolbox from the File Exchange; it can handle large numbers, at the cost of speed.
A better solution would be to reach your end objective in a different way, i.e. reformulate the calculation so that the full result is never needed.
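For RSA in particular, the quantity needed is the power modulo n, not the full power, so exponentiation by squaring can reduce modulo n after every multiplication and the intermediate values never grow. Below is a minimal sketch of that idea in plain Matlab doubles. The values of b, e and n are hypothetical toy values, and the sketch assumes n < 2^26 so that every product stays below 2^53 and is exact in double precision; real RSA sizes would still need a big-integer type such as vpi.

```matlab
% Square-and-multiply modular exponentiation (toy sizes only).
% Assumption: n < 2^26, so every product is below 2^53 and exact in double.
b = 4;      % base (hypothetical toy value)
e = 13;     % exponent (hypothetical toy value)
n = 497;    % modulus (hypothetical toy value)

result = 1;
b = mod(b, n);
while e > 0
    if mod(e, 2) == 1               % current lowest bit of the exponent is set
        result = mod(result * b, n);
    end
    b = mod(b * b, n);              % square the base for the next bit
    e = floor(e / 2);               % shift the exponent right by one bit
end
disp(result)                        % 445, i.e. 4^13 mod 497
```

The loop runs once per bit of the exponent, so an exponent like 999999 needs only about 20 iterations.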
We need to implement some form of large-number data type.
In C this is done using the GMP multiprecision library or the LibTomMath library.
There are many others as well.
For Matlab, maybe this will be helpful:
>>> LInK <<<
I am trying to turn off denormal number support in matlab, so that basically any two computations that would result in a denormal number would instead just result in zero (DAZ, FTZ)
I've researched several sites, including the one below, but I haven't found anything about doing this.
http://blogs.mathworks.com/cleve/2014/07/21/floating-point-denormals-insignificant-but-controversial-2/
I've never heard of such an option in Matlab. It would likely require deep manipulation of a lot of the floating-point math, effectively requiring a new datatype to be supported if this were to be an easily toggle-able option in Matlab. You could write your own mex C code to do this (more here and here) for an individual function.
And of course you can get something like this with one line of Matlab – here's an example:
a = [1e-300 1e-310 1e-310];
b = [1e-301 1e-311 1e-310];
x = a-b;
x(abs(x(:)) < realmin(class(x))) = 0;
where realmin is the smallest normalized floating-point number. However, the floating point math is still performed using the extended denormal/subnormal values in a. It's just the output that's clipped to zero.
Unless you're doing this for fun and experimentation, or possibly running code on an embedded platform, I'd really recommend against disabling denormals as a form of optimization. Instead, focus on why your values are so small and how you might rescale your problem to avoid the issue entirely.
I encountered a problem while using Matlab. I'm doing some computations concerning OTC instruments (pricing, constructing a discount curve, etc.), first in Excel and then in Matlab (for comparison). While I'm 100% sure that the computations in Excel are correct (compared with market data), Matlab seems to produce some differences (e.g. -4.18E-05). The Matlab algorithm looks fine. I was wondering whether this is because Matlab is rounding some computations; I have heard a little about that. I'm trying to convert double numbers to float using the function vpa(), but it does not seem to work with double numbers. Any other ideas?
Excel uses 64 bit double precision floating point numbers compliant with IEEE 754 floating point specification.
The way that Excel treats results like =1/5 and appears to compute them exactly (despite this example not being a dyadic rational) is purely down to formatting. It handles =1/3 + 1/3 + 1/3 similarly. It's quite smart really if you think about it: the implementers of Excel had no real choice given that the average Excel user is not au fait with the finer points of floating point arithmetic and would simply scorn a spreadsheet package that "couldn't even get 1/5 correct".
That all said, you're very unlucky if you get a difference of -4.18E-05 between the two systems. That's because double precision floating point is accurate to around 15 significant figures. Your algorithms would have to be implemented very poorly indeed for the error terms to bubble up to that magnitude if you're consistently using double precision floating point types.
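To give a sense of scale, here is a small Matlab sketch of what genuine double precision round-off looks like. The numbers 0.1, 0.2 and 0.3 are just illustrative values, not taken from the question.

```matlab
% Round-off in double precision is on the order of eps (about 2.2e-16),
% many orders of magnitude smaller than a discrepancy of 4e-05.
s = 0.1 + 0.2;                  % not exactly 0.3 in binary floating point
absErr = abs(s - 0.3);          % tiny absolute difference
relErr = absErr / 0.3;          % relative difference, comparable to eps
fprintf('absolute error: %.3g, relative error: %.3g, eps: %.3g\n', ...
        absErr, relErr, eps);
```

A difference near 1e-05 is therefore unlikely to come from round-off in a single operation; it points to a difference in the method or in how the result is displayed.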
Most likely (and I too work in finance), the difference will be in the way you're interpolating your discount curve. That's where I would look first if I were you.
Given the value of the error compared to the default format settings, this is almost certainly because of using the default format short and comparing the output on the command line to the real value.
x = 5.4444418
Output:
x =
5.4444
Then:
x-5.4444
Output:
ans =
4.1800e-05
The value stored in x remains 5.4444418; only the value displayed at the command line changes.
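One way to check this without relying on the display format is to print the value with an explicit number of digits. This is a small sketch using sprintf; calling format long before displaying x would serve the same purpose.

```matlab
x = 5.4444418;
shortText = sprintf('%.4f', x);   % four decimals, like the default short display
fullText  = sprintf('%.7f', x);   % enough decimals to see the stored value

% Comparing against the truncated display reproduces the apparent "error".
diffValue = x - str2double(shortText);
fprintf('short: %s, full: %s, difference: %.3e\n', shortText, fullText, diffValue);
```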
I know matlab has a built in pdist function that will calculate pairwise distances. However, my matrix is so large (60000 by 300) that matlab runs out of memory.
This question is a follow up on Matlab euclidean pairwise square distance function.
Is there any workaround for this computational inefficiency? I tried manually coding the pairwise distance calculations and it usually takes a full day to run (sometimes 6 to 7 hours).
Any help is greatly appreciated!
Well, I couldn't resist playing around. I created a Matlab mex C file called pdistc that implements pairwise Euclidean distance for single and double precision. On my machine using Matlab R2012b and R2015a it's 20–25% faster than pdist (and the underlying pdistmex helper function) for large inputs (e.g., 60,000-by-300).
As has been pointed out, this problem is fundamentally bounded by memory and you're asking for a lot of it. My mex C code uses minimal memory beyond that needed for the output. In comparing its memory usage to that of pdist, it looks like the two are virtually the same. In other words, pdist is not using lots of extra memory. Your memory problem is likely in the memory used up before calling pdist (can you use clear to remove any large arrays?) or simply because you're trying to solve a big computational problem on tiny hardware.
So, my pdistc function likely won't be able to save you memory overall, but you may be able to use another feature I built in. You can calculate chunks of your overall pairwise distance vector. Something like this:
m = 6e3;
n = 3e2;
X = rand(m,n);
sz = m*(m-1)/2;
for i = 1:m:sz-m
D = pdistc(X', i, i+m); % mex C function, X is transposed relative to pdist
... % Process chunk of pairwise distances
end
This is considerably slower (10 times or so) and this part of my C code is not optimized well, but it will allow much less memory use – assuming that you don't need the entire array at one time. Note that you could do the same thing much more efficiently with pdist (or pdistc) by creating a loop where you passed in subsets of X directly, rather than all of it.
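As a rough illustration of passing in subsets of X directly, here is a sketch that uses only base Matlab instead of pdist or pdistc. It computes the distances from one block of rows to all rows at a time and reduces each block immediately, so the full distance matrix never exists in memory. The block size, the small random stand-in for X, and the "keep the largest distance" reduction are all assumptions for illustration, and the implicit expansion requires R2016b or later.

```matlab
% Block-wise pairwise Euclidean distances without storing the full matrix.
rng(0);
X = rand(200, 5);            % small stand-in; rows are observations, as for pdist
blockSize = 64;              % hypothetical chunk size; tune to available memory
m = size(X, 1);
sqNorm = sum(X.^2, 2);       % squared norm of every row (m-by-1)

maxDist = 0;                 % example reduction: keep only the largest distance
for r0 = 1:blockSize:m
    rows = r0:min(r0 + blockSize - 1, m);
    % Squared distances from this block of rows to all rows (k-by-m).
    D2 = sqNorm(rows) + sqNorm.' - 2 * (X(rows, :) * X.');
    D = sqrt(max(D2, 0));    % clamp tiny negatives caused by round-off
    maxDist = max(maxDist, max(D(:)));
end
```

Only one k-by-m block is alive at a time, so peak memory scales with blockSize rather than with m^2.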
If you have a 64-bit Intel Mac, you won't need to compile as I've included the .mexmaci64 binary, but otherwise you'll need to figure out how to compile the code for your machine. I can't help you with that. It's possible that you may not be able to get it to compile or that there will be compatibility issues that you'll need to solve by editing the code yourself. It's also possible that there are bugs and the code will crash Matlab. Also, note that you may get slightly different outputs relative to pdist with differences between the two in the range of machine epsilon (eps). pdist may or may not do fancy things to avoid overflows for large inputs and other numeric issues, but be aware that my code does not.
Additionally, I created a simple pure Matlab implementation. It is massively slower than the mex code, but still faster than a naïve implementation or the code found in pdist.
All of the files can be found here. The ZIP archive includes all of the files. It's BSD licensed. Feel free to optimize (I tried BLAS calls and OpenMP in the C code to no avail – maybe some pointer magic or GPU/OpenCL could further speed it up). I hope that it can be helpful to you or someone else.
On my system the following is the fastest (Even faster than the C code pdistc by @horchler):
function [ mD ] = CalcDistMtx ( mX )
vSsqX = sum(mX .^ 2);
mD = sqrt(bsxfun(@plus, vSsqX.', vSsqX) - (2 * (mX.' * mX)));
end
You'll need a very well tuned C code to beat this, I think.
Update
Since R2016b, MATLAB supports implicit expansion (broadcasting) without the use of bsxfun().
Hence the code can be written:
function [ mD ] = CalcDistMtx ( mX )
vSsqX = sum(mX .^ 2, 1);
mD = sqrt(vSsqX.'+ vSsqX - (2 * (mX.' * mX)));
end
A generalization is given in my Calculate Distance Matrix project.
P. S.
Using MATLAB's pdist for comparison: squareform(pdist(mX.')) is equivalent to CalcDistMtx(mX).
Namely the input should be transposed.
Computers are not infinitely large, or infinitely fast. People think that they have a lot of memory, a fast CPU, so they just create larger and larger problems, and then eventually wonder why their problem runs slowly. The fact is, this is NOT computational inefficiency. It is JUST an overloaded CPU.
As Oli points out in a comment, there are something like 2e9 values to compute, even assuming you only compute the upper or lower half of the distance matrix. (6e4^2/2 is approximately 2e9.) This will require roughly 16 gigabytes of RAM to store, assuming that only ONE copy of the array is created in memory. If your code is sloppy, you might easily double or triple that. As soon as you go into virtual memory, things get much slower.
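The arithmetic behind that estimate can be checked directly. This is a small sketch; it counts only the output of pdist for m = 60000 rows, not the input matrix or any temporaries.

```matlab
m = 60000;                         % number of observations from the question
nPairs = m * (m - 1) / 2;          % entries in the condensed distance vector
bytesDouble = nPairs * 8;          % 8 bytes per double
bytesSingle = nPairs * 4;          % 4 bytes per single
fprintf('%d pairs: %.1f GB as double, %.1f GB as single\n', ...
        nPairs, bytesDouble / 1e9, bytesSingle / 1e9);
```

The exact count is slightly under the rounded 2e9 used above, which gives about 14.4 GB in double precision; storing the result as single halves that.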
Wanting a big problem to run fast is not enough. To really help you, we need to know how much RAM is available. Is this a virtual memory issue? Are you using 64 bit MATLAB, on a CPU that can handle all the needed RAM?
I have the following integral that I need to compute numerically:
Int[Exp(0.5*(aCosx + bSinx + cCos2x + dSin2x))] x=0..2Pi
The problem is that the output at any given value of x can be extremely large, e^2000, so larger than I can deal with in double precision.
I haven't had much luck googling for the following: how do you deal with large numbers in Fortran? I don't need high precision; I don't care about knowing the result beyond double precision, and at the end I'll just be taking the log, but I need to be able to handle the large numbers until I can take the log.
Are there integration packages that have the ability to handle arbitrarily large numbers? Mathematica clearly can, so there must be something like this out there.
Cheers
This is probably an extended comment rather than an answer but here goes anyway ...
As you've already observed Fortran isn't equipped, out of the box, with the facility for handling such large numbers as e^2000. I think you have 3 options.
1. Use mathematics to reduce your problem to one which does (or a number of related ones which do) fall within the numerical range that your Fortran compiler can compute.
2. Use Mathematica or one of the other computer algebra systems (e.g. Maple, SAGE, Maxima). All (I think) of these can be integrated into a Fortran program (with varying degrees of difficulty and integration).
3. Use a library for high-precision (often called either arbitrary-precision or multiple-precision too) arithmetic. Your favourite search engine will turn up a number of these for you, some written in Fortran (and therefore easy to integrate), some written in C/C++ or other languages (and therefore slightly harder to integrate). You might start your search at Lawrence Berkeley or the GNU bignum library.
4. (Yes I know that I wrote that you have 3 options, but your question suggests that you aren't ready to consider this yet) You could write your own high-/arbitrary-/multiple-precision functions. Fortran provides everything you need to construct such a library, there is a lot of work already done in the field to learn from, and it might be something of interest to you.
In practice it generally makes sense to apply as much mathematics as possible to a problem before resorting to a computer; that process can not only assist in solving the problem but also guide your selection or construction of a program to solve what's left of the problem.
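As one concrete instance of applying the mathematics first: since only the log of the integral is needed, the maximum of the exponent can be factored out, using log(Int[exp(g)]) = gmax + log(Int[exp(g - gmax)]). The rescaled integrand is bounded by about 1, so ordinary double precision is enough. The sketch below is written in Matlab rather than Fortran to match the other examples here; the coefficients a, b, c, d are hypothetical values picked so that the unscaled integrand overflows, and the maximum is only located approximately on a grid.

```matlab
% Log of Int[exp(g(x))], x = 0..2*pi, without ever forming exp(g) itself.
a = 3000; b = 1000; c = 500; d = 200;      % hypothetical coefficients
g = @(x) 0.5 * (a*cos(x) + b*sin(x) + c*cos(2*x) + d*sin(2*x));

xs = linspace(0, 2*pi, 100001);
[gmax, iMax] = max(g(xs));                 % approximate maximum of the exponent
xPeak = xs(iMax);                          % location of the (narrow) peak

scaled = @(x) exp(g(x) - gmax);            % at most about 1, so no overflow
% The waypoint makes sure the adaptive rule samples the narrow peak.
logI = gmax + log(integral(scaled, 0, 2*pi, 'Waypoints', xPeak));
fprintf('log of the integral: %.6f (exp(gmax) overflows: %d)\n', ...
        logI, isinf(exp(gmax)));
```

The same rescaling carries over to Fortran unchanged: find the maximum of the exponent, subtract it inside the exponential, integrate, and add it back after taking the log.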
I agree with High Performance Mark that the best option here numerically is to use analytics to scale or simplify the result first.
I will mention that if you do want to brute force it, gfortran (as of 4.6, with the libquadmath library) has support for quadruple precision reals, which you can use by selecting the appropriate kind. As long as your answers (and the intermediate results!) don't get too much bigger than what you're describing, that may work, but it will generally be much slower than double precision.
This requires looking deeper at the problem you are trying to solve and the behavior of the underlying mathematics. To add to the good advice already provided by Mark and Jonathan, consider expanding the exponential and trig functions into Taylor series and truncating to the desired level of precision.
Also, take a step back and ask what you are trying to accomplish by calculating this value. As an example, I recently had to debug why I was getting outlandish results from a property correlation which was calculating vapor pressure of a fluid to see if condensation was occurring. I spent a long time trying to understand what was wrong with the temperature being fed into the correlation until I realized the case causing the error was a simulation of vapor detonation. The problem was not in the numerics but in the logic of checking for condensation during a literal explosion; physically, a condensation check made no sense. The real problem was the code was asking an unnecessary question; it already had the answer.
I highly recommend Forman Acton's Numerical Methods That (Usually) Work and Real Computing Made Real. Both focus on problems like this and suggest techniques to tame ill-mannered computations.
I want to generate a large number of random numbers (uniformly distributed on the interval [0,1]). Currently the generation of these random numbers is causing my program to run quite slowly; however, the program only needs them to be calculated to around 5 decimal places.
I'm not entirely sure of how MATLAB generates random numbers, but if there is a way of only calculating them to 5 decimal places then it will (hopefully greatly) speed up my program.
Is there a way of doing such a thing?
Thanks very much.
To answer your question, yes, you can generate single precision random numbers, like this:
r = rand(..., 'single'); %Reference: http://www.mathworks.com/help/matlab/ref/rand.html
Single precision numbers have 7 (ish) significant figures when printed as decimal.
To echo some comments above, I don't think this will buy you much performance. The first thing to do if rand is really your slow operation is to batch the calls. That is, instead of:
for ix = 1:1000
y = rand(1,1,'single');
end
use:
yVector = rand(1000,1,'single');
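To see the effect of batching on a particular machine, the two variants can be timed side by side. This is only a rough sketch with tic and toc; the count n is an arbitrary choice and the timings will vary between runs and Matlab versions.

```matlab
n = 1e5;                               % arbitrary number of samples for the comparison

% One call per value.
tic;
yLoop = zeros(n, 1, 'single');
for ix = 1:n
    yLoop(ix) = rand(1, 1, 'single');
end
tLoop = toc;

% One batched call for all values.
tic;
yVector = rand(n, 1, 'single');
tBatch = toc;

fprintf('loop: %.4f s, batched: %.4f s\n', tLoop, tBatch);
```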
As already mentioned, you can instruct RAND to generate numbers directly as single precision, and it's definitely best to generate the numbers in a decent sized chunk. If you still need more performance, and you have Parallel Computing Toolbox and a supported NVIDIA GPU, the gpuArray.rand function can be even faster, especially if you select the philox generator like so:
parallel.gpu.RandStream('Philox4x32-10')
Assuming you actually have a proper code layout where you generate a lot of numbers in an array, this can be a solution for low precision. Note that I have not tested it myself, but it is reported to be fast:
R = randi([0 100000],500,300)/100000
This will generate 150000 low-precision random numbers between 0 and 1.