I am writing a program to decrypt RSA-encrypted messages by factorising the modulus (a product of two large primes), and I want to test it on some long primes, so I need to use Variable Precision Arithmetic. But for some reason, even when I use small numbers, my code is significantly slower with VPA, even with a small number of digits. What is going on?
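For reference, a minimal timing sketch of the kind of comparison being described (this assumes MATLAB's Symbolic Math Toolbox, where vpa values are symbolic objects; the values and loop sizes are just placeholders):

```matlab
% Each vpa operation goes through the symbolic engine, so there is a large
% fixed per-operation overhead that is independent of how many digits you request.
digits(10);                       % a small precision setting still pays that overhead
a = 12345; b = 67890;
tic; for k = 1:1e4, c = mod(a*b, 97); end; toc       % built-in double arithmetic
av = vpa(a); bv = vpa(b);
tic; for k = 1:1e4, cv = mod(av*bv, 97); end; toc    % vpa: far slower per operation
```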
I am currently learning how to use SEAL. In the parameters for the BFV scheme there were helper functions for choosing the PolyModulus and CoeffModulus, but nothing similar was provided for choosing the PlainModulus, other than that it should be either a prime or a power of 2. Is there any way to know which value is optimal?
In the given example the PlainModulus was set to parms.PlainModulus = new SmallModulus(256); Is there any special reason for choosing the value 256?
In BFV, the plain_modulus basically determines the size of your data type, just like in normal programming when you use 32-bit or 64-bit integers. When using BatchEncoder the data type applies to each slot in the plaintext vectors.
How you choose plain_modulus matters a lot: the noise budget consumption in multiplications is proportional to log(plain_modulus), so there are good reasons to keep it as small as possible. On the other hand, you'll need to ensure that you don't get into overflow situations during your computations, where your encrypted numbers exceed plain_modulus, unless you specifically only care about correctness of the results modulo plain_modulus.
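To make the overflow point concrete, here is a plain MATLAB sketch (not SEAL code) of the wrap-around you would see in the decrypted result with a hypothetical plain_modulus of 256:

```matlab
t = 256;                 % plain_modulus
a = 20; b = 15;
mod(a * b, t)            % returns 44, not 300: the true product has wrapped around mod t
```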
In almost all real use-cases of BFV you'll want to use BatchEncoder so as not to waste plaintext/ciphertext polynomial space, and batching requires plain_modulus to be a prime congruent to 1 modulo 2*poly_modulus_degree. Therefore, you'll probably want it to be a prime, except in some toy examples.
I have a program which calculates probability values (p-values), but it is entering a very large negative number into the exp function, exp(-626294.830), which evaluates to zero instead of the very small positive number that it should be.
How can I get this to evaluate as a very small floating point number?
I have tried Math::BigFloat, bignum, and bigrat, but all have failed.
Wolfram Alpha says that exp(-626294.830) is 4.08589×10^-271997... zero is a pretty close approximation to that ;-) Although you've edited and removed the context from your question, do you really need to work with such tiny numbers, or perhaps there is some way you could optimize your algorithm or scale your numbers?
Anyway, you are correct that code like Math::BigFloat->new("-626294.830")->bexp seems to take quite some time, even with the support of use Math::BigFloat lib => 'GMP';.
The only alternative I can offer at the moment is Math::Prime::Util::GMP's expreal, although you need to specify a precision to it.
use Math::Prime::Util::GMP qw/expreal/;
use Math::BigFloat;
# expreal($x, $digits) computes e**$x to $digits decimal digits and returns a string;
# 272000 digits are needed here because the result only "begins" around 1e-271997.
my $e = Math::BigFloat->new(expreal(-626294.830,272000));
print $e->bnstr,"\n";
__END__
4.086e-271997
But on my machine, even that still takes ~20s to run, which brings us back to the question of potential optimization in other places.
Floating point numbers do not have infinite precision. Assuming the number is represented as an IEEE 754 double, we have 52 bits for the fraction, 11 bits for the exponent, and one bit for the sign. Due to the way exponents are encoded, the smallest positive normalized number that can be represented is 2^-1022 (denormals extend this to about 2^-1074, at the cost of precision).
If we look at your number e^-626294.830, we can do a change of base and see that it equals 2^(log_2 e · -626294.830) = 2^-903552.445, which is far smaller than even 2^-1074. Approximating your number as zero is therefore correct.
Instead of calculating this value using arbitrary-precision numerics, you are likely better off solving the necessary equations by hand, then coding this in a way that does not require extreme precision. For example, it is unlikely that you need the exact value of e^-626294.830, but perhaps just the magnitude. Then, you can calculate the logarithm instead of using exp().
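To illustrate the log-domain idea with the number from the question: log10(e^-626294.830) = -626294.830 / ln(10) ≈ -271996.39, so the value is roughly 10^0.61 × 10^-271997 ≈ 4.1 × 10^-271997, which matches the arbitrary-precision result while using only ordinary double arithmetic.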
I want to generate a large number of random numbers (uniformly distributed on the interval [0,1]). Currently the generation of these random numbers is causing my program to run quite slowly; however, the program only needs them to be calculated to around 5 decimal places.
I'm not entirely sure of how MATLAB generates random numbers, but if there is a way of only calculating them to 5 decimal places then it will (hopefully greatly) speed up my program.
Is there a way of doing such a thing?
Thanks very much.
To answer your question, yes, you can generate single precision random numbers, like this:
r = rand(..., 'single'); %Reference: http://www.mathworks.com/help/matlab/ref/rand.html
Single precision numbers have 7 (ish) significant figures when printed as decimal.
To echo some comments above, I don't think this will buy you much performance. The first thing to do if rand is really your slow operation is to batch the calls. That is, instead of:
for ix = 1:1000
    y = rand(1,1,'single');
end
use:
yVector = rand(1000,1,'single');
As already mentioned, you can instruct RAND to generate numbers directly as single precision, and it's definitely best to generate the numbers in a decent-sized chunk. If you still need more performance, and you have Parallel Computing Toolbox and a supported NVIDIA GPU, the gpuArray.rand function can be even faster, especially if you select the Philox generator like so:
parallel.gpu.RandStream('Philox4x32-10')
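A sketch of how that might be wired together (assumes Parallel Computing Toolbox and a supported GPU; the sizes are placeholders):

```matlab
s = parallel.gpu.RandStream('Philox4x32-10');
parallel.gpu.RandStream.setGlobalStream(s);
r = rand(1e6, 1, 'single', 'gpuArray');   % generated on the GPU in one chunk
r = gather(r);                            % copy back to host memory if needed
```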
Assuming your code is laid out so that you generate a lot of numbers into an array at once, this can be a solution for low precision. Note that I have not tested it myself, but it is reported to be fast:
R = randi([0 100000],500,300)/100000
This will generate 150000 low-precision random numbers between 0 and 1, quantized to steps of 1e-5.
I compute this simple sum on Matlab:
2*0.04-0.5*0.4^2 = -1.387778780781446e-017
but the result is not zero. What can I do?
Aabaz and Jim Clay have good explanations of what's going on.
It's often the case that, rather than exactly calculating the value of 2*0.04 - 0.5*0.4^2, what you really want is to check whether 2*0.04 and 0.5*0.4^2 differ by an amount that is small enough to be within the relevant numerical precision. If that's the case, then rather than checking whether 2*0.04 - 0.5*0.4^2 == 0, you can check whether abs(2*0.04 - 0.5*0.4^2) < thresh. Here thresh can either be some arbitrary smallish number, or an expression involving eps, which gives the precision of the numerical type you're working with.
EDIT:
Thanks to Jim and Tal for suggested improvement. Altered to compare the absolute value of the difference to a threshold, rather than the difference.
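For example, a sketch of that comparison in MATLAB (the choice of tolerance is up to you and your problem):

```matlab
x = 2*0.04;
y = 0.5*0.4^2;
tol = 4*eps(max(abs(x), abs(y)));   % a few ulps at the scale of the operands
if abs(x - y) < tol
    disp('equal to within numerical precision')
end
```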
Matlab uses double-precision floating-point numbers to store real numbers. These are numbers of the form m*2^e where m is an integer between 2^52 and 2^53 (the mantissa) and e is the exponent. Let's call a number a floating-point number if it is of this form.
All numbers used in calculations must be floating-point numbers. Often, this can be done exactly, as with 2 and 0.5 in your expression. But for other numbers, most notably most numbers with digits after the decimal point, this is not possible, and an approximation has to be used. What happens in this case is that the number is rounded to the nearest floating-point number.
So, whenever you write something like 0.04 in Matlab, you're really saying "Get me the floating-point number that is closest to 0.04." In your expression, there are 2 numbers that need to be approximated: 0.04 and 0.4.
In addition, the exact result of operations like addition and multiplication on floating-point numbers may not be a floating-point number. Although it is always of the form m*2^e, the mantissa may be too large. So you get an additional error from rounding the results of operations.
At the end of the day, a simple expression like yours will be off by about 2^-52 times the size of the operands, or about 10^-17.
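If you want to see this directly, printing more digits shows the approximations involved (just a quick check; the trailing digits follow from the IEEE 754 format):

```matlab
fprintf('%.20f\n', 0.04)                 % 0.04000000000000000083
fprintf('%.20f\n', 0.4)                  % 0.40000000000000002220
fprintf('%.20g\n', 2*0.04 - 0.5*0.4^2)   % roughly -1.4e-17, i.e. about 2^-52 of the operand size
```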
In summary: the reason your expression does not evaluate to zero is two-fold:
Some of the numbers you start out with are only approximations of the exact decimal numbers you typed.
The intermediate results may also be approximations of the exact results.
What you are seeing is quantization error. Matlab uses doubles to represent numbers, and while they are capable of a lot of precision, they still cannot represent all real numbers: a double has only finitely many bit patterns, while there are infinitely many real numbers. I'm not sure about Aabaz's trick, but in general I would say there isn't anything you can do, other than perhaps massaging your inputs to be double-friendly numbers.
I do not know if it is applicable to your problem but often the simplest solution is to scale your data.
For example:
a=0.04;
b=0.2;
a-0.2*b                   % mathematically zero, but the rounded doubles leave a tiny residue
ans=-6.9389e-018
c=a/min(abs([a b]));      % rescale both values by the smaller magnitude
d=b/min(abs([a b]));
c-0.2*d                   % after rescaling, the rounding errors happen to cancel exactly
ans=0
EDIT: of course I did not mean to give a universal solution to this kind of problem, but it is still a good practice that can help you avoid a few problems in numerical computation (curve fitting, etc.). See Jim Clay's answer for the reason why you are experiencing these problems.
I'm pretty sure this is a case of ye olde floating point accuracy issues.
Do you need 1e-17 accuracy? Is this merely a case of wanting 'pretty' output?
In that case, you can just use a formatted sprintf to display the number of significant digits you want.
Realize that this is not a matlab problem, but a fundamental limitation of how numbers are represented in binary.
For fun, work out what .1 is in binary...
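For instance, a minimal sketch of the sprintf idea (the format width is just an example):

```matlab
x = 2*0.04 - 0.5*0.4^2;
sprintf('%.6f', x)    % '-0.000000': the 1e-17 residue vanishes at 6 decimal places
```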
Some references:
http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems
http://www.mathworks.com/support/tech-notes/1100/1108.html
I have a program where I deal with a lot of very small numbers (towards the lower end of the Double limits).
During the execution of my application, some of these numbers progressively get smaller meaning their "estimation" is less accurate.
My solution at the moment is scaling them up before I do any calculations and then scaling them back down again?
...but it's got me thinking, am I actually gaining any more "accuracy" by doing this?
Thoughts?
Are your numbers really in the region between 10^-308 (smallest normalized double) and 10^-324 (smallest representable double, denormalized i.e. losing precision)? If so, then by scaling them up you do indeed gain accuracy by working around the limits of the exponent range of the double type.
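A quick way to see the precision loss in the denormal range (a sketch; eps(x) gives the spacing of doubles around x):

```matlab
eps(1)      / 1         % 2.2e-16: relative spacing of doubles near 1
eps(1e-300) / 1e-300    % still on the order of 1e-16: tiny value, but full precision
eps(1e-310) / 1e-310    % ~4.9e-14: below realmin (~2.2e-308) precision degrades
```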
I have to wonder though: what kind of application deals with numbers that extremely small? I know of no physical discipline that needs anything like that.
A double has a fixed number of bits for the significant digits (the mantissa), and another fixed number of bits to represent the exponent (the "power"-part).
In fact you may, therefore, have two issues:
Regarding the power-part: that is what approaching the limit of small doubles is about.
Scaling the numbers up (by powers of 2) helps prevent your number from becoming unrepresentable.
When you write about the accuracy of "estimation", I assume you refer to the number of significant digits: that is not related to the small-number limit. A number that is very small, but not too small in the sense of the lower limit for doubles, has the same number of significant digits as any "more normal" number.
Concerns about numerical precision of a number should, generally speaking, focus on how the number is computed, rather than on the absolute size of the result.