Integer division in Scala [duplicate]

(note: not the same as this other question since the OP never explicitly specified rounding towards 0 or -Infinity)
JLS 15.17.2 says that integer division rounds towards zero. If I want floor()-like behavior for positive divisors (I don't care about the behavior for negative divisors), what's the simplest way to achieve this that is numerically correct for all inputs?
int ifloor(int n, int d)
{
    /* returns q such that n = d*q + r where 0 <= r < d
     * for all integers n, d where d > 0
     *
     * d = 0 should have the same behavior as `n/d`
     *
     * nice-to-have behaviors for d < 0:
     * option (a). same as above:
     *   returns q such that n = d*q + r where 0 <= r < -d
     * option (b). rounds towards +infinity:
     *   returns q such that n = d*q + r where d < r <= 0
     */
}

long lfloor(long n, long d)
{
    /* same behavior as ifloor, except for long integers */
}
(update: I want to have a solution both for int and long arithmetic.)

If you can use third-party libraries, Guava has this: IntMath.divide(int, int, RoundingMode.FLOOR) and LongMath.divide(long, long, RoundingMode.FLOOR). (Disclosure: I contribute to Guava.)
If you don't want to use a third-party library for this, you can still look at the implementation.

(I'm doing everything for longs since the answer for ints is the same, just substitute int for every long and Integer for every Long.)
You could just Math.floor a double division result, otherwise...
Original answer:
return n/d - ( ( n % d != 0 ) && ( (n<0) ^ (d<0) ) ? 1 : 0 );
Optimized answer:
public static long lfloordiv( long n, long d ) {
    long q = n / d;
    if (q * d == n) return q;                    // division was exact: no adjustment needed
    return q - ((n ^ d) >>> (Long.SIZE - 1));    // subtract 1 iff the signs of n and d differ
}
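A quick check of the boundary behavior (a hypothetical main, assuming lfloordiv is in scope):

public static void main(String[] args) {
    System.out.println(lfloordiv(7, 2));    // 3
    System.out.println(lfloordiv(-7, 2));   // -4 (plain -7/2 would give -3)
    System.out.println(lfloordiv(-8, 2));   // -4 (exact, no adjustment)
}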
(For completeness, using a BigDecimal with a ROUND_FLOOR rounding mode is also an option.)
New edit: Now I'm just trying to see how far it can be optimized for fun. Using Mark's answer the best I have so far is:
public static long lfloordiv2( long n, long d ){
    if (d >= 0) {    // normalize so that d <= 0; n/d is unchanged by negating both
        n = -n;      // caution: overflows for n == Long.MIN_VALUE
        d = -d;
    }
    long tweak = (n > 0) ? -1 : 0;   // -1 when the quotient must round towards -infinity, else 0
    return (n + tweak) / d + tweak;
}
(Uses cheaper operations than the above, at the cost of slightly longer bytecode.)

There's a rather neat formula for this that works when n < 0 and d > 0: take the bitwise complement of n, do the division, and then take the bitwise complement of the result.
int ifloordiv(int n, int d)
{
    if (n >= 0)
        return n / d;
    else
        return ~(~n / d);   // ~x == -x - 1, so this rounds towards -infinity instead of towards zero
}
For the remainder, a similar construction works (compatible with ifloordiv in the sense that the usual invariant ifloordiv(n, d) * d + ifloormod(n, d) == n is satisfied) giving a result that's always in the range [0, d).
int ifloormod(int n, int d)
{
    if (n >= 0)
        return n % d;
    else
        return d + ~(~n % d);
}
For negative divisors, the formulas aren't quite so neat. Here are expanded versions of ifloordiv and ifloormod that follow your 'nice-to-have' behavior option (b) for negative divisors.
int ifloordiv(int n, int d)
{
    if (d >= 0)
        return n >= 0 ? n / d : ~(~n / d);
    else
        return n <= 0 ? n / d : (n - 1) / d - 1;
}

int ifloormod(int n, int d)
{
    if (d >= 0)
        return n >= 0 ? n % d : d + ~(~n % d);
    else
        return n <= 0 ? n % d : d + 1 + (n - 1) % d;
}
For d < 0, there's an unavoidable problem case when d == -1 and n is Integer.MIN_VALUE, since then the mathematical result overflows the type. In that case, the formula above returns the wrapped result, just as the usual Java division does. As far as I'm aware, this is the only corner case where we silently get 'wrong' results.
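A brute-force check of that invariant over a small range (hypothetical test code, not part of the answer):

for (int n = -20; n <= 20; n++) {
    for (int d = -5; d <= 5; d++) {
        if (d == 0) continue;
        assert ifloordiv(n, d) * d + ifloormod(n, d) == n;
    }
}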

return BigDecimal.valueOf(n).divide(BigDecimal.valueOf(d), RoundingMode.FLOOR).longValue();
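Spelled out as a complete method (a sketch; the method name is mine):

import java.math.BigDecimal;
import java.math.RoundingMode;

static long lfloordivBig(long n, long d) {
    // exact decimal division, rounded towards negative infinity, then narrowed back to long
    return BigDecimal.valueOf(n).divide(BigDecimal.valueOf(d), RoundingMode.FLOOR).longValue();
}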

Related

Hash function that returns the same hash for a sum even if different terms lead to the same sum

let's say I have:
n = 14
n is the result of the following sums of integers:
[5, 2, 7] -> 5 + 2 + 7 = 14 = n
[3, 4, 5, 2] -> 3 + 4 + 5 + 2 = 14 = n
[1, 13] -> 1 + 13 = 14 = n
[13, 1] -> 13 + 1 = 14 = n
[4, 3, 5, 2] -> 4 + 3 + 5 + 2 = 14 = n
...
I would need a hash function h so that:
h([5, 2, 7]) = h([3, 4, 5, 2]) = h([1, 13]) = h([13, 1]) = h([4, 3, 5, 2]) = h(...)
I.e. the order of the integer terms doesn't matter, and as long as their integer sum is the same, their hash should also be the same.
I need to do this without computing the sum n, because the terms as well as n can be very large and easily overflow (they don't fit in an int), which is why I am asking this question.
Are you aware of, or do you have an insight into, how I can implement such a hash function?
Given a list/sequence of integers, this hash function must return the same hash whenever the sums of the integers are the same, but without computing the sum.
Thank you for your attention.
EDIT: I elaborated on #derpirscher's answer and modified his function a bit further as I had collisions on multiples of BIG_PRIME (this example is in JavaScript):
function hash(seq) {
    const BIG_PRIME = 999999999989;
    const MAX_SAFE_INTEGER_DIV_2_FLOOR = Math.floor(Number.MAX_SAFE_INTEGER / 2);
    let h = 0;
    for (let i = 0; i < seq.length; i++) {
        let value = seq[i];
        if (h > MAX_SAFE_INTEGER_DIV_2_FLOOR) {
            h = h % BIG_PRIME;
        }
        if (value > MAX_SAFE_INTEGER_DIV_2_FLOOR) {
            value = value % BIG_PRIME;
        }
        h += value;
    }
    return h;
}
My question now would be: what do you think about this function? Are there some edge cases I didn't take into account?
Thank you.
EDIT 2:
Using the above function hash([1,2]); and hash([4504 * BIG_PRIME +1, 4504 * BIG_PRIME + 2]) will collide as mentioned by #derpirscher.
Here is another modified version of the above function, which applies the modulo % BIG_PRIME to only one of the two terms when either of them is greater than MAX_SAFE_INTEGER_DIV_2_FLOOR:
function hash(seq) {
    const BIG_PRIME = 999999999989;
    const MAX_SAFE_INTEGER_DIV_2_FLOOR = Math.floor(Number.MAX_SAFE_INTEGER / 2);
    let h = 0;
    for (let i = 0; i < seq.length; i++) {
        let value = seq[i];
        if (
            h > MAX_SAFE_INTEGER_DIV_2_FLOOR ||
            value > MAX_SAFE_INTEGER_DIV_2_FLOOR
        ) {
            // reduce only one of the two terms, preferring h
            if (h > MAX_SAFE_INTEGER_DIV_2_FLOOR) {
                h = h % BIG_PRIME;
            } else if (value > MAX_SAFE_INTEGER_DIV_2_FLOOR) {
                value = value % BIG_PRIME;
            }
        }
        h += value;
    }
    return h;
}
I think this version lowers the number of collisions a bit further.
What do you think? Thank you.
EDIT 3:
Even though I tried to elaborate on #derpirscher's answer, his implementation of hash is the correct one and the one to use.
Use his version if you need such a hash function.
You could calculate the sum modulo some big prime. If you want to stay within the range of int, you need to know the maximum integer in the language you are using, and then select a BIG_PRIME just below maxint / 2.
Assuming an int to be 4 bytes, maxint = 2147483647, thus the biggest prime < maxint/2 would be 1073741789:
int hash(int[] seq) {
    final int BIG_PRIME = 1073741789;
    int h = 0;
    for (int i = 0; i < seq.length; i++) {
        h = (h + seq[i] % BIG_PRIME) % BIG_PRIME;
    }
    return h;
}
As both summands are always below maxint/2 at every step, you won't get any overflows.
Edit
From a mathematical point of view, the following property which may be important for your use case holds:
(a + b + c + ...) % N == (a % N + b % N + c % N + ...) % N
But yeah, of course, as with every hash function you will have collisions. You can't have a hash function without collisions, because the size of the domain of the hash function (ie the number of possible input values) is generally much bigger than the size of the codomain (ie the number of possible output values).
For your example the size of the domain is (in principle) infinite, as you can have any count of numbers from 1 to 2000000000 in your sequence. But your codomain is just ~2000000000 elements (ie the range of int).
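A quick illustration of that modular property in Java (a standalone sketch, not from the answer):

public class ModSumHashDemo {
    static final int BIG_PRIME = 1073741789;

    static int hash(int[] seq) {
        int h = 0;
        for (int v : seq) h = (h + v % BIG_PRIME) % BIG_PRIME;
        return h;
    }

    public static void main(String[] args) {
        int[] a = { 5, 2, 7 };   // sums to 14
        int[] b = { 13, 1 };     // also sums to 14
        System.out.println(hash(a) == hash(b));   // true
    }
}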

Fastest way to get the first prime numbers up to a limit

What is the fastest way in terms of CPU to get the first prime numbers up to a limit?
It takes 4.8 s on my machine to sieve all the primes up to 1,000,000,000. It's even faster than reading them from a file!
public static void main(String[] args) {
    long startingTime = System.nanoTime();
    int limit = 1_000_000_000;
    BitSet bitSet = findPrimes(limit);
    System.out.println(2);
    System.out.println(3);
    System.out.println(5);
    System.out.println(7);
    limit = limit / 2 + 1;
    for (int i = 6; i < limit; i++) {
        if (!bitSet.get(i)) {
            int p = i * 2 - 1;   // index i encodes the odd number 2*i - 1
            //System.out.println(p);
        }
    }
    System.out.println("Done in " + (System.nanoTime() - startingTime) / 1_000_000 + " milliseconds");
}

public static BitSet findPrimes(int limit) {
    BitSet bitSet = new BitSet(limit / 2 + 1);
    int size = (int) Math.sqrt(limit);
    size += size % 2;
    for (int prime = 3; prime < size; prime += 2) {
        if (bitSet.get(prime / 2 + 1)) {
            continue;   // already marked composite
        }
        // start marking at prime*prime (odd multiples only); starting at the
        // prime's own index would wrongly mark the prime itself as composite
        for (int i = prime * prime / 2 + 1; i < bitSet.size(); i += prime) {
            bitSet.set(i);
        }
    }
    return bitSet;
}
The most efficient way of computing primes is to use a Sieve of Eratosthenes, first discovered in ancient Greece over two millennia ago.
The problem, however, is that it requires a lot of memory. My Java app simply crashes when I declare a boolean array of 1,000,000,000 elements.
So I made two memory optimizations to make it work:
Since all the primes after 2 are odd numbers, I halve the array size by mapping index i to the odd number p = i * 2 - 1.
Then, instead of using an array of booleans, I use a BitSet, which works with bit operations on an array of longs.
With those two optimizations, the Sieve of Eratosthenes computes the BitSet in 4.8 seconds. As for printing it out, that's another story ;)
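The index mapping used by both optimizations is worth spelling out (a small sketch; the helper names are mine):

static int indexOf(int p) { return (p + 1) / 2; }   // odd number -> bit index: 3 -> 2, 5 -> 3, 7 -> 4, ...
static int numberAt(int i) { return i * 2 - 1; }    // bit index -> odd number (the inverse mapping)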
There are a number of ways to quickly calculate primes -- the 'fastest' will ultimately depend on many factors (combination of hardware, algorithm, data format, etc.). This has already been answered here: Fastest way to list all primes below N. However, that is for Python, and the question becomes more interesting when we consider C (or other lower-level languages).
You can see:
https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
https://en.wikipedia.org/wiki/Sieve_of_Atkin
Asymptotically, the Sieve of Atkin has better computational complexity (O(N) or lower -- see the notes at the end), but in practice the Sieve of Eratosthenes (O(N log log N)) is perfectly fine, and it has the advantage of being extremely easy to implement:
#include <stdbool.h>

/* Calculate 'isprime[i]' to tell whether 'i' is prime, for all 0 <= i < N */
void primes(int N, bool* isprime) {
    int i, j;
    /* first, set odd numbers to prime, evens to not-prime */
    for (i = 0; i < N; ++i) isprime[i] = i % 2 == 1;
    /* special cases for i<3 */
    isprime[0] = isprime[1] = false;
    isprime[2] = true;
    /* Compute square root via Newton's method
     * (this can be replaced by 'i * i <= N' in the for loop)
     */
    int sqrtN = (N >> 10) + 1024;
    for (i = 0; i < 32; ++i) sqrtN = (N / sqrtN + sqrtN) / 2;
    /* iterate through all odd numbers */
    for (i = 3; i <= sqrtN /* or: i*i <= N */; i += 2) {
        /* check if this odd number is still prime (i.e. if it has not been marked off) */
        if (isprime[i]) {
            /* mark off multiples of the prime */
            j = 2 * i;
            isprime[j] = false;
            for (j += i; j < N; j += 2 * i) {
                isprime[j] = false;
            }
        }
    }
}
That code is pretty clear and concise, but takes around ~6 s to compute primes < 1_000_000_000 (1 billion), and uses N * sizeof(bool) == N == 1000000000 bytes of memory. Not great.
We can use bit-fiddling and the uint64_t type (#include <stdint.h>) to produce code that runs with the same complexity, but hopefully much faster and with less memory. Right off the bat we can reduce the memory to N/8 (1 bit per boolean instead of 1 byte). Additionally, we can track only odd numbers and shave off another factor of 2, taking the final usage to N/16 bytes.
Here's the code:
/* Calculate the ((i/2) % 64)'th bit of 'isprime[(i/2) / 64]' to tell whether 'i' is prime,
 * for all 0 <= i < N with 'i % 2 == 1'.
 *
 * The array 'isprime' can be seen as one long bitstring:
 *
 * isprime:
 *  [0]       [1]
 * +---------+---------+
 * |01234....|01234....| ...
 * +---------+---------+
 *
 * where each block is a 64 bit integer. Therefore, we can map the odd natural
 * numbers to it with the following formula:
 *   i -> index=((i / 2) / 64), bit=((i / 2) % 64)
 *
 * So, '1' is located at 'isprime[0][0]' (where the second index represents the bit),
 * '3' is at 'isprime[0][1]', 5 at 'isprime[0][2]', and finally, 129 is at 'isprime[1][0]'.
 *
 * And so forth.
 *
 * Then, we use that bit as a boolean to indicate whether the 'i' that maps to it is prime.
 */
void primes_bs(int N, uint64_t* isprime) {
    uint64_t i, j, b /* block */, c /* bit */;
    /* length of the array is roundup(N / 128) */
    int len = (N + 127) / 128;
    /* set all bits to 'prime' */
    for (i = 0; i < len; ++i) isprime[i] = ~0ULL;
    /* set i==1 to not prime */
    isprime[0] &= ~1ULL;
    /* compute square root via Newton's method */
    int sqrtN = (N >> 10) + 1024;
    for (i = 0; i < 32; ++i) sqrtN = (N / sqrtN + sqrtN) / 2;
    /* iterate through every word/chunk and handle its bits */
    uint64_t chunk;
    for (b = 0; b <= (sqrtN + 127) / 128; ++b) {
        chunk = isprime[b];
        c = 0;
        while (chunk && c < 64) {
            if (chunk & 1ULL) {
                /* hot bit, so is prime */
                i = 128 * b + 2 * c + 1;
                /* iterate 3i, 5i, 7i, etc...
                 * BUT, j is the index, so is basically 3i/2, 5i/2, 7i/2;
                 * that way we don't need as many divisions per iteration,
                 * and we can use 'j += i' (since the index only needs 'i'
                 * added to it, whereas the value needs '2i')
                 */
                for (j = 3 * i / 2; j < N / 2; j += i) {
                    isprime[j / 64] &= ~(1ULL << (j % 64));
                }
            }
            /* 'chunk' will not modify itself */
            c++;
            chunk = isprime[b] >> c;
        }
    }
    /* set extra bits to 0 -- unused */
    if (N % 128 != 0) isprime[len - 1] &= (1ULL << ((N / 2) % 64)) - 1;
}
This code calculates primality for all numbers less than 1000000000 (1_000_000_000) in ~2.7 s on my machine, which is a great improvement! And it uses much less memory (1/16th of what the other method uses).
However, if you're using it in other code, the interface is trickier -- you can use a helper function:
/* Calculate whether 'x' is prime from the dense bitset */
bool calc_is_prime(uint64_t* isprime, uint64_t x) {
    /**/ if (x == 2) return true;   /* must be special-cased, since 'isprime' only encodes odd numbers */
    else if (x % 2 == 0) return false;
    return (isprime[(x / 2) / 64] >> ((x / 2) % 64)) & 1ULL;
}
You could go even further and generalize this method of mapping indices beyond odd-only values (https://en.wikipedia.org/wiki/Wheel_factorization), but the resulting code may well be slower -- since the inner loops and operations are no longer powers of 2, you either have to have a table lookup or some very clever trickery. The only case when you would want to use wheel factorization is if you were very memory constrained, in which case a wheel of primorial(3) == 30 would allow you to store 30 numbers in every 8 bits (there are 8 residues coprime to 30, as opposed to 2 numbers for every 1 bit), and a wheel of primorial(5) == 2310 would store 2310 numbers in every 480 bits (phi(2310) == 480); a quick way to sanity-check those counts appears below.
However, large wheel factorizations will, again, change some of the instructions from bit shifts and power-of-two operations to table lookups, and for large wheels this could actually end up being much slower.
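To sanity-check the residue counts above (a throwaway sketch, in Java for brevity; not part of the original answer):

public class WheelCount {
    static int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

    public static void main(String[] args) {
        int count = 0;
        for (int r = 0; r < 30; r++) {
            if (gcd(r, 30) == 1) count++;   // residues a mod-30 wheel must track
        }
        System.out.println(count);          // prints 8
    }
}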
NOTES:
Technically, versions of the Sieve of Atkin exist which are O(N / (log log N)), but they no longer use the same dense array (if they did, the complexity would have to be at least O(N)).

How to perform Modulo greatest common divisor?

Suppose that gcd(e, m) = g. Find an integer d such that (e * d) = g mod m,
where m and e are greater than or equal to 1.
The problem seems to be solvable algebraically, but when I've tried doing it, it doesn't always give me an integer: sometimes the solution for d is an integer and sometimes it isn't. How can I approach this problem?
d can be computed with the extended Euclidean algorithm, see e.g. here:
https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm
The a,b on that page are your e,m, and your d will be the x.
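For instance, a minimal Java sketch of that algorithm (the helper names are mine, not from the answer):

// returns {g, x, y} with e*x + m*y == g == gcd(e, m)
static long[] egcd(long a, long b) {
    if (b == 0) return new long[] { a, 1, 0 };
    long[] r = egcd(b, a % b);
    return new long[] { r[0], r[2], r[1] - (a / b) * r[2] };
}

static long solveD(long e, long m) {
    long x = egcd(e, m)[1];      // the Bezout coefficient of e
    return ((x % m) + m) % m;    // normalize d into [0, m)
}

Usage: solveD(4, 6) returns 5, and indeed 4 * 5 = 20 = 2 mod 6, where gcd(4, 6) = 2.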
Perhaps you are assuming that both e and m are integers, but the problem allows them to be non-integers? There is only one case that gives an integer solution when both e and m are integers.
Why strictly integer output is not a reasonable outcome if e != m:
When you look at a fraction like 3/7 say, and refer to its denominator as the numerator's "divisor", this is a loose sense of the word from a classical math-y perspective. When you talk about the gcd (greatest common divisor), the "d" refers to an integer that divides the numerator (an integer) evenly, resulting in another integer: 4 is a divisor of 8, because 8/4 = 2 and 2 is an integer. A computer science or discrete mathematics perspective might frame a divisor as a number d that for a given number a gives 0 when we take a % d (a mod d for discrete math). Can you see that the absolute value of a divisor can't exceed the absolute value of the numerator? If it did, you would get pieces of pie, instead of whole pies - example:
4 % a == 0 for a in Z (Z being the set of integers) while |a| <= 4 (in math-y notation, that set is: {a ∈ Z : |a| <= 4}), but
4 % a != 0 for a in Z while |a| > 4 (math-y: {a ∈ Z : |a| > 4}), because when we divide 4 by stuff bigger than it, like 5, we get fractions (i.e. |4/a| < 1 when |a| > 4). Don't worry too much about the absolute value stuff if it throws you off - it is there to account for working with negative numbers, since they are integers as well.
So, even the "greatest" of divisors for any given integer will be smaller than the integer. Otherwise it's not a divisor (see above, or Wikipedia on divisors).
Look at gcd(e, m) = g:
By the definition of % (mod for math people), for any two numbers number1 and number2, number1 % number2 never makes number1 bigger: number1 % number2 <= number1.
So substitute: (e * d) = g % m --> (e * d) <= g
By the paragraphs above and definition of gcd being a divisor of both e and m: g <= e, m.
To make (e * d) <= g such that d, g are both integers, knowing that g <= e since g is a divisor of e, we have to make the left side smaller to match g. You can only make an integer smaller through multiplication if the other multiplicand is 0 or a fraction. The problem specifies that d is an integer, so we have one case that works - the d = 0 case - and infinitely many that give a contradiction - a contradiction with e, m, and d all being integers.
If e == m:
This is the d = 0 case:
If e == m, then gcd(e, m) = e = m - example: greatest common divisor of 3 and 3 is 3
Then (e * d) = g % m is (e * d) = m % m and m % m = 0 so (e * d) = 0 implying d = 0
How to code a function that will find d when either of e or m might be NON-integer:
A lot of divisor problems are done iteratively, like "find the gcd" or "find a prime number". That works in part because those problems deal strictly with integers, which can be stepped through one by one. With this problem, we need to allow e or m to be non-integer in order to have a solution for cases other than e = m. The rational numbers, however, cannot be stepped through in order (between any two of them there are infinitely many more), so an iterative search would never terminate. With this problem, you really just want a formula, and possibly some cases. You might set it up like this:
if e == m
    return 0   # since (e * d) = m % m -> d = 0
else
    return g / e
Lastly:
Another thing that might be useful depending on what you do with this problem is the fact that the right-hand-side is always either g or 0, because g <= m since g is a divisor of m (see all the stuff above). In the cases where g < m, g % m = g. In the case where g == m, g % m = 0.
The #asp answer with the link to the Wikipedia page on the Euclidean Algorithm is good.
The #aidenhjj comment about trying the math-specific version of StackOverflow is good.
In case this is for a math class and you aren't used to coding: <=, >=, ==, and != are computer speak for ≤, ≥, "are equal", and "not equal" respectively.
Good luck.

Best Way to Add 3 Numbers (or 4, or N) in Java - Kahan Sums?

I found a completely different answer to this question, so the whole original question makes no sense anymore. However, the answer may be useful, so I have modified it a bit...
I want to sum up three double numbers, say a, b, and c, in the most numerically stable way possible.
I think using a Kahan Sum would be the way to go.
However, a strange thought occurred to me: would it make sense to:
First sum up a, b, and c and remember the (absolute value of the) compensation.
Then sum up a, c, b.
If the (absolute value of the) compensation of the second sum is smaller, use this sum instead.
Proceed similarly with b, a, c and the other permutations of the numbers.
Return the sum with the smallest associated absolute compensation.
Would I get a more "stable" addition of three numbers this way? Or does the order of the numbers in the sum have no (usable) impact on the compensation left at the end of the summation? By (usable) I mean to ask whether the compensation value itself is stable enough to contain information that I can use.
(I am using the Java programming language, although I think this does not matter here.)
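For reference, this is the textbook Kahan summation I mean (a minimal Java sketch; the variable c is the "compensation" referred to above):

static double kahanSum(double... values) {
    double sum = 0.0;
    double c = 0.0;            // running compensation for lost low-order bits
    for (double v : values) {
        double y = v - c;      // correct the next summand by the previous error
        double t = sum + y;    // low-order bits of y may be lost here
        c = (t - sum) - y;     // algebraically recover what was lost
        sum = t;
    }
    return sum;
}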
Many thanks,
Thomas.
I think I found a much more reliable way to solve the "Add 3" (or "Add 4", or "Add N") numbers problem.
First of all, I implemented my idea from the original post. It resulted in quite a lot of code which, initially, seemed to work. However, it failed in the following case: add Double.MAX_VALUE, 1, and -Double.MAX_VALUE. The result was 0.
#njuffa's comments inspired me to dig somewhat deeper, and at http://code.activestate.com/recipes/393090-binary-floating-point-summation-accurate-to-full-p/ I found that this problem has been solved quite nicely in Python. To see the full code, I downloaded the Python source (Python 3.5.1rc1 - 2015-11-23) from https://www.python.org/getit/source/, where we can find the following method (under the PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2):
static PyObject*
math_fsum(PyObject *self, PyObject *seq)
{
    PyObject *item, *iter, *sum = NULL;
    Py_ssize_t i, j, n = 0, m = NUM_PARTIALS;
    double x, y, t, ps[NUM_PARTIALS], *p = ps;
    double xsave, special_sum = 0.0, inf_sum = 0.0;
    volatile double hi, yr, lo;

    iter = PyObject_GetIter(seq);
    if (iter == NULL)
        return NULL;

    PyFPE_START_PROTECT("fsum", Py_DECREF(iter); return NULL)

    for(;;) {           /* for x in iterable */
        assert(0 <= n && n <= m);
        assert((m == NUM_PARTIALS && p == ps) ||
               (m > NUM_PARTIALS && p != NULL));

        item = PyIter_Next(iter);
        if (item == NULL) {
            if (PyErr_Occurred())
                goto _fsum_error;
            break;
        }
        x = PyFloat_AsDouble(item);
        Py_DECREF(item);
        if (PyErr_Occurred())
            goto _fsum_error;

        xsave = x;
        for (i = j = 0; j < n; j++) {       /* for y in partials */
            y = p[j];
            if (fabs(x) < fabs(y)) {
                t = x; x = y; y = t;
            }
            hi = x + y;
            yr = hi - x;
            lo = y - yr;
            if (lo != 0.0)
                p[i++] = lo;
            x = hi;
        }

        n = i;                              /* ps[i:] = [x] */
        if (x != 0.0) {
            if (! Py_IS_FINITE(x)) {
                /* a nonfinite x could arise either as
                   a result of intermediate overflow, or
                   as a result of a nan or inf in the
                   summands */
                if (Py_IS_FINITE(xsave)) {
                    PyErr_SetString(PyExc_OverflowError,
                                    "intermediate overflow in fsum");
                    goto _fsum_error;
                }
                if (Py_IS_INFINITY(xsave))
                    inf_sum += xsave;
                special_sum += xsave;
                /* reset partials */
                n = 0;
            }
            else if (n >= m && _fsum_realloc(&p, n, ps, &m))
                goto _fsum_error;
            else
                p[n++] = x;
        }
    }

    if (special_sum != 0.0) {
        if (Py_IS_NAN(inf_sum))
            PyErr_SetString(PyExc_ValueError,
                            "-inf + inf in fsum");
        else
            sum = PyFloat_FromDouble(special_sum);
        goto _fsum_error;
    }

    hi = 0.0;
    if (n > 0) {
        hi = p[--n];
        /* sum_exact(ps, hi) from the top, stop when the sum becomes
           inexact. */
        while (n > 0) {
            x = hi;
            y = p[--n];
            assert(fabs(y) < fabs(x));
            hi = x + y;
            yr = hi - x;
            lo = y - yr;
            if (lo != 0.0)
                break;
        }
        /* Make half-even rounding work across multiple partials.
           Needed so that sum([1e-16, 1, 1e16]) will round-up the last
           digit to two instead of down to zero (the 1e-16 makes the 1
           slightly closer to two). With a potential 1 ULP rounding
           error fixed-up, math.fsum() can guarantee commutativity. */
        if (n > 0 && ((lo < 0.0 && p[n-1] < 0.0) ||
                      (lo > 0.0 && p[n-1] > 0.0))) {
            y = lo * 2.0;
            x = hi + y;
            yr = x - hi;
            if (y == yr)
                hi = x;
        }
    }
    sum = PyFloat_FromDouble(hi);

_fsum_error:
    PyFPE_END_PROTECT(hi)
    Py_DECREF(iter);
    if (p != ps)
        PyMem_Free(p);
    return sum;
}
This summation method differs from Kahan's method in that it uses a variable number of compensation variables: when adding the i-th number, at most i additional compensation variables (stored in the array p) get used. This means that if I want to add 3 numbers, I may need 3 additional variables; for 4 numbers, I may need 4 additional variables. Since the number of used variables increases from n to n+1 only after the n-th summand is loaded, I can translate the above code to Java as follows:
/**
 * Compute the exact sum of the values in the given array
 * {@code summands} while destroying the contents of said array.
 *
 * @param summands
 *          the summand array -- will be summed up and destroyed
 * @return the accurate sum of the elements of {@code summands}
 */
private static final double __destructiveSum(final double[] summands) {
    int i, j, n;
    double x, y, t, xsave, hi, yr, lo;
    boolean ninf, pinf;

    n = 0;
    lo = 0d;
    ninf = pinf = false;

    for (double summand : summands) {
        xsave = summand;
        for (i = j = 0; j < n; j++) {
            y = summands[j];
            if (Math.abs(summand) < Math.abs(y)) {
                t = summand;
                summand = y;
                y = t;
            }
            hi = summand + y;
            yr = hi - summand;
            lo = y - yr;
            if (lo != 0.0) {
                summands[i++] = lo;
            }
            summand = hi;
        }

        n = i; /* ps[i:] = [summand] */
        if (summand != 0d) {
            if ((summand > Double.NEGATIVE_INFINITY)
                    && (summand < Double.POSITIVE_INFINITY)) {
                summands[n++] = summand; // all finite, good, continue
            } else {
                if (xsave <= Double.NEGATIVE_INFINITY) {
                    if (pinf) {
                        return Double.NaN;
                    }
                    ninf = true;
                } else {
                    if (xsave >= Double.POSITIVE_INFINITY) {
                        if (ninf) {
                            return Double.NaN;
                        }
                        pinf = true;
                    } else {
                        return Double.NaN;
                    }
                }
                n = 0;
            }
        }
    }

    if (pinf) {
        return Double.POSITIVE_INFINITY;
    }
    if (ninf) {
        return Double.NEGATIVE_INFINITY;
    }

    hi = 0d;
    if (n > 0) {
        hi = summands[--n];
        /*
         * sum_exact(ps, hi) from the top, stop when the sum becomes inexact.
         */
        while (n > 0) {
            x = hi;
            y = summands[--n];
            hi = x + y;
            yr = hi - x;
            lo = y - yr;
            if (lo != 0d) {
                break;
            }
        }
        /*
         * Make half-even rounding work across multiple partials. Needed so
         * that sum([1e-16, 1, 1e16]) will round-up the last digit to two
         * instead of down to zero (the 1e-16 makes the 1 slightly closer to
         * two). With a potential 1 ULP rounding error fixed-up, math.fsum()
         * can guarantee commutativity.
         */
        if ((n > 0) && (((lo < 0d) && (summands[n - 1] < 0d)) || //
                ((lo > 0d) && (summands[n - 1] > 0d)))) {
            y = lo * 2d;
            x = hi + y;
            yr = x - hi;
            if (y == yr) {
                hi = x;
            }
        }
    }
    return hi;
}
This function will take the array summands and add up the elements while simultaneously using it to store the compensation variables. Since we load the summand at index i before the array element at said index may become used for compensation, this will work.
Since the array will be small if the number of variables to add is small and won't escape the scope of our method, I think there is a decent chance that it will be allocated directly on the stack by the JIT, which may make the code quite fast.
I admit that I did not fully understand why the authors of the original code handled infinities, overflows, and NaNs the way they did. Here my code deviates from the original. (I hope I did not mess it up.)
Either way, I can now sum up 3, 4, or n double numbers by doing:
public static final double add3(final double x0, final double x1,
        final double x2) {
    return __destructiveSum(new double[] { x0, x1, x2 });
}

public static final double add4(final double x0, final double x1,
        final double x2, final double x3) {
    return __destructiveSum(new double[] { x0, x1, x2, x3 });
}
If I want to sum up 3 or 4 long numbers and obtain the precise result as a double, I have to deal with the fact that a double can only represent a long exactly in the range -9007199254740992..9007199254740992 (i.e., up to 2^53 in magnitude). But this can easily be done by splitting each long into two parts:
public static final double add3(final long x0, final long x1,
        final long x2) {
    double lx;
    return __destructiveSum(new double[] { //
        lx = x0, //
        (x0 - ((long) lx)), //
        lx = x1, //
        (x1 - ((long) lx)), //
        lx = x2, //
        (x2 - ((long) lx)), //
    });
}

public static final double add4(final long x0, final long x1,
        final long x2, final long x3) {
    double lx;
    return __destructiveSum(new double[] { //
        lx = x0, //
        (x0 - ((long) lx)), //
        lx = x1, //
        (x1 - ((long) lx)), //
        lx = x2, //
        (x2 - ((long) lx)), //
        lx = x3, //
        (x3 - ((long) lx)), //
    });
}
I think this should be about right. At least I can now add Double.MAX_VALUE, 1, and -Double.MAX_VALUE and get 1 as result.
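A quick sanity check of that claim (hypothetical snippet, assuming the methods above are in scope):

public static void main(String[] args) {
    System.out.println(add3(Double.MAX_VALUE, 1d, -Double.MAX_VALUE)); // 1.0
    System.out.println(Double.MAX_VALUE + 1d - Double.MAX_VALUE);      // 0.0: naive left-to-right loses the 1
}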

Octave - how to operate with big numbers

I'm working on the RSA algorithm in Octave, but it isn't working properly. The problem appears when I try to use the "^" operator. Check my example below:
>> mod((80^65), 133)
terminal gives me:
ans = 0
I cannot fix this; it's funny because even my system calculator returns the correct number (54).
To calculate this correctly you can use a fast power-modulo (modular exponentiation) algorithm.
In C++, check the function below, which computes
a^b mod m:
int power_modulo_fast(int a, int b, int m)
{
    int i;
    long long result = 1;   // 64-bit intermediates so result * x cannot overflow for larger m
    long long x = a % m;
    for (i = 1; i <= b; i <<= 1)
    {
        x %= m;
        if ((b & i) != 0)    // this bit of the exponent is set
        {
            result *= x;
            result %= m;
        }
        x *= x;              // square for the next bit
    }
    return (int) result;
}
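As an aside (not part of the original answer): in Java the same computation is built in via BigInteger.modPow, which is handy for cross-checking results:

import java.math.BigInteger;

public class ModPowCheck {
    public static void main(String[] args) {
        BigInteger r = BigInteger.valueOf(80)
                .modPow(BigInteger.valueOf(65), BigInteger.valueOf(133));
        System.out.println(r);   // prints 54
    }
}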