Maximum integer in Perl

Set $i=0 and do ++$i while it increases. Which number would we reach?
Note that this may not be the same as the maximum integer in Perl (as asked in the title), because there may be gaps greater than 1 between adjacent integers.

"Integer" can refer to a family of data types (int16_t, uint32_t, etc). There's no gap in the numbers these can represent.
"Integer" can also refer to numbers without a fractional component, regardless of the type of the variable used to store it. ++ will seamlessly transition between data types, so this is what's relevant to this question.
Floating point numbers can store integers in this sense, and it's possible to store very large numbers as floats without being able to add one to them. The reason for this is that floating point numbers are stored using the following form:
[+/-]1._____..._____ * 2**____
For example, let's say the mantissa of your floats can store 52 bits after the decimal, and you want to add 1 to 2**53.
      __52 bits__
     /           \
    1.00000...00000  * 2**53    Large power of two
  + 1.00000...00000  * 2**0     1
  --------------------------
    1.00000...00000  * 2**53
  + 0.00000...000001 * 2**53    Normalized exponents
  --------------------------
    1.00000...00000  * 2**53
  + 0.00000...00000  * 2**53    What we really get due to limited number of bits
  --------------------------
    1.00000...00000  * 2**53    Original large power of two
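The lost increment can be observed directly. The following is a minimal sketch, assuming the platform's C double is an IEEE double with a 53-bit significand; as_double is a hypothetical helper name. It round-trips each value through pack 'd' so that the result has double precision regardless of how this build of Perl stores integers.

```perl
use strict;
use warnings;

# Hypothetical helper: force a value through a native C double.
sub as_double {
    my ($n) = @_;
    return unpack 'd', pack 'd', $n;
}

printf "%.0f\n", as_double(2**53);       # 9007199254740992
printf "%.0f\n", as_double(2**53 + 1);   # 9007199254740992 again: the +1 is lost
```

The round trip is used instead of plain arithmetic because a Perl with 64-bit integers may keep 2**53 + 1 as an exact integer, which would hide the effect.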
So it is possible to hit a gap when using floating point numbers. However, you started with a number stored as a signed integer.
$ perl -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
$i = 0;
$sv = svref_2object(\$i);
print $sv->FLAGS & SVf_NOK ? "NV\n" # Float
: $sv->FLAGS & SVf_IVisUV ? "UV\n" # Unsigned int
: "IV\n"; # Signed int
'
IV
++$i will leave the number as a signed integer value ("IV") until it cannot anymore. At that point, it will start using unsigned integer values ("UV").
$ perl -MConfig -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
$i = hex("7F".("FF"x($Config{ivsize}-2))."FD");
$sv = svref_2object(\$i);
for (1..4) {
++$i;
printf $sv->FLAGS & SVf_NOK ? "NV %.0f\n"
: $sv->FLAGS & SVf_IVisUV ? "UV %u\n"
: "IV %d\n", $i;
}
'
IV 2147483646
IV 2147483647 <-- 2**31 - 1 Largest IV
UV 2147483648
UV 2147483649
or
IV 9223372036854775806
IV 9223372036854775807 <-- 2**63 - 1 Largest IV
UV 9223372036854775808
UV 9223372036854775809
Still no gap because no floating point numbers have been used yet. But Perl will eventually use floating point numbers ("NV") because they have a far larger range than integers. ++$i will switch to using a floating point number when it runs out of unsigned integers.
When that happens depends on your build of Perl. Not all builds of Perl have the same integer and floating point number sizes.
On one machine:
$ perl -V:[in]vsize
ivsize='4'; # 32-bit integers
nvsize='8'; # 64-bit floats
On another:
$ perl -V:[in]vsize
ivsize='8'; # 64-bit integers
nvsize='8'; # 64-bit floats
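The same values can be read from inside a script through the Config module. This is only a sketch of how one might inspect a build; the output depends entirely on the Perl that runs it.

```perl
use strict;
use warnings;
use Config qw( %Config );

# Sizes are reported in bytes, exactly as perl -V prints them.
printf "ivsize = %d bytes (%d-bit integers)\n", $Config{ivsize}, 8 * $Config{ivsize};
printf "nvsize = %d bytes\n", $Config{nvsize};
printf "largest UV = %s\n", ~0;
```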
On a system where nvsize is larger than ivsize
On these systems, the first gap will happen above the largest unsigned integer. If your system uses IEEE double-precision floats, your floats have 53 bits of precision. They can represent without loss all integers from -2**53 to 2**53 (inclusive). ++ will fail to increment beyond that.
$ perl -MConfig -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
$i = eval($Config{nv_overflows_integers_at}) - 3;
$sv = svref_2object(\$i);
for (1..4) {
++$i;
printf $sv->FLAGS & SVf_NOK ? "NV %.0f\n"
: $sv->FLAGS & SVf_IVisUV ? "UV %u\n"
: "IV %d\n", $i;
}
'
NV 9007199254740990
NV 9007199254740991
NV 9007199254740992 <-- 2**53 Requires 1 bit of precision as a float
NV 9007199254740992 <-- 2**53 + 1 Requires 54 bits of precision as a float
but only 53 are available.
On a system where nvsize is no larger than ivsize
On these systems, the first gap will happen before the largest unsigned integer. Switching to floating point numbers will allow you to go one further (a large power of two), but that's it. ++ will fail to increment beyond the largest unsigned integer + 1.
$ perl -MConfig -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
$i = hex(("FF"x($Config{ivsize}-1))."FD");
$sv = svref_2object(\$i);
for (1..4) {
++$i;
printf $sv->FLAGS & SVf_NOK ? "NV %.0f\n"
: $sv->FLAGS & SVf_IVisUV ? "UV %u\n"
: "IV %d\n", $i;
}
'
UV 18446744073709551614
UV 18446744073709551615 <-- 2**64 - 1 Largest UV
NV 18446744073709551616 <-- 2**64 Requires 1 bit of precision as a float
NV 18446744073709551616 <-- 2**64 + 1 Requires 65 bits of precision as a float
but only 53 are available.
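Putting the two cases together, the value at which ++ stalls appears to be the larger of "largest UV + 1" and the point where NVs stop representing every integer. The sketch below computes that without looping; increment_ceiling is a hypothetical helper, and the combination rule is my reading of the two cases above rather than something Perl documents.

```perl
use strict;
use warnings;
use Config qw( %Config );

# Hypothetical helper: where repeated ++ is expected to stop increasing.
sub increment_ceiling {
    my $uv_limit = ~0 + 1;                                  # overflows into an NV
    my $nv_limit = eval $Config{nv_overflows_integers_at};  # 2**53 for IEEE doubles
    return $uv_limit > $nv_limit ? $uv_limit : $nv_limit;
}

printf "%.0f\n", increment_ceiling();
```

With 64-bit integers and double-precision floats this should match the 2**64 shown above; with 32-bit integers it should match 2**53.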

This is on 32-bit perl,
perl -e "$x=2**53-5; printf qq{%.f\n}, ++$x for 1..10"
9007199254740988
9007199254740989
9007199254740990
9007199254740991
9007199254740992
9007199254740992
9007199254740992
9007199254740992
9007199254740992
9007199254740992

Well, on my 64-bit machine it's 18446744073709551615 (much easier as ~0), after which it increases one more time to 1.84467440737096e+19 and stops incrementing.

Related

How to isolate leftmost bytes in integer

This has to be done in Perl:
I have integers on the order of e.g. 30_146_890_129 and 17_181_116_691 and 21_478_705_663.
These are supposedly made up of 6 bytes, where:
bytes 0-1 : value a
bytes 2-3 : value b
bytes 4-5 : value c
I want to isolate what value a is. How can I do this in Perl?
I've tried using the >> operator:
perl -e '$a = 330971351478 >> 16; print "$a\n";'
5050222
perl -e '$a = 17181116691 >> 16; print "$a\n";'
262163
But these numbers are not on the order of what I am expecting, more like 0-1000.
Bonus if I can also get values b and c but I don't really need those.
Thanks!
number >> 16 returns the number shifted right by 16 bits, not the bits that were shifted out, as you seem to assume. To get the last 16 bits you might, for example, use number % 2**16 or number & 0xffff. To get b and c, you can just shift before taking the last 16 bits, i.e.
$a = $number & 0xffff;
$b = ($number >> 16) & 0xffff;
$c = ($number >> 32) & 0xffff;
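As a worked example, here is a minimal sketch that applies the same shift-and-mask steps to one of the numbers from the question. It assumes a Perl built with 64-bit integers, and split_fields is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: split a 48-bit value into three 16-bit fields,
# lowest field first. Assumes 64-bit integers.
sub split_fields {
    my ($number) = @_;
    return map { ( $number >> $_ ) & 0xffff } 0, 16, 32;
}

my ($low, $mid, $high) = split_fields(17_181_116_691);
print "$low $mid $high\n";    # 2323 19 4
```

Since 17181116691 = 4 * 2**32 + 19 * 2**16 + 2323, which of the three fields is "value a" depends on the byte order the data was written in.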
If you have 6 bytes, you don't need to convert them to a number first. You can use one of the following depending on the order of the bytes: (Uppercase represents the most significant byte.)
my ($num_c, $num_b, $num_a) = unpack('nnn', "\xCC\xcc\xBB\xbb\xAA\xaa");
my ($num_a, $num_b, $num_c) = unpack('nnn', "\xAA\xaa\xBB\xbb\xCC\xcc");
my ($num_c, $num_b, $num_a) = unpack('vvv', "\xcc\xCC\xbb\xBB\xaa\xAA");
my ($num_a, $num_b, $num_c) = unpack('vvv', "\xaa\xAA\xbb\xBB\xcc\xCC");
If you are indeed provided with a number (0xCCccBBbbAAaa), you can convert it to bytes then extract the numbers you want from it as follows:
my ($num_c, $num_b, $num_a) = unpack('xxnnn', pack('Q>', $num));
Alternatively, you could also use an arithmetic approach like you attempted.
my $num_a = $num & 0xFFFF;
my $num_b = ( $num >> 16 ) & 0xFFFF;
my $num_c = $num >> 32;
While the previous two solutions required a Perl built to use 64-bit integers, the following will work with any build of Perl:
my $num_a = $num % 2**16;
my $num_b = ( $num / 2**16 ) % 2**16;
my $num_c = int( $num / 2**32 );
Let's look at ( $num >> 16 ) & 0xFFFF in detail.
Original number: 0x0000CCccBBbbAAaa
After shifting: 0x00000000CCccBBbb
After masking: 0x000000000000BBbb

Expression for setting lowest n bits that works even when n equals word size

NB: the purpose of this question is to understand Perl's bitwise operators better. I know of ways to compute the number U described below.
Let $i be a nonnegative integer. I'm looking for a simple expression E<$i> [1] that will evaluate to the unsigned int U, whose $i lowest bits are all 1's, and whose remaining bits are all 0's. E.g. E<8> should be 255. In particular, if $i equals the machine's word size (W), E<$i> should equal ~0 [2].
The expressions (1 << $i) - 1 and ~(~0 << $i) both do the right thing, except when $i equals W, in which case they both take on the value 0, rather than ~0.
I'm looking for a way to do this that does not require computing W first.
EDIT: OK, I thought of an ugly, plodding solution
$i < 1 ? 0 : do { my $j = 1 << $i - 1; $j < $j << 1 ? ( $j << 1 ) - 1 : ~0 }
or
$i < 1 ? 0 : ( 1 << ( $i - 1 ) ) < ( 1 << $i ) ? ( 1 << $i ) - 1 : ~0
(Also impractical, of course.)
[1] I'm using the strange notation E<$i> as shorthand for "expression based on $i".
[2] I don't have a strong preference at the moment for what E<$i> should evaluate to when $i is strictly greater than W.
On systems where eval($Config{nv_overflows_integers_at}) >= 2**($Config{ptrsize}*8) (which excludes those that use double-precision floats and 64-bit ints),
2**$i - 1
On all systems,
( int(2**$i) - 1 )|0
When i<W, int will convert the NV into an IV/UV, allowing the subtraction to work on systems where the precision of NVs is less than the size of UVs. |0 has no effect in this case.
When i≥W, int has no effect, so the subtraction has no effect. |0 therefore overflows, in which case Perl returns the largest integer.
I don't know how reliable that |0 behaviour is. It could be compiler-specific. Don't use this!
use Config qw( %Config );
$i >= $Config{uvsize}*8 ? ~0 : ~(~0 << $i)
Technically, the word size is looked up, not computed.
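Wrapped in a function, the Config-based expression can be checked at the edge cases. This is a minimal sketch; low_bits_mask is a hypothetical helper name.

```perl
use strict;
use warnings;
use Config qw( %Config );

# Hypothetical helper: the lowest $i bits set, with $i >= word size
# handled explicitly instead of relying on an oversized shift.
sub low_bits_mask {
    my ($i) = @_;
    my $word_bits = 8 * $Config{uvsize};
    return $i >= $word_bits ? ~0 : ~( ~0 << $i );
}

for my $i ( 0, 8, 8 * $Config{uvsize} ) {
    printf "%2d -> %s\n", $i, low_bits_mask($i);
}
```

The explicit comparison keeps the shift count strictly below the word size, so the result does not depend on how a particular build treats larger shifts.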
Fun challenge!
use Devel::Peek qw[Dump];
for my $n (8, 16, 32, 64) {
Dump(~(((1 << ($n - 1)) << 1) - 1) ^ ~0);
}
Output:
SV = IV(0x7ff60b835508) at 0x7ff60b835518
REFCNT = 1
FLAGS = (PADTMP,IOK,pIOK)
IV = 255
SV = IV(0x7ff60b835508) at 0x7ff60b835518
REFCNT = 1
FLAGS = (PADTMP,IOK,pIOK)
IV = 65535
SV = IV(0x7ff60b835508) at 0x7ff60b835518
REFCNT = 1
FLAGS = (PADTMP,IOK,pIOK)
IV = 4294967295
SV = IV(0x7ff60b835508) at 0x7ff60b835518
REFCNT = 1
FLAGS = (PADTMP,IOK,pIOK,IsUV)
UV = 18446744073709551615
Perl compiled with:
ivtype='long', ivsize=8, nvtype='double', nvsize=8
The documentation on the shift operators in perlop has an answer to your problem: use bigint;.
From the documentation:
Note that both << and >> in Perl are implemented directly using << and >> in C. If use integer (see Integer Arithmetic) is in force then signed C integers are used, else unsigned C integers are used. Either way, the implementation isn't going to generate results larger than the size of the integer type Perl was built with (32 bits or 64 bits).
The result of overflowing the range of the integers is undefined because it is undefined also in C. In other words, using 32-bit integers, 1 << 32 is undefined. Shifting by a negative number of bits is also undefined.
If you get tired of being subject to your platform's native integers, the use bigint pragma neatly sidesteps the issue altogether:
print 20 << 20; # 20971520
print 20 << 40; # 5120 on 32-bit machines,
# 21990232555520 on 64-bit machines
use bigint;
print 20 << 100; # 25353012004564588029934064107520
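Applied to the mask in the question, the arbitrary-precision route might look like the sketch below. It uses the Math::BigInt class directly rather than the bigint pragma, and low_bits_mask_big is a hypothetical helper name.

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: 2**$i - 1 with arbitrary precision, so the
# native word size never comes into play.
sub low_bits_mask_big {
    my ($i) = @_;
    return Math::BigInt->new(2)->bpow($i)->bdec;
}

for my $i ( 8, 64 ) {
    my $mask = low_bits_mask_big($i);
    print "$i -> $mask\n";    # 8 -> 255, then 64 -> 18446744073709551615
}
```

Note that the result is a Math::BigInt object rather than a native unsigned integer, so it is not a drop-in replacement where a plain UV is required.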

How do I determine the maximum range for perl's range iterator?

I can exceed perl's range iteration bounds like so, with or without -Mbigint:
$» perl -E 'say $^V; say for (0..shift)' 1e19
v5.16.2
Range iterator outside integer range at -e line 1.
How can I determine this upper limit, without simply trying until I exceed it?
It's an IV.
>> similarly works on integers, so you can use
my $max_iv = -1 >> 1;
my $min_iv = -(-1 >> 1) - 1;
They can also be derived from the size of an IV.
my $max_iv = (1 << ($iv_bits-1)) - 1;
my $min_iv = -(1 << ($iv_bits-1));
The size of an IV can be obtained using
use Config qw( %Config );
my $iv_bits = 8 * $Config{ivsize};
or
my $iv_bits = 8 * length pack 'j', 0;
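As a quick cross-check, the two derivations can be compared in one script. This is only a sketch; the variable names are illustrative and the printed values depend on the build.

```perl
use strict;
use warnings;
use Config qw( %Config );

my $iv_bits = 8 * $Config{ivsize};

my $max_by_shift = -1 >> 1;                        # all bits set, shifted right once
my $max_by_size  = ( 1 << ( $iv_bits - 1 ) ) - 1;  # derived from the IV size
my $min_by_size  = -$max_by_size - 1;

print "max IV: $max_by_shift\n";
print "min IV: $min_by_size\n";
print "derivations ", ( $max_by_shift == $max_by_size ? "agree" : "differ" ), "\n";
```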

perl inconsistent negative zero result

I have the following code :
my $m=0;
my $e =0 ;
my $g=0;
my $x= sprintf( "%0.1f", (0.6*$m+ 0.7 * $e-1.5)*$g);
print $x;
When I run the script, the result is -0.0 and not 0.0. Could someone explain why, and how I can change it to be 0.0?
First, this has nothing to do with Perl. It's your processor that's returning -0.0. You'll see this same behaviour in other languages.
You ask why, presumably asking why this is useful. Honestly, I don't know. Some scientists and engineers probably take advantage of it.
+0.0 would indicate "zero or something very slightly larger on the positive side".
-0.0 would indicate "zero or something very slightly larger on the negative side."
You also ask how to get rid of the sign.
Negative zero is false, so $x || 0 does the trick.
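Applied to the code in the question, that might look like the following sketch; format_tenths is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: format to one decimal place, mapping -0.0 to 0.
sub format_tenths {
    my ($value) = @_;
    return sprintf "%0.1f", $value || 0;   # -0.0 is false, so || yields 0
}

my ( $m, $e, $g ) = ( 0, 0, 0 );
my $x = format_tenths( ( 0.6 * $m + 0.7 * $e - 1.5 ) * $g );
my $y = format_tenths(-2.5);
print "$x\n";    # 0.0
print "$y\n";    # -2.5
```

Nonzero values pass through unchanged, so only the sign of a zero result is affected.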
You've run into something very strange. My first thought was that you were seeing some very small negative number that sprintf rounds to -0.0, but in fact the result of the expression is an actual negative zero.
Here's a simpler program that exhibits the same issue:
#!/usr/bin/perl
use strict;
use warnings;
my $x = -1.0 * 0.0;
my $y = -1.5 * 0.0;
printf "x = %f\n", $x;
printf "y = %f\n", $y;
and the output:
x = 0.000000
y = -0.000000
My best guess is that -1.0 * 0.0 is being computed at compile time, but -1.5 * 0.0 is being computed at execution time, and the computations are yielding different results. EDIT: Strike that; a modified version of the program that replaces all the constants with function calls has the same behavior.
I can avoid the negative zero display by adding these lines before the printf calls:
$x += 0.0;
$y += 0.0;
but that's ugly.
(Incidentally, I get the same results with the "bleading-edge" version of Perl 5.15.2 from about a month ago.)
A similar C program prints -0.000000 for both x and y.
EDIT: Further experiment shows that multiplying a negative integral value by 0.0 yields 0.0, but multiplying a negative non-integral value by 0.0 yields -0.0. I've submitted a Perl bug report.
Nothing to see here, move along...
Zero is represented by the exponent emin - 1 and a zero significand.
Since the sign bit can take on two different values, there are two
zeros, +0 and -0.
http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html
Data::Float has some useful information as well as routines to check if a floating point value is zero.
The short answer is, when dealing with floating point, you cannot assume algebraic identities will be preserved.
use strict;
use warnings;
use Data::Float qw(float_is_zero);
my $m = 0;
my $e = 0;
my $g = 0;
my $result = (0.6 * $m + 0.7 * $e - 1.5) * $g;
$result = 0.0 if float_is_zero($result);
my $x = sprintf( "%0.1f", $result);
print $x;
This doesn't address the post directly, but it does address the "odd" behavior that exists in perl.
(I believe) This issue is caused because perl is converting the numbers to integers and then using INTEGER/ALU math instead of FP/FPU math. However, there is no -0 integer [in two's complement] -- only an -0 integral which is really a floating point value -- so the floating point value -0.0 is converted to the integer 0 before the multiplication :-)
Here is my "demonstration":
printf "%.f\n", 2.0 * -0.0;
printf "%.f\n", 1.5 * -0.0;
printf "%.f\n", 1.0 * -0.0;
printf "%.f\n", 1e8 * -0.0;
printf "%.f\n", 1e42 * -0.0;
And my "result/reasoning" is:
0 # 2.0 -> 2 and -0.0 -> 0: INTEGER math
-0 # 1.5 is not an integral: FP math, no conversions
0 # 1.0 -> 1 and -0.0 -> 0: INTEGER math
0 # 1e8 -> 100000000 and -0.0 -> 0: INTEGER math
-0 # 1e42 is an integral OUTSIDE the range of integers: FP math, no conversions
Happy musings.
Python does not exhibit these quirks because it has strongly-typed numbers: it will not convert an integral floating point value to an integer prior to a math operation. (Python will still perform standard type-widening.) Try divide by 0.0 (FP, not INTEGER math!) in perl ;-)
Very strange. I note that the problem disappears if you replace 1.5 with a negative integer:
$ perl -e '
my @a=(-9.0, -3.0, -2.0, -1.5, -1.2, -1.0, -0.8, -0.5);
for my $a (@a) {
$bin = join("", map {sprintf("%02x", ord($_))} split(//, pack("d>", $a*0)));
printf("%4.1f * 0 = %4.1f %s\n", $a, $a*0, $bin);
}'
-9.0 * 0 = 0.0 0000000000000000
-3.0 * 0 = 0.0 0000000000000000
-2.0 * 0 = 0.0 0000000000000000
-1.5 * 0 = -0.0 8000000000000000
-1.2 * 0 = -0.0 8000000000000000
-1.0 * 0 = 0.0 0000000000000000
-0.8 * 0 = -0.0 8000000000000000
-0.5 * 0 = -0.0 8000000000000000
All I can think of is to treat -0.0 as a special case:
my $ans = (0.6*$m+ 0.7 * $e-1.5)*$g;
my $x= sprintf("%0.1f", $ans == -0.0 ? 0.0 : $ans)
(EDIT: This was a dumb suggestion, since -0.0 == 0.0.)
I also checked Python's behaviour, which consistently retains the sign, which suggests that the negative sign is not really a bug in Perl, just a little strange (though I'd say that treating integers and non-integers differently is a bug):
$ python -c '
for a in [-9.0, -3.0, -2.0, -1.5, -1.2, -1.0, -0.8, -0.5]:
print "%0.1f" % (a*0,)
'
-0.0
-0.0
-0.0
-0.0
-0.0
-0.0
-0.0
-0.0
Answer: use the absolute value function, abs()
Code
printf "%f\n", -0.0;
printf "%f\n", abs(-0.0);
Perl 5.10.1
-0.000000
0.000000
Perl 5.12.1
-0.000000
0.000000
Perl 6 (rakudo-2010.08)
0.000000
0.000000
IEEE 754 standard
abs(x) copies a floating-point operand x to a destination in the same
format, setting the sign bit to 0 (positive).
EDIT (addressing Justin's feedback):
my $result = possible_negative_zero();
$result = abs($result) if $result == 0.0; # because -0.0 == 0.0
printf "%f\n", $result;

Perl function for negative integers using the 2's complement

I am trying to convert AD maxpwdAge (a 64-bit integer) into a number of days.
According to Microsoft:
Uses the IADs interface's Get method to retrieve the value of the domain's maxPwdAge attribute (line 5).
Notice we use the Set keyword in VBScript to initialize the variable named objMaxPwdAge—the variable used to store the value returned by Get. Why is that?
When you fetch a 64-bit large integer, ADSI does not return one giant scalar value. Instead, ADSI automatically returns an IADsLargeInteger object. You use the IADsLargeInteger interface's HighPart and LowPart properties to calculate the large integer's value. As you may have guessed, HighPart gets the high order 32 bits, and LowPart gets the low order 32 bits. You use the following formula to convert HighPart and LowPart to the large integer's value.
The existing code in VBScript from the same page:
Const ONE_HUNDRED_NANOSECOND = .000000100 ' .000000100 is equal to 10^-7
Const SECONDS_IN_DAY = 86400
Set objDomain = GetObject("LDAP://DC=fabrikam,DC=com") ' LINE 4
Set objMaxPwdAge = objDomain.Get("maxPwdAge") ' LINE 5
If objMaxPwdAge.LowPart = 0 Then
WScript.Echo "The Maximum Password Age is set to 0 in the " & _
"domain. Therefore, the password does not expire."
WScript.Quit
Else
dblMaxPwdNano = Abs(objMaxPwdAge.HighPart * 2^32 + objMaxPwdAge.LowPart)
dblMaxPwdSecs = dblMaxPwdNano * ONE_HUNDRED_NANOSECOND ' LINE 13
dblMaxPwdDays = Int(dblMaxPwdSecs / SECONDS_IN_DAY) ' LINE 14
WScript.Echo "Maximum password age: " & dblMaxPwdDays & " days"
End If
How can I do this in Perl?
Endianness may come into this, but you may be able to say
#!/usr/bin/perl
use strict;
use warnings;
my $num = -37_108_517_437_440;
my $binary = sprintf "%064b", $num;
my ($high, $low) = $binary =~ /(.{32})(.{32})/;
$high = oct "0b$high";
$low = oct "0b$low";
my $together = unpack "q", pack "LL", $low, $high;
print "num $num, low $low, high $high, together $together\n";
Am I missing something? As far as I can tell from your question, your problem has nothing at all to do with 2’s complement. As far as I can tell, all you need/want to do is
use Math::BigInt;
use constant MAXPWDAGE_UNIT_PER_SEC => (
1000 # milliseconds
* 1000 # microseconds
* 10 # 100 nanoseconds
);
use constant SECS_PER_DAY => (
24 # hours
* 60 # minutes
* 60 # seconds
);
my $maxpwdage_full = ( Math::BigInt->new( $maxpwdage_highpart ) << 32 ) + $maxpwdage_lowpart;
my $days = $maxpwdage_full / MAXPWDAGE_UNIT_PER_SEC / SECS_PER_DAY;
Note that I deliberately use 2 separate constants, and I divide by them in sequence, because that keeps the divisors smaller than the range of a 32-bit integer. If you want to write this another way and you want it to work correctly on 32-bit perls, you’ll have to keep all the precision issues in mind.
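For completeness, here is a sketch that goes from the two 32-bit parts to whole days, mirroring the Abs() and Int() steps of the VBScript. maxpwdage_days is a hypothetical helper, and it assumes HighPart arrives as a signed 32-bit value while LowPart may arrive either signed or unsigned.

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: combine HighPart and LowPart into whole days.
sub maxpwdage_days {
    my ( $high, $low ) = @_;
    $low += 4_294_967_296 if $low < 0;    # reinterpret a signed LowPart as unsigned
    my $age = Math::BigInt->new($high);
    $age->bmul('4294967296');             # HighPart * 2**32
    $age->badd($low);
    $age->babs;                           # mirrors Abs() in the VBScript
    $age->bdiv(10_000_000);               # 100-nanosecond units -> seconds
    $age->bdiv(86_400);                   # seconds -> whole days
    return $age->numify;
}

my $days = maxpwdage_days( -8640, 0 );
print "$days\n";    # 42
```

The inputs -8640 and 0 correspond to the example value -37_108_517_437_440 used earlier, which equals -8640 * 2**32.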