Expression for setting lowest n bits that works even when n equals word size - perl

NB: the purpose of this question is to understand Perl's bitwise operators better. I know of ways to compute the number U described below.
Let $i be a nonnegative integer. I'm looking for a simple expression E<$i> [1] that will evaluate to the unsigned int U, whose $i lowest bits are all 1's, and whose remaining bits are all 0's. E.g. E<8> should be 255. In particular, if $i equals the machine's word size (W), E<$i> should equal ~0 [2].
The expressions (1 << $i) - 1 and ~(~0 << $i) both do the right thing, except when $i equals W, in which case they both take on the value 0, rather than ~0.
I'm looking for a way to do this that does not require computing W first.
EDIT: OK, I thought of an ugly, plodding solution
$i < 1 ? 0 : do { my $j = 1 << $i - 1; $j < $j << 1 ? ( $j << 1 ) - 1 : ~0 }
or
$i < 1 ? 0 : ( 1 << ( $i - 1 ) ) < ( 1 << $i ) ? ( 1 << $i ) - 1 : ~0
(Also impractical, of course.)
[1] I'm using the strange notation E<$i> as shorthand for "expression based on $i".
[2] I don't have a strong preference at the moment for what E<$i> should evaluate to when $i is strictly greater than W.

On systems where eval($Config{nv_overflows_integers_at}) >= 2**($Config{ptrsize}*8) (which excludes systems that use double-precision floats and 64-bit ints),
2**$i - 1
On all systems,
( int(2**$i) - 1 )|0
When $i < W, int converts the NV into an IV/UV, allowing the subtraction to work on systems where the precision of NVs is less than the size of UVs. |0 has no effect in this case.
When $i ≥ W, int has no effect, so the subtraction has no effect. |0 therefore overflows, in which case Perl returns the largest integer.
I don't know how reliable that |0 behaviour is. It could be compiler-specific. Don't use this!

use Config qw( %Config );
$i >= $Config{uvsize}*8 ? ~0 : ~(~0 << $i)
Technically, the word size is looked up, not computed.
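As a hedged sketch, the lookup above can be wrapped in a small helper. The name low_bits_mask is hypothetical, and treating $i greater than W the same as $i equal to W is an assumption, since the question leaves that case open.

```perl
use strict;
use warnings;
use Config qw( %Config );

# Hypothetical helper: returns the unsigned integer whose $i lowest bits are 1.
sub low_bits_mask {
    my ($i) = @_;
    my $w = 8 * $Config{uvsize};    # word size in bits, looked up from the build
    return ~0 if $i >= $w;          # assumption: saturate for $i >= W
    return ~( ~0 << $i );           # the shift count is now strictly below W
}

printf "%u\n", low_bits_mask($_) for 0, 8, 16;
```

The guard keeps every shift count strictly below W, which is the only case where both original expressions go wrong.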

Fun challenge!
use Devel::Peek qw[Dump];
for my $n (8, 16, 32, 64) {
Dump(~(((1 << ($n - 1)) << 1) - 1) ^ ~0);
}
Output:
SV = IV(0x7ff60b835508) at 0x7ff60b835518
REFCNT = 1
FLAGS = (PADTMP,IOK,pIOK)
IV = 255
SV = IV(0x7ff60b835508) at 0x7ff60b835518
REFCNT = 1
FLAGS = (PADTMP,IOK,pIOK)
IV = 65535
SV = IV(0x7ff60b835508) at 0x7ff60b835518
REFCNT = 1
FLAGS = (PADTMP,IOK,pIOK)
IV = 4294967295
SV = IV(0x7ff60b835508) at 0x7ff60b835518
REFCNT = 1
FLAGS = (PADTMP,IOK,pIOK,IsUV)
UV = 18446744073709551615
Perl compiled with:
ivtype='long', ivsize=8, nvtype='double', nvsize=8
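A variation on the same two-step idea, offered only as a sketch: build the mask for the lowest $i - 1 bits first, then shift it left once and set the lowest bit. The helper name low_bits_mask_nw is hypothetical, and the sketch assumes use integer is not in force, so that shifts operate on unsigned values.

```perl
use strict;
use warnings;

# Hypothetical helper: mask of the $i lowest bits, without looking up W.
# No shift count exceeds $i - 1, so none reaches W as long as $i <= W.
sub low_bits_mask_nw {
    my ($i) = @_;
    return 0 if $i < 1;
    return ( ( ( 1 << ( $i - 1 ) ) - 1 ) << 1 ) | 1;
}

printf "%u\n", low_bits_mask_nw($_) for 1, 8, 16;
```

For $i equal to W, the inner expression yields every bit except the top one; the final shift and | 1 then produce ~0.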

The documentation on the shift operators in perlop has an answer to your problem: use bigint;.
From the documentation:
Note that both << and >> in Perl are implemented directly using << and >> in C. If use integer (see Integer Arithmetic) is in force then signed C integers are used, else unsigned C integers are used. Either way, the implementation isn't going to generate results larger than the size of the integer type Perl was built with (32 bits or 64 bits).
The result of overflowing the range of the integers is undefined because it is undefined also in C. In other words, using 32-bit integers, 1 << 32 is undefined. Shifting by a negative number of bits is also undefined.
If you get tired of being subject to your platform's native integers, the use bigint pragma neatly sidesteps the issue altogether:
print 20 << 20; # 20971520
print 20 << 40; # 5120 on 32-bit machines,
# 21990232555520 on 64-bit machines
use bigint;
print 20 << 100; # 25353012004564588029934064107520
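A minimal sketch of how this applies to the original question, using Math::BigInt methods directly instead of the pragma. The helper name big_low_bits_mask is hypothetical. Note that the result is a Math::BigInt object rather than a native unsigned integer, so it answers a slightly different question from the one asked.

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: 2**$i - 1 as a Math::BigInt, for any non-negative $i.
sub big_low_bits_mask {
    my ($i) = @_;
    # blsft shifts left by $i bits; bdec subtracts one.
    return Math::BigInt->new(1)->blsft($i)->bdec();
}

print big_low_bits_mask(64)->bstr(), "\n";
```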

Related

How to isolate leftmost bytes in integer

This has to be done in Perl:
I have integers on the order of e.g. 30_146_890_129 and 17_181_116_691 and 21_478_705_663.
These are supposedly made up of 6 bytes, where:
bytes 0-1 : value a
bytes 2-3 : value b
bytes 4-5 : value c
I want to isolate what value a is. How can I do this in Perl?
I've tried using the >> operator:
perl -e '$a = 330971351478 >> 16; print "$a\n";'
5050222
perl -e '$a = 17181116691 >> 16; print "$a\n";'
262163
But these numbers are not on the order of what I am expecting, more like 0-1000.
Bonus if I can also get values b and c but I don't really need those.
Thanks!
number >> 16 returns the number shifted right by 16 bits, not the bits that were shifted out, as you seem to assume. To get the last 16 bits you can use, for example, number % 2**16 or number & 0xffff. To get b and c, shift first and then take the last 16 bits, i.e.
$a = $number & 0xffff;
$b = ($number >> 16) & 0xffff;
$c = ($number >> 32) & 0xffff;
If you have 6 bytes, you don't need to convert them to a number first. You can use one of the following, depending on the order of the bytes. (Uppercase represents the most significant byte.)
my ($num_c, $num_b, $num_a) = unpack('nnn', "\xCC\xcc\xBB\xbb\xAA\xaa");
my ($num_a, $num_b, $num_c) = unpack('nnn', "\xAA\xaa\xBB\xbb\xCC\xcc");
my ($num_c, $num_b, $num_a) = unpack('vvv', "\xcc\xCC\xbb\xBB\xaa\xAA");
my ($num_a, $num_b, $num_c) = unpack('vvv', "\xaa\xAA\xbb\xBB\xcc\xCC");
If you are indeed provided with a number (0xCCccBBbbAAaa), you can convert it to bytes and then extract the numbers you want from it as follows:
my ($num_c, $num_b, $num_a) = unpack('xxnnn', pack('Q>', $num));
Alternatively, you could also use an arithmetic approach like you attempted.
my $num_a = $num & 0xFFFF;
my $num_b = ( $num >> 16 ) & 0xFFFF;
my $num_c = $num >> 32;
While the previous two solutions required a Perl built to use 64-bit integers, the following will work with any build of Perl:
my $num_a = $num % 2**16;
my $num_b = ( $num / 2**16 ) % 2**16;
my $num_c = int( $num / 2**32 );
Let's look at ( $num >> 16 ) & 0xFFFF in detail.
Original number: 0x0000CCccBBbbAAaa
After shifting: 0x00000000CCccBBbb
After masking: 0x000000000000BBbb
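As a sketch, the shift-and-mask approach applied to two of the numbers from the question. The helper name split_fields is hypothetical, the code assumes a Perl built with 64-bit integers, and which field corresponds to "value a" depends on the byte order, which the question does not settle.

```perl
use strict;
use warnings;

# Hypothetical helper: split a 48-bit number into three 16-bit fields,
# least significant first. Assumes a Perl built with 64-bit integers.
sub split_fields {
    my ($num) = @_;
    return (
        $num & 0xFFFF,              # low 16 bits
        ( $num >> 16 ) & 0xFFFF,    # middle 16 bits
        ( $num >> 32 ) & 0xFFFF,    # high 16 bits
    );
}

print join( ' ', split_fields(30_146_890_129) ), "\n";    # 2449 1253 7
print join( ' ', split_fields(17_181_116_691) ), "\n";    # 2323 19 4
```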

How to convert 2 unsigned 16bit integers into a signed 32bit integer in Perl

Using Perl, I read data from a device that can only send it as unsigned 16-bit registers.
I have to take 2 registers and make a 32-bit signed integer out of them.
My challenge is to obtain a negative value from two non-negative ones.
Method 1:
my $int32 = unpack('l>', pack('nn', $hi16, $lo16));
Method 2:
my $int32 = ( $hi16 << 16 ) | $lo16;
$int32 -= 2**32 if $int32 >= 2**31;
For example,
$ perl -e'
use feature qw( say );
my $hi16 = 0xFFFF;
my $lo16 = 0xFFFD;
say $hi16;
say $lo16;
{
my $int32 = unpack("l>", pack("nn", $hi16, $lo16));
say $int32;
}
{
my $int32 = ( $hi16 << 16 ) | $lo16;
$int32 -= 2**32 if $int32 >= 2**31;
say $int32;
}
'
65535
65533
-3
-3

Maximum integer in Perl

Set $i=0 and do ++$i while it increases. Which number would we reach?
Note that it may be not the same as maximum integer in Perl (as asked in the title), because there may be gaps between adjacent integers which are greater than 1.
"Integer" can refer to a family of data types (int16_t, uint32_t, etc). There's no gap in the numbers these can represent.
"Integer" can also refer to numbers without a fractional component, regardless of the type of the variable used to store it. ++ will seamlessly transition between data types, so this is what's relevant to this question.
Floating point numbers can store integers in this sense, and it's possible to store very large numbers as floats without being able to add one to them. The reason for this is that floating point numbers are stored using the following form:
[+/-]1._____..._____ * 2**____
For example, let's say the mantissa of your floats can store 52 bits after the decimal, and you want to add 1 to 2**53.
__52 bits__
/ \
1.00000...00000 * 2**53 Large power of two
+ 1.00000...00000 * 2**0 1
--------------------------
1.00000...00000 * 2**53
+ 0.00000...000001 * 2**53 Normalized exponents
--------------------------
1.00000...00000 * 2**53
+ 0.00000...00000 * 2**53 What we really get due to limited number of bits
--------------------------
1.00000...00000 * 2**53 Original large power of two
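The lost increment can be observed directly. This is only a sketch: the helper name can_increment is hypothetical, and it uses 2**120 rather than 2**53 on the assumption that the value is far beyond the precision of the common float types, so the result does not depend on whether a given build keeps 2**53 as an exact integer.

```perl
use strict;
use warnings;

# Hypothetical helper: true if adding 1 to $x produces a different number.
sub can_increment {
    my ($x) = @_;
    return $x + 1 != $x;
}

print can_increment(10)     ? "changes\n" : "stuck\n";    # small integer
print can_increment(2**120) ? "changes\n" : "stuck\n";    # stored as a float
```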
So it is possible to hit a gap when using floating point numbers. However, you started with a number stored as signed integer.
$ perl -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
$i = 0;
$sv = svref_2object(\$i);
print $sv->FLAGS & SVf_NOK ? "NV\n" # Float
: $sv->FLAGS & SVf_IVisUV ? "UV\n" # Unsigned int
: "IV\n"; # Signed int
'
IV
++$i will leave the number as a signed integer value ("IV") until it cannot anymore. At that point, it will start using unsigned integer values ("UV").
$ perl -MConfig -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
$i = hex("7F".("FF"x($Config{ivsize}-2))."FD");
$sv = svref_2object(\$i);
for (1..4) {
++$i;
printf $sv->FLAGS & SVf_NOK ? "NV %.0f\n"
: $sv->FLAGS & SVf_IVisUV ? "UV %u\n"
: "IV %d\n", $i;
}
'
IV 2147483646
IV 2147483647 <-- 2**31 - 1 Largest IV
UV 2147483648
UV 2147483649
or
IV 9223372036854775806
IV 9223372036854775807 <-- 2**63 - 1 Largest IV
UV 9223372036854775808
UV 9223372036854775809
Still no gap because no floating point numbers have been used yet. But Perl will eventually use floating point numbers ("NV") because they have a far larger range than integers. ++$i will switch to using a floating point number when it runs out of unsigned integers.
When that happens depends on your build of Perl. Not all builds of Perl have the same integer and floating point number sizes.
On one machine:
$ perl -V:[in]vsize
ivsize='4'; # 32-bit integers
nvsize='8'; # 64-bit floats
On another:
$ perl -V:[in]vsize
ivsize='8'; # 64-bit integers
nvsize='8'; # 64-bit floats
On a system where nvsize is larger than ivsize
On these systems, the first gap will happen above the largest unsigned integer. If your system uses IEEE double-precision floats, your floats have 53 bits of precision. They can represent without loss all integers from -2**53 to 2**53 (inclusive). ++ will fail to increment beyond that.
$ perl -MConfig -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
$i = eval($Config{nv_overflows_integers_at}) - 3;
$sv = svref_2object(\$i);
for (1..4) {
++$i;
printf $sv->FLAGS & SVf_NOK ? "NV %.0f\n"
: $sv->FLAGS & SVf_IVisUV ? "UV %u\n"
: "IV %d\n", $i;
}
'
NV 9007199254740990
NV 9007199254740991
NV 9007199254740992 <-- 2**53 Requires 1 bit of precision as a float
NV 9007199254740992 <-- 2**53 + 1 Requires 54 bits of precision as a float
but only 53 are available.
On a system where nvsize is no larger than ivsize
On these systems, the first gap will happen before the largest unsigned integer. Switching to floating point numbers will allow you to go one further (a large power of two), but that's it. ++ will fail to increment beyond the largest unsigned integer + 1.
$ perl -MConfig -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
$i = hex(("FF"x($Config{ivsize}-1))."FD");
$sv = svref_2object(\$i);
for (1..4) {
++$i;
printf $sv->FLAGS & SVf_NOK ? "NV %.0f\n"
: $sv->FLAGS & SVf_IVisUV ? "UV %u\n"
: "IV %d\n", $i;
}
'
UV 18446744073709551614
UV 18446744073709551615 <-- 2**64 - 1 Largest UV
NV 18446744073709551616 <-- 2**64 Requires 1 bit of precision as a float
NV 18446744073709551616 <-- 2**64 + 1 Requires 65 bits of precision as a float
but only 53 are available.
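The two cases can be combined into a rough estimate of where ++ stops on a given build. This is a sketch under a stated assumption, not something the perl documentation promises: it takes the stall point to be the larger of the float limit and the largest unsigned integer plus one, as the two cases above suggest.

```perl
use strict;
use warnings;
use Config qw( %Config );

# Largest value up to which floats still represent every integer exactly.
my $nv_limit = eval $Config{nv_overflows_integers_at};

# Largest unsigned integer plus one, computed as a float.
my $uv_limit = 2**( 8 * $Config{uvsize} );

# Assumption: ++ stalls at whichever of the two limits is larger.
my $stall = $nv_limit > $uv_limit ? $nv_limit : $uv_limit;
printf "%.0f\n", $stall;
```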
This is on 32-bit perl,
perl -e "$x=2**53-5; printf qq{%.f\n}, ++$x for 1..10"
9007199254740988
9007199254740989
9007199254740990
9007199254740991
9007199254740992
9007199254740992
9007199254740992
9007199254740992
9007199254740992
9007199254740992
Well, on my 64-bit machine it's 18446744073709551615 (much easier as ~0), after which it increases one more time to 1.84467440737096e+19 and stops incrementing.

' << ' operator in verilog

I have some Verilog code that contains the following lines:
parameter ADDR_WIDTH = 8 ;
parameter RAM_DEPTH = 1 << ADDR_WIDTH;
What will be stored in RAM_DEPTH here, and what does the << operator do?
<< is a binary shift, shifting 1 to the left 8 places.
4'b0001 << 1 => 4'b0010
>> is a binary right shift adding 0's to the MSB.
>>> is a signed shift which maintains the value of the MSB if the left input is signed.
4'sb1011 >> 1 => 0101
4'sb1011 >>> 1 => 1101
Three ways to indicate left operand is signed:
module shift;
logic [3:0] test1 = 4'b1000;
logic signed [3:0] test2 = 4'b1000;
initial begin
$display("%b", $signed(test1) >>> 1 ); //Explicitly set as signed
$display("%b", test2 >>> 1 ); //Declared as signed type
$display("%b", 4'sb1000 >>> 1 ); //Signed constant
$finish;
end
endmodule
1 << ADDR_WIDTH means 1 will be shifted 8 bits to the left and will be assigned as the value for RAM_DEPTH.
In addition, 1 << ADDR_WIDTH also means 2^ADDR_WIDTH.
Given ADDR_WIDTH = 8, then 2^8 = 256 and that will be the value for RAM_DEPTH
<< is the left-shift operator, as it is in many other languages.
Here RAM_DEPTH will be 1 left-shifted by 8 bits, which is equivalent to 2^8, or 256.

How do I determine the maximum range for perl's range iterator?

I can exceed perl's range iteration bounds like so, with or without -Mbigint:
$» perl -E 'say $^V; say for (0..shift)' 1e19
v5.16.2
Range iterator outside integer range at -e line 1.
How can I determine this upper limit, without simply trying until I exceed it?
It's an IV.
>> similarly works on integers, so you can use
my $max_iv = -1 >> 1;
my $min_iv = -(-1 >> 1) - 1;
They can also be derived from the size of an IV.
my $max_iv = (1 << ($iv_bits-1)) - 1;
my $min_iv = -(1 << ($iv_bits-1));
The size of an IV can be obtained using
use Config qw( %Config );
my $iv_bits = 8 * $Config{ivsize};
or
my $iv_bits = 8 * length pack 'j', 0;
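Putting these pieces together, as a sketch only: the helper name in_iv_range is hypothetical, and the check relies on the statement above that the range iterator is limited to the IV range.

```perl
use strict;
use warnings;

my $iv_bits = 8 * length pack 'j', 0;    # size of an IV in bits
my $max_iv  = -1 >> 1;                   # every bit set except the sign bit
my $min_iv  = -( -1 >> 1 ) - 1;

# Hypothetical helper: true if $n lies within the IV range.
sub in_iv_range {
    my ($n) = @_;
    return $n >= $min_iv && $n <= $max_iv;
}

print "IV is $iv_bits bits: $min_iv .. $max_iv\n";
print in_iv_range(1e19) ? "usable\n" : "outside integer range\n";
```

Checking a bound with in_iv_range before building the range avoids triggering the runtime error.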