How to perform operations on specific bits in Perl

Let's assume I have a hexadecimal value 0x78. I need to add 1 to the lower 4 bits, i.e. [3:0], and add 2 to the upper 4 bits, i.e. [7:4]. Further, when I add 1 to 0xF it should not roll over to the next value and should stay at 0xF. The same applies for subtraction. The approach I have tried so far is:
$byte=0x78;
$byte2 = unpack('b4', $byte);
print "byte2 = $byte2 \n";
--> Here the output is 1000 so I have tried to extract the first 4 bits, and similarly I can right shift and extract last 4 bits and perform the operation.
But to perform addition or subtraction, I wanted to convert 1000 back to hex format so that I can do 0x8 +/- 1. For that I tried:
$hex2 = sprintf('%02x', $byte2);
print "hex2 = $hex2 \n";
--> Output is 3e8. I do not understand why I get 3e8 instead of just 8 or 08, since it is supposed to print only 2 digits in hex format.
In the above command when I manually enter
$hex2 = sprintf('%02x', 0b1000); I get the correct result. So perl is taking it as a string rather than a numeric value. Is there some way I can convert that string to a binary number? Any other easier method or approach would be helpful.

We can get each nibble (4-bit half of the byte) by ANDing and shifting:
$byte1 = $byte & 0xf;
$byte2 = ($byte & 0xf0) >> 4;
printf "byte1: 0x%x\n", $byte1;
printf "byte2: 0x%x\n", $byte2;
# prints
byte1: 0x8
byte2: 0x7
Addition/subtraction with the special conditions you listed can be done on these nibbles, and the new value can be reconstructed with shifts and addition:
($byte1 < 0xf) ? ($byte1 += 1) : ($byte1 = 0xf);
($byte2 < 0xe) ? ($byte2 += 2) : ($byte2 = 0xf);
# or do subtraction stuff.
$new_val = ($byte2 << 4) + $byte1;
printf "new val: 0x%x\n", $new_val;
# prints
new val: 0x99
You're getting '3e8' because $byte2 holds the string '1000', which sprintf treats as the decimal number 1000, and decimal 1000 is 0x3e8 in hex.
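The clamping logic above can be wrapped in one helper that handles both addition and subtraction. This is only a sketch: the name sat_add_nibbles and its argument order are made up for illustration, and it assumes the input is a single byte (0..255).

```perl
use strict;
use warnings;

# Hypothetical helper: add signed deltas to the low and high nibbles of a
# byte, clamping each nibble to 0x0..0xF instead of letting it wrap.
sub sat_add_nibbles {
    my ($byte, $lo_delta, $hi_delta) = @_;
    my $lo = ($byte & 0x0f) + $lo_delta;
    my $hi = (($byte >> 4) & 0x0f) + $hi_delta;
    for ($lo, $hi) {          # $_ aliases $lo, then $hi
        $_ = 0x0 if $_ < 0x0;
        $_ = 0xf if $_ > 0xf;
    }
    return ($hi << 4) | $lo;
}

printf "0x%02x\n", sat_add_nibbles(0x78,  1,  2);   # 0x99
printf "0x%02x\n", sat_add_nibbles(0xff,  1,  2);   # 0xff, no rollover
printf "0x%02x\n", sat_add_nibbles(0x10, -1, -2);   # 0x00, no underflow
```

Passing negative deltas covers the subtraction case without a second code path.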

I think you're better off with something like:
sub byte_to_two_nibbles($) {
my $byte = shift;
return int($byte / 16), ($byte % 16);
}
sub two_nibbles_to_byte($$) {
return $_[0] * 16 + $_[1];
}
my ($msn, $lsn) = byte_to_two_nibbles 0x78;
$lsn += 1; $lsn = 15 if $lsn > 15;   # low nibble, bits [3:0]
$msn += 2; $msn = 15 if $msn > 15;   # high nibble, bits [7:4]
my $result = two_nibbles_to_byte $msn, $lsn;

You can use oct function:
$byte2 = oct("0b$byte2");
my $hex2 = sprintf('%02x', $byte2);
print "hex2 = $hex2 \n";
Prints:
hex2 = 08
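It may also help to see why unpack('b4', $byte) printed 1000 in the first place. As far as I can tell this is a coincidence: unpack works on the string form of the number ("120"), so it reads bits from the character "1", not from 0x78. A small sketch of that, next to a mask-and-sprintf version that works on the number itself (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $byte = 0x78;

# unpack sees the string "120", so 'b4' returns the four lowest bits of
# the character "1" (0x31) in ascending order, not the low nibble of 0x78.
my $from_string = unpack('b4', $byte);

# Mask the number first, then format it as four binary digits.
my $low_bits  = sprintf('%04b', $byte & 0x0f);
my $high_bits = sprintf('%04b', ($byte >> 4) & 0x0f);

# oct() understands the 0b prefix and turns the digits back into a number.
my $low_value = oct("0b$low_bits");

print "unpack on the string: $from_string\n";
print "low nibble: $low_bits, high nibble: $high_bits\n";
printf "low nibble as hex: %02x\n", $low_value;
```

With a value such as 0x28 the two approaches disagree, which shows the unpack result is not the nibble.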

Related

In perl, how do I count bits in a bit vector which has bits set higher than 2_147_483_639?

Perl is pretty great at doing bit strings/vectors. Setting bits is as easy as
vec($bit_string, 123, 1) = 1;
Getting the count of set bits is lightning quick
$count = unpack("%32b*", $bit_string);
But if you set a bit above 2_147_483_639, your count will silently go to zero without any apparent warning or error.
Is there any way around this?
The following code demonstrates the problem
#!/usr/bin/env perl
# create a string to use as our bit vector
my $bit_string = undef;
# set bits at position 10 and 2_000_000_000
# and the apparently last valid integer position 2_147_483_639
vec($bit_string, 10, 1) = 1;
vec($bit_string, 2_000_000_000, 1) = 1;
vec($bit_string, 2_147_483_639, 1) = 1;
# get a count of the bits which are set
my $bit_count = unpack("%32b*", $bit_string);
print("Bits set in bit string: $bit_count\n");
## Bits set in bit string: 3
# check the bits at positions 10, 11, 2_000_000_000, 2_147_483_639
for my $position (10,11,2_000_000_000, 2_147_483_639) {
my $bit_value = vec($bit_string, $position, 1);
print("Bit at $position is $bit_value\n");
}
## Bit at 10 is 1
## Bit at 11 is 0
## Bit at 2000000000 is 1
## Bit at 2147483639 is 1
# Adding the next highest bit, 2_147_483_640, causes the count to become 0
# with no complaint, error or warning
vec($bit_string, 2_147_483_640, 1) = 1;
$bit_count = unpack("%32b*", $bit_string);
print("Bits set in bit string after setting bit 2_147_483_640: $bit_count\n");
## Bits set in bit string after setting bit 2_147_483_640: 0
# But the bits are still actually set
for my $position (10, 2_000_000_000, 2_147_483_639, 2_147_483_640) {
my $bit_value = vec($bit_string, $position, 1);
print("Bit at $position is $bit_value\n");
}
## Bit at 10 is 1
## Bit at 2000000000 is 1
## Bit at 2147483639 is 1
## Bit at 2147483640 is 1
# Set even higher bits
vec($bit_string, 3_000_000_000, 1) = 1;
vec($bit_string, 4_000_000_000, 1) = 1;
# verify these are also set
for my $position (3_000_000_000, 4_000_000_000) {
my $bit_value = vec($bit_string, $position, 1);
print("Bit at $position is $bit_value\n");
}
## Bit at 3000000000 is 1
## Bit at 4000000000 is 1
You can try counting by smaller pieces. It's slower, but it seems to work:
$bit_count = 0;
# /s is needed so that . also matches newline bytes (0x0A) in the bit string
$bit_count += unpack '%32b*', $1
while $bit_string =~ /(.{1,32766})/gs;
Or slightly faster using substr instead of m//:
$bit_count = 0;
my ($pos, $step) = (0, 2 ** 17);
$bit_count += unpack '%32b*', substr $bit_string, $step * $pos++, $step
while $pos * $step <= length $bit_string;
2 ** 17 seems to give the best performance on my machine, but YMMV.
Another possibility (slower, BTW) is to do a table of number of bits for any possible byte and use that:
my %by_bits;
for my $byte (1 ..255) {
my $bits_in_byte = sprintf('%b', $byte) =~ tr/1//; # Fix SO hiliting bug: /
$by_bits{$bits_in_byte} .= sprintf '\\x%02x', $byte;
}
$bit_count = 0;
for my $count (keys %by_bits) {
$bit_count += $count * eval('$bit_string =~ tr/' . $by_bits{$count}. '//');
}
Update:
It works correctly in recent Perl. See Another 32-bit residual in 64-bit perl 5.18.
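For older Perls, the substr approach can be packaged as a small function. This is a sketch only: count_set_bits is a made-up name, the default chunk size is the 2 ** 17 mentioned above, and the demo uses a small vector rather than one with bits above 2_147_483_639.

```perl
use strict;
use warnings;

# Hypothetical helper: count set bits by summing the checksum of fixed-size
# chunks, so that no single unpack call sees the whole string.
sub count_set_bits {
    my ($bit_string, $chunk_bytes) = @_;
    $chunk_bytes ||= 2 ** 17;
    my $count = 0;
    for (my $offset = 0; $offset < length($bit_string); $offset += $chunk_bytes) {
        $count += unpack '%32b*', substr($bit_string, $offset, $chunk_bytes);
    }
    return $count;
}

my $bit_string = '';
vec($bit_string, $_, 1) = 1 for 10, 2_000, 1_000_000;

print count_set_bits($bit_string), "\n";      # default chunk size
print count_set_bits($bit_string, 7), "\n";   # tiny chunks, same answer
```

Because it slices by byte offset instead of matching with a regex, it does not care which byte values appear in the string.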

Perl script to convert a binary number to a decimal number

I have to write a Perl script that converts a binary number, specified as an
argument, to a decimal number. In the question there's a hint to use the reverse function.
We have to assume that the binary number is in this format
EDIT: This is what I've progressed to (note this is code from my textbook that I've messed with):
#!/usr/bin/perl
# dec2.pl: Converts decimal number to binary
#
die("No arguments\n") if ( $#ARGV == -1 ) ;
foreach $number (@ARGV) {
$original_number = $number ;
until ($number == 0 ) {
$bit = $number % 2 ;
unshift (@bit_arr, $bit) ;
$number = int($number / 2 );
}
$binary_number = join ("", @bit_arr) ;
print reverse ("The decimal number of $binary_number is $original_number\n");
$#bit_arr = -1;
}
When executed:
>./binary.pl 8
The decimal number of 1000 is 8
I don't know how to word it to make the program know to add up all of the 1's in the number that is inputted.
You could just use sprintf to do the converting for you...
sprintf("%d", 0b010101); # Binary literal 0b010101 -> decimal 21
sprintf("%b", 21);       # Decimal 21 -> binary string "10101" (use "%06b" to keep the leading zero)
Of course, you can also just eval a binary string with 0b in front to indicate binary (oct("0b$binary_string") gives the same result without eval):
my $binary_string = '010101';
my $decimal = eval("0b$binary_string"); # 21
You don't have to use reverse, but it makes it easy to think about the problem with respect to exponents and array indices.
use strict;
use warnings;
my $str = '111110100';
my @bits = reverse(split(//, $str));
my $sum = 0;
for my $i (0 .. $#bits) {
next unless $bits[$i];
$sum += 2 ** $i;
}
First of all, you are supposed to convert from binary to decimal, not the other way around, which means you take an input like $binary = '1011001';.
The first thing you need to do is obtain the individual bits (a0, a1, etc) from that. We're talking about splitting the string into its individual digits.
for my $bit (split(//, $binary)) {
...
}
That should be a great starting point. With that, you have all that you need to apply the following refactoring of the formula you posted:
n = ( ( ( ... )*2 + a2 )*2 + a1 )*2 + a0
[I have no idea why reverse would be recommended. It's possible to use it, but it's suboptimal.]
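Filling in the loop body with that formula gives something like the following. It is a sketch, not the textbook's solution: bin_to_dec is a made-up name, and it assumes the input is a plain string of 0s and 1s with the most significant digit first.

```perl
use strict;
use warnings;

# Hypothetical helper: binary digit string -> number, reading the most
# significant digit first and doubling the running total at each step.
sub bin_to_dec {
    my ($binary) = @_;
    die "Not a binary number: $binary\n" unless $binary =~ /\A[01]+\z/;
    my $n = 0;
    for my $bit (split //, $binary) {
        $n = $n * 2 + $bit;    # n = ( ... )*2 + a_k
    }
    return $n;
}

print bin_to_dec($_), "\n" for qw(1000 1011001 111110100);   # 8, 89, 500
```

No reverse and no exponentiation are needed, because each doubling shifts everything seen so far one place to the left.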

Can you extend pack() to handle custom, variable length fields?

The Bitcoin protocol, in order to save space, encodes their integers using what they call variable length integers or varints. The first byte of the varint encodes its length and its interpretation:
First byte   Value
< 0xfd       treat the byte itself as an 8-bit integer
0xfd         next 2 bytes form a 16-bit integer
0xfe         next 4 bytes form a 32-bit integer
0xff         next 8 bytes form a 64-bit integer
(All ints are little endian and unsigned). I wrote the following function to unpack varints:
my $varint = "\xfd\x00\xff"; # \x00\xff in little endian == 65280
say unpack_varint($varint); # print 65280
sub unpack_varint{
my $v = shift;
my $first_byte = unpack "C", $v;
say $first_byte;
if ($first_byte < 253) { # \xfd == 253
return $first_byte;
}
elsif ($first_byte == 253){
return unpack "S<", substr $v, 1, 2;
}
elsif ($first_byte == 254){
return unpack "L<", substr $v, 1, 4;
}
elsif ($first_byte == 255){
return unpack "Q<", substr $v, 1, 8;
}
else{
die "error";
}
}
This works, but it's very inelegant: if I have a long byte string with embedded varints, I would have to read up to the beginning of the varint, pass the remainder to the function above, find out how long the encoded varint was, and so on. Is there a better way to write this? In particular, can I somehow extend pack() to support this kind of structure?
You can create a set of shift_$type functions that read and delete some value at the beginning of the given string, so your code becomes something like the following:
my $buffer = ...;
my $val1 = shift_varint($buffer);
my $val2 = shift_string($buffer);
my $val3 = shift_uint32($buffer);
...
You can also add a multirecord "shifter":
my ($val1, $val2, $val3) = shift_multi($buffer, qw(varint string uint32));
If you need more speed you could also write a compiler which can convert a set of types into an unpacker sub.
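As a rough idea of what one of those shifters could look like, here is a sketch of shift_varint. The error messages and the lookup table are my own choices, and the 'Q<' template assumes a Perl built with 64-bit integer support.

```perl
use strict;
use warnings;

# Sketch of shift_varint: decode one varint from the front of the buffer and
# remove the bytes it used. The caller's buffer is modified in place through
# the @_ alias, using 4-argument substr.
sub shift_varint {
    die "empty buffer\n" unless length $_[0];
    my $first = unpack 'C', substr($_[0], 0, 1, '');
    return $first if $first < 0xfd;

    # Prefix byte -> [payload length in bytes, little-endian unpack template]
    my %format = (
        253 => [2, 'v'],     # 0xfd
        254 => [4, 'V'],     # 0xfe
        255 => [8, 'Q<'],    # 0xff
    );
    my ($len, $template) = @{ $format{$first} };
    die "truncated varint\n" if length($_[0]) < $len;
    return unpack $template, substr($_[0], 0, $len, '');
}

my $buffer = "\x2a\xfd\x00\xff\xfe\x01\x00\x00\x00rest";
my @values = map { shift_varint($buffer) } 1 .. 3;
print "@values, left: $buffer\n";   # 42 65280 1, left: rest
```

The other shifters (shift_uint32, shift_string and so on) would follow the same pattern: unpack from the front, then cut off what was consumed.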

Perl: test for an arbitrary bit in a bit string

I'm trying to parse CPU node affinity and cache sibling info in Linux sysfs.
I can get a string of bits, just for example:
0000111100001111
Now I need a function where I have a decimal number (e.g. 4 or 5) and I need to test whether the nth bit is set or not. So it would return true for 4 and false for 5. I could create a string by shifting 1 n times, but I'm not sure about the syntax, and is there an easier way? Also, there's no limit on how long the string could be, so I want to avoid decimal <-> binary conversions.
Assuming that you have the string of bits "0000111100001111" in $str, if you do the precomputation step:
my $bit_vector = pack "b*", $str;
you can then use vec like so:
$is_set = vec $bit_vector, $offset, 1;
so for example, this code
for (0..15) {
print "$_\n" if vec $bit_vector, $_, 1;
}
will output
4
5
6
7
12
13
14
15
Note that the offsets are zero-based, so if you want the first bit to be bit 1, you'll need to add/subtract 1 yourself.
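The code above counts offsets from the left end of the string. If the rightmost character should be the least significant bit instead, which is how I read the question, one option is to reverse the string before packing. A sketch under that assumption:

```perl
use strict;
use warnings;

my $str = '0000111100001111';

# Reversing the string first makes vec offset 0 refer to the rightmost
# character, i.e. the least significant bit.
my $bit_vector = pack 'b*', scalar reverse($str);

my @set = grep { vec($bit_vector, $_, 1) } 0 .. length($str) - 1;
print "@set\n";   # 0 1 2 3 8 9 10 11
```

With 1-based numbering from the right, the nth bit is at offset n - 1, so the 4th bit (offset 3) is set and the 5th (offset 4) is not.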
Well, this seems to work, and I'm not going for efficiency:
sub is_bit_set
{
my $bitstring = shift;
my $bit = shift;
my $index = length($bitstring) - $bit - 1;
if (substr($bitstring, $index, 1) eq "1") {
return 1;
}
else {
return 0;
}
}
A simpler variant without a bit vector, though a vector would certainly be more efficient. Note that $bit is counted from 1 here, starting at the rightmost character:
sub is_bit_set
{
my $bitstring = shift;
my $bit = shift;
return int substr($bitstring, -$bit, 1);
}

How can I convert four characters into a 32-bit IEEE-754 float in Perl?

I have a project where a function receives four 8-bit characters and needs to convert the resulting 32-bit IEEE-754 float to a regular Perl number. It seems like there should be a faster way than the working code below, but I have not been able to figure out a simpler pack function that works.
It does not work, but it seems like it is close:
$float = unpack("f", pack("C4", @array[0..3])); # Fails for small numbers
Works:
@bits0 = split('', unpack("B8", pack("C", shift)));
@bits1 = split('', unpack("B8", pack("C", shift)));
@bits2 = split('', unpack("B8", pack("C", shift)));
@bits3 = split('', unpack("B8", pack("C", shift)));
push @bits, @bits3, @bits2, @bits1, @bits0;
$mantbit = shift(@bits);
$mantsign = $mantbit ? -1 : 1;
$exp = ord(pack("B8", join("",@bits[0..7])));
splice(@bits, 0, 8);
# Convert fractional float to decimal
for (my $i = 0; $i < 23; $i++) {
$f = $bits[$i] * 2 ** (-1 * ($i + 1));
$mant += $f;
}
$float = $mantsign * (1 + $mant) * (2 ** ($exp - 127));
Anyone have a better way?
I'd take the opposite approach: forget unpacking, stick to bit twiddling.
First, assemble your 32 bit word. Depending on endianness, this might have to be the other way around:
my $word = ($byte0 << 24) + ($byte1 << 16) + ($byte2 << 8) + $byte3;
Now extract the parts of the word: the sign bit, exponent and mantissa:
my $sign = ($word & 0x80000000) ? -1 : 1;
my $expo = (($word & 0x7F800000) >> 23) - 127;
my $mant = ($word & 0x007FFFFF | 0x00800000);
Assemble your float:
my $num = $sign * (2 ** $expo) * ( $mant / (1 << 23));
There's some examples on Wikipedia.
Tested this on 0xC2ED4000 => -118.625 and it works.
Tested this on 0x3E200000 => 0.15625 and found a bug! (fixed)
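Don't forget to handle infinities and NaNs, which have a raw exponent field of 255 (that is, $expo == 128 once the bias has been subtracted). Zero and denormals (raw exponent field 0) also need special handling, since they have no implicit leading 1 bit.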
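Putting the pieces together with those special cases might look roughly like this. It is a sketch: decode_float32 is a made-up name, it takes the already assembled 32-bit word, and it uses the common 9**9**9 idiom to produce infinity.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a 32-bit IEEE-754 bit pattern given as an integer.
sub decode_float32 {
    my ($word) = @_;
    my $sign    = ($word & 0x80000000) ? -1 : 1;
    my $raw_exp = ($word >> 23) & 0xff;     # biased exponent field
    my $frac    = $word & 0x007fffff;       # 23 stored fraction bits

    if ($raw_exp == 0xff) {                 # all ones: infinity or NaN
        my $inf = 9**9**9;
        return $frac ? $inf - $inf : $sign * $inf;
    }
    if ($raw_exp == 0) {                    # zero or denormal: no implicit 1
        return $sign * ($frac / 2**23) * 2**-126;
    }
    return $sign * (1 + $frac / 2**23) * 2**($raw_exp - 127);
}

print decode_float32($_), "\n" for 0xC2ED4000, 0x3E200000, 0x00000000;
```

The first three values print as -118.625, 0.15625 and 0; how infinity and NaN are printed depends on the platform.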
The best way to do this is to use pack().
my @bytes = ( 0xC2, 0xED, 0x40, 0x00 );
my $float = unpack 'f', pack 'C4', @bytes;
Or if the source and destination have different endianness:
my $float = unpack 'f', pack 'C4', reverse @bytes;
You say that this method "does not work - it seems like it is close" and "fails for small numbers", but you don't give an example. I'd guess that what you are actually seeing is rounding where, for example, a number is packed as 1.234, but it is unpacked as 1.23399996757507. That isn't a function of pack(), but of the precision of a 4-byte float.
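If the byte order of the data is known, the endianness modifiers on 'f' avoid having to guess what the host does. A sketch, assuming Perl 5.10 or later (where '<' and '>' are available) and a platform whose native float format is IEEE-754:

```perl
use strict;
use warnings;

my @bytes = (0xC2, 0xED, 0x40, 0x00);    # most significant byte first

# 'f>' reads a big-endian single-precision float regardless of the host.
my $from_be = unpack 'f>', pack 'C4', @bytes;

# The same bytes stored least significant byte first need 'f<'.
my $from_le = unpack 'f<', pack 'C4', reverse @bytes;

print "$from_be $from_le\n";   # -118.625 -118.625
```

Plain 'f' uses the machine's native order, so the same four bytes can decode to different numbers on different hosts, which may be part of what the question ran into.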