Can you explain the bits I'm getting from unpack? - perl

I'm relatively inexperienced with Perl, but my question concerns the unpack function when getting the bits for a numeric value. For example:
my $bits = unpack("b*", 1);
print $bits;
This results in 10001100 being printed, which is 140 in decimal. In the reverse order it's 49 in decimal. Any other values I've tried seem to give the incorrect bits.
However, when I run $bits through pack, it produces 1 again. Is there something I'm missing here?
It seems that I jumped to conclusions when I thought my problem was solved. Maybe I should briefly explain what it is I'm trying do.
I need to convert an integer value that could be as big as 24 bits long (the point being that it could be bigger than one byte) into a bit string. This much can be accomplished using unpack and pack as suggested by #ikegami, but I also need to find a way to convert that bit string back into it's original integer (not a string representation of it).
As I mentioned, I'm relatively inexperienced with Perl, and I've been trying with no success.
I found what seems to be an optimal solution:
my $bits = sprintf("%032b", $num);
print "$bits\n";
my $orig = unpack("N", pack("B32", substr("0" x 32 . $bits, -32)));
print "$orig\n";

This might be obvious, but the other answers haven't pointed it out explicitly: The second argument in unpack("b*", 1) is being typecast to the string "1", which has an ASCII value of 31 in hex (with the most significant nibble first).
The corresponding binary would be 00110001, which is reversed to 10001100 in your output because you used "b*" instead of "B*". These correspond to the opposite "endian" forms of the binary representation. "Endian-ness" is just whether the most-significant bits go at the start or the end of the binary representation.

Yes, you're missing that different machines support different "endianness". And Perl is treating 1 like '1' so ( 0x31 ). So, you're seeing 1 -> 1000 (in ascending order) and 3 -> 1100.
"Wrong" depends on perspective and whether or not you gave Perl enough information to know what encoding and endianness you wanted.
From pack:
b A bit string (ascending bit order inside each byte, like vec()).
B A bit string (descending bit order inside each byte).
I think this is what you want:
unpack( 'B*', chr(1))

You're trying to convert an integer to binary and then back. While you can do that with pack and then unpack, the better way is to use sprintf or printf with the %b format:
my $int = 5;
my $bits = sprintf "%024b\n", $int;
print "$bits\n";
To go the other way (converting a string of 0s & 1s to an integer), the best way is to use the oct function with a 0b prefix:
my $orig = oct("0b$bits");
print "$orig\n";
As the others explained, unpack expects a string to unpack, so if you have an integer, you first have to pack it into a string. The %b format expects an integer to begin with.
If you need to do a lot of this on bytes, and speed is crucial, you could build a lookup table:
my #binary = map { sprintf '%08b', $_ } 0 .. 255;
print $binary[$int]; # Assuming $int is between 0 and 255

The ord(1) is 49. You must want something like sprintf("%064b", 1), although that does seem like overkill.

You didn't specify what you expect. I'm guessing you're expecting 00000001.
That's the correct bits for the byte you provided, at least on non-EBCDIC systems. Remember, the input of unpack is a string (mostly strings of bytes). Perhaps you wanted
unpack('b*', pack('C', 1))
Update: As others have pointed out, the above gives 10000000. For 00000001, you'd use
unpack('B*', pack('C', 1)) # 00000001

You want "B" instead of "b".
$ perl -E'say unpack "b*", "1"'
10001100
$ perl -E'say unpack "B*", "1"'
00110001
pack

Related

Why does perl's unpack() think that the second argument is a string?

And how do I fix it?
If I do the following:
print unpack("B8", 7) . "\n";
I get the following output:
00110111
The expected output is of course 00000111. I've checked, and it's giving me ascii "7", the string. I'm able to fix it poorly by wrapping the 7 in a chr():
print unpack("B8", chr(7)) . "\n";
Of course, this will only work if my input remains below 255, and I suspect it may go into the low thousands (I'll make the "B8" dynamic too).
I know I'm being obtuse, but I've read the docs on this and they make no mention of it. Its reverse function, pack(), seems to interpret the second argument correctly.
unpack unpacks a string of bytes into scalars with the values represented by those bytes.
$ perl -E'say for unpack("nB8", "\x12\x34\x56")'
4660
01010110
You're looking for
sprintf("%08B", 7)

Convert number to hex

I use sprintf for conversion to hex - example >>
$hex = sprintf("0x%x",$d)
But I was wondering, if there is some alternative way how to do it without sprintf.
My goal is convert a number to 4-byte hex code (e.g. 013f571f)
Additionally (and optionally), how can I do such conversion, if number is in 4 * %0xxxxxxx format, using just 7 bits per byte?
sprintf() is probably the most appropriate way. According to http://perldoc.perl.org/functions/hex.html:
To present something as hex, look into printf, sprintf, and unpack.
I'm not really sure about your second question, it sounds like unpack() would be useful there.
My goal is convert a number to 4-byte hex code (e.g. 013f571f)
Hex is a textual representation of a number. sprintf '%X' returns hex (the eight characters 013f571f). sprintf is specifically designed to format numbers into text, so it's a very elegant solution for that.
...But it's not what you want. You're not looking for hex, you're looking for the 4-byte internal storage of an integer. That has nothing to do with hex.
pack 'N', 0x013f571f; # "\x01\x3f\x57\x1f" Big-endian byte order
pack 'V', 0x013f571f; # "\x1f\x57\x3f\x01" Little-endian byte order
sprintf() is my usual way of performing this conversion. You can do it with unpack, but it will probably be more effort on your side.
For only working with 4 byte values, the following will work though (maybe not as elegant as expected!):
print unpack("H8", pack("N1", $d));
Be aware that this will result in 0xFFFFFFFF for numbers bigger than that as well.
For working pack/unpack with arbitrary bit length, check out http://www.perlmonks.org/?node_id=383881
The perlpacktut will be a handy read as well.
For 4 * %0xxxxxxx format, my non-sprintf solution is:
print unpack("H8", pack("N1",
(((($d>>21)&0x7f)<<24) + ((($d>>14)&0x7f)<<16) + ((($d>>7)&0x7f)<<8) + ($d&0x7f))));
Any comments and improvements are very welcome.

Convert bit vector to binary in Perl

I'm not sure of the best way to describe this.
Essentially I am attempting to write to a buffer which requires a certain protocol. The first two bytes I would like to are "10000001" and "11111110" (bit by bit). How can I write these two bytes to a file handle in Perl?
To convert spelled-out binary to actual bytes, you want the pack function with either B or b (depending on the order you have the bits in):
print FILE pack('B*', '1000000111111110');
However, if the bytes are constant, it's probably better to convert them to hex values and use the \x escape with a string literal:
print FILE "\x81\xFE";
How about
# open my $fh, ...
print $fh "\x81\xFE"; # 10000001 and 11111110
Since version 5.6.0 (released in March 2000), perl has supported binary literals as documented in perldata:
Numeric literals are specified in any of the following floating point or integer formats:
12345
12345.67
.23E-10 # a very small number
3.14_15_92 # a very important number
4_294_967_296 # underscore for legibility
0xff # hex
0xdead_beef # more hex
0377 # octal (only numbers, begins with 0)
0b011011 # binary
You are allowed to use underscores (underbars) in numeric literals between digits for legibility. You could, for example, group binary digits by threes (as for a Unix-style mode argument such as 0b110_100_100) or by fours (to represent nibbles, as in 0b1010_0110) or in other groups.
You may be tempted to write
print $fh 0b10000001, 0b11111110;
but the output would be
129254
because 10000001₂ = 129₁₀ and 11111110₂ = 254₁₀.
You want a specific representation of the literals’ values, namely as two unsigned bytes. For that, use pack with a template of "C2", i.e., octet times two. Adding underscores for readability and wrapping it in a convenient subroutine gives
sub write_marker {
my($fh) = #_;
print $fh pack "C2", 0b1000_0001, 0b1111_1110;
}
As a quick demo, consider
binmode STDOUT or die "$0: binmode: $!\n"; # we'll send binary data
write_marker *STDOUT;
When run as
$ ./marker-demo | od -t x1
the output is
0000000 81 fe
0000002
In case it’s unfamiliar, the od utility is used here for presentational purposes because the output contains a control character and Þ (Latin small thorn) in my system’s encoding.
The invocation above commands od to render in hexadecimal each byte from its input, which is the output of marker-demo. Note that 10000001₂ = 81₁₆ and 11111110₂ = FE₁₆. The numbers in the left-hand column are offsets: the special marker bytes start at offset zero (that is, immediately), and there are exactly two of them.

Difference between "52" and 52?

Guys perl is not as easy i thought its so confusing thing.I just moved to operators and I wrote some codes but I am unable to figure it out how the compiler treating them.
$in = "42" ;
$out = "56"+32+"good";
print $out,;
The output for above code is 88 and where does the good gone? and Now lets see the other one.
$in ="42";
$out="good52"+32;
print $out ;
and for these the output is 32. The question is where does the good gone that we just stored in $out and the value 52 between the " "why the compiler just printing the value as 32 but not that remaining text.And the other question is
$in=52;
$in="52";
both doing the same work "52" not working as a text . becuase when we add "52"+32 it gives as 84. what is happening and
$in = "hello";
$in = hello;
both do the same work ? or do they differ but if i print then give the same output.Its just eating up my brain.Its so confusing becuase when "52" or 52 and "hello" or hello doing the same job why did they introduce " ".I just need the explaination why its happening for above codes.
In Perl, + is a numeric operator. It tries to interpret its two operands as numbers. 51 is the number 51. "51" is a string containing two digits, and the + operator tries to convert the string to a number, which is 51, and uses it in the calculation. "hello" is a string containing five letters, and when the + operator tries to interpret that as a number, it equates to 0 (zero).
Your first example is thus:
$out = "56"+32+"good";
which is evaluates just like:
$out = 56 + 32 + 0;
Your print then converts that to a string on output, and yields 88.
In perl, the + operator will treat its arguments as numbers, and try to convert anything that is not a number to a number. The . (dot) operator is used to join strings: it will try to convert its operands to strings if they aren't already strings.
If you put:
use strict;
use warnings;
At the top of your script, you would get warnings such as:
Argument "good" isn't numeric in addition (+) at ...
Argument "good52" isn't numeric in addition (+) at ...
Perl automatically reassigns a string value to numeric, if possible. So "42" + 10 actually becomes 52. But it cannot do that with a proper string value, such as "good".
In perl, a string in a numerical context (like when you use a + operator) is converted to a number.
In perl, you can concatenate string using the . (dot) operator, not +.
If you use +, perl will try and interpret all of the operands as numbers. This works well for strings that are number representations, otherwise you get 0. This explains what you see.
$in=52;
$in="52";
both doing the same work "52" not working as a text . becuase when we add "52"+32 it gives as 84.
The problem here is not with the variable definition. One is a string and the other a number. But when you use the string in a numerical expression (+), then it will converted to number.
About your second question:
$in = "hello" defines a string, as you expect;
$in = hello; will just copy the symbol hello (however it is defined) on to your variable. This is actually not "strict" perl and if you set use strict; in your file, perl will complain about it.
First off, give this a read.
Your problem is that the + is a mathematical addition, which doesn't work on strings. If you use that, Perl will assume that you're working with numbers and therefore discard anything that isn't.
To concatenate strings, use .:
$str = "blah " . "blah " . "blah";
As far as the difference between "52" and 52 goes, there isn't one. Since nothing (commands, comments, etc.) in Perl can start with numbers, the compiler doesn't need the quotes to know what to do.

What's the simplest way of adding one to a binary string in Perl?

I have a variable that contains a 4 byte, network-order IPv4 address (this was created using pack and the integer representation). I have another variable, also a 4 byte network-order, subnet. I'm trying to add them together and add one to get the first IP in the subnet.
To get the ASCII representation, I can do inet_ntoa($ip&$netmask) to get the base address, but it's an error to do inet_ntoa((($ip&$netmask)+1); I get a message like:
Argument "\n\r&\0" isn't numeric in addition (+) at test.pm line 95.
So what's happening, the best as I can tell, is it's looking at the 4 bytes, and seeing that the 4 bytes don't represent a numeric string, and then refusing to add 1.
Another way of putting it: What I want it to do is add 1 to the least significant byte, which I know is the 4th byte? That is, I want to take the string \n\r&\0 and end up with the string \n\r&\1. What's the simplest way of doing that?
Is there a way to do this without having to unpack and re-pack the variable?
What's happening is that you make a byte string with $ip&$netmask, and then try to treat it as a number. This is not going to work, as such. What you have to feed to inet_ntoa is.
pack("N", unpack("N", $ip&$netmask) + 1)
I don't think there is a simpler way to do it.
Confusing integers and strings. Perhaps the following code will help:
use Socket;
$ip = pack("C4", 192,168,250,66); # why not inet_aton("192.168.250.66")
$netmask = pack("C4", 255,255,255,0);
$ipi = unpack("N", $ip);
$netmaski = unpack("N", $netmask);
$ip1 = pack("N", ($ipi&$netmaski)+1);
print inet_ntoa($ip1), "\n";
Which outputs:
192.168.250.1