I'm trying to write a script to find hex strings in a text file and convert them to their reverse byte order. The trouble I'm having is that some of the hex strings are 16 bit and some are 64 bits. I've used Perl's pack to pack and unpack the 16 bit hex numbers and that works fine, but the 64 bit does not.
print unpack("H*", (pack('I!', 0x20202032))). "\n"; #This works, gives 32202020
#This does not
print unpack("H*", (pack('I!', 0x4f423230313430343239303030636334))). "\n";
I've tried the second with the q and Q (where I get ffffffffffffffff). Am I approaching this all wrong?
As a bit of background, I've got a multi-gigabyte pipe-delimited text file that has hex strings in reverse byte order as explained above. Also, the columns of the file are not standard; sometimes the hex strings appear in one column, and sometimes in another. I need to convert the hex strings to their reverse byte order.
Always use warnings; in your scripts. If you do, you'll get the following messages:
Integer overflow in hexadecimal number at scratch.pl line 8.
Hexadecimal number > 0xffffffff non-portable at scratch.pl line 8.
These can be resolved by use bigint; and by changing your second number declaration to hex('0x4f423230313430343239303030636334').
However, that number is still too large for pack 'I' to be able to handle.
Perhaps this can be done using simple string manipulation:
use strict;
use warnings;
my @nums = qw(
0x20202032
0x4f423230313430343239303030636334
);
for (@nums) {
my $rev = join '', reverse m/([[:xdigit:]]{2})/g;
print "$_ -> 0x$rev\n"
}
__END__
Outputs:
0x20202032 -> 0x32202020
0x4f423230313430343239303030636334 -> 0x3463633030303932343034313032424f
Or to handle hex strings with an odd number of digits:
my $rev = $_;
$rev =~ s{0x\K([[:xdigit:]]*)}{
my $hex = $1;
$hex = "0$hex" if length($hex) % 2;
join '', reverse $hex =~ m/(..)/g;
}e;
print "$_ -> $rev\n"
To be pedantic, the hex numbers in your example are 32 and 128 bits long, not 16 and 64. If the longest one were only 64 bits long, you could successfully use the Q pack template as you supposed (provided that your perl has been compiled to support 64-bit integers).
The pack/unpack solution can still be used, with the addition of a reverse. You also have to remove the leading 0x from the hex strings first (or trim the last two characters from the results):
print unpack "H*", reverse pack "H*", $hex_string;
Example with your values:
perl -le 'print unpack "H*", reverse pack "H*", "4f423230313430343239303030636334"'
3463633030303932343034313032424f
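For the pipe-delimited file described in the question, the same pack/unpack trick can be applied field by field. This is only a sketch: the sample line is made up, reverse_hex_bytes is a helper name introduced here, and it assumes that every hex field carries a leading 0x and fills its whole column, which the question does not confirm.

```perl
use strict;
use warnings;

# Reverse the byte order of a hex string given without the 0x prefix.
# An odd number of digits is left-padded with a zero first.
sub reverse_hex_bytes {
    my ($hex) = @_;
    $hex = "0$hex" if length($hex) % 2;
    return unpack 'H*', reverse pack 'H*', $hex;
}

# Hypothetical sample line; the real column layout is unknown.
my $line = 'abc|0x20202032|42|0x4f423230313430343239303030636334';

# Assumption: a hex field is marked by a leading 0x and fills the column.
# The -1 limit keeps trailing empty columns intact.
my $converted = join '|',
    map { /^0x([[:xdigit:]]+)$/ ? '0x' . reverse_hex_bytes($1) : $_ }
    split /\|/, $line, -1;

print "$converted\n";
```

For a multi-gigabyte file you would run the same map inside a while (my $line = <$fh>) loop so that only one line is held in memory at a time.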
I am new to Perl and I have difficulties using the different types.
I am trying to get a hexadecimal register, transform it to binary, use it as a string and get substrings from the binary string.
I have done a few searches and what I tried is :
my $hex = 0xFA1F;
print "$hex\n";
result was "64031" . First surprise : can't I print the hex value in Perl and not just the decimal value ?
$hex = hex($hex);
print "$hex\n";
Result was 409649. Second surprise : I would expect the result to be also 64031 since "hex" converts hexadecimal to decimal.
my $bin = printf("%b", $hex);
It prints the binary value. Is there a way to transform the hex to bin without printing it ?
Thanks,
SLP
Decimal, binary, and hexadecimal are all text representations of a number (i.e. ways of writing a number). Computers can't deal with these as numbers.
my $num = 0xFA1F; stores the specified number (sixty-four thousand and thirty-one) into $num. It's stored in a format the hardware understands, but that's not very important. What's important is that it's stored as a number, not text.
When print is asked to print a number, it prints it out in decimal (or scientific notation if large/small enough). It has no idea how the number was created (from a hex constant? from addition? etc.), so it can't choose an output format based on that.
To print a number as hex, you can use
my $hex = 'FA1F'; # $hex contains the hex representation of the number.
print $hex; # Prints the hex representation of the number.
or
my $num = 0xFA1F; # $num contains the number.
printf "%X", $num; # Prints the hex representation of the number.
You are assigning an integer value using hexadecimal format. print by default prints numbers in decimal format, so you are getting 64031.
You can verify this using the printf() by giving different formats.
$ perl -e ' my $num = 0xFA1F; printf("%d %X %b\n", ($num) x 3 ) '
64031 FA1F 1111101000011111
$ perl -e ' my $num = 64031; printf("%d %X %b\n", ($num) x 3 ) '
64031 FA1F 1111101000011111
$ perl -e ' my $num = 0b1111101000011111; printf("%d %X %b\n", ($num) x 3 ) '
64031 FA1F 1111101000011111
$
To get the binary format of 0xFA1F in string, you can use sprintf()
$ perl -e ' my $hex = 0xFA1F; my $bin=sprintf("%b",$hex) ; print "$bin\n" '
1111101000011111
$
Let's take each bit of confusion in order.
my $hex = 0xFA1F;
This stores a hex constant in $hex, but Perl doesn't have a hex data type, so although you can write hex constants (and binary and octal constants, for that matter), Perl converts them all to decimal. Note that there is a big difference between
my $hex = 0xFA1F;
and
my $hex = '0xFA1F';
The first stores a number in $hex, so when you print it you get a decimal number. The second stores a string, which prints as 0xFA1F but can be passed to the hex() function to be converted to decimal.
$hex = hex($hex);
The hex function interprets a string as a hex number and returns its value. Up to this point, $hex has only ever been used as a number, so Perl first stringifies it to "64031" and then passes that string to hex(), which reads it as 0x64031 and returns 409649.
So to the solution. You are almost there with printf(). There is a function called sprintf() which takes the same parameters as printf(), but instead of printing the formatted value it returns it as a string. So what you need is:
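A small sketch of that double conversion, using only the values from the question (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $num = 0xFA1F;       # numeric literal, value 64031
my $str = 'FA1F';       # text that happens to be hex digits

my $twice = hex($num);  # $num stringifies to "64031", which is read as 0x64031
my $once  = hex($str);  # "FA1F" is read as hex, giving 64031

printf "%d %d\n", $twice, $once;   # prints "409649 64031"
```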
my $hex = 0xFA1F;
my $bin = sprintf("%b", $hex);
print $bin;
Technical note:
Yes, I know that Perl stores all its numbers internally as binary, but let's not go there for this answer, OK?
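Since the question also asks for substrings of the binary string, here is a minimal sketch of that last step. It assumes a 16-bit register, so the string is zero-padded to 16 characters; the field boundaries are only an example.

```perl
use strict;
use warnings;

my $reg  = 0xFA1F;
my $bits = sprintf '%016b', $reg;   # 16 characters, zero-padded on the left

my $high = substr $bits, 0, 8;      # leftmost 8 characters, i.e. bits 15..8
my $low  = substr $bits, 8, 8;      # rightmost 8 characters, i.e. bits 7..0

# oct() understands a 0b prefix, so a substring can be turned back into a number.
my $high_value = oct "0b$high";

print "$bits $high $low $high_value\n";   # prints "1111101000011111 11111010 00011111 250"
```

Padding matters here: without the 016 width, a register value with leading zero bits would produce a shorter string and the substr offsets would point at the wrong bits.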
If you're ok with using a distribution, I wrote Bit::Manip to make my prototyping a bit easier when dealing with registers (There's also a Pure Perl version available if you have problems compiling the XS code).
Not only can it fetch out bits from a number, it can toggle, clear, set etc:
use warnings;
use strict;
use Bit::Manip qw(:all);
my $register = 0xFA1F;
# fetch the bits from register using msb, lsb
my $msbyte = bit_get($register, 15, 8);
print "value: $msbyte\n";
print "bin: " . bit_bin($msbyte) . "\n";
# or simply:
# printf "bin: %b\n", $msbyte;
Output:
value: 250
bin: 11111010
Here's a blog post I wrote that shows how to use some of the software's functionality with an example datasheet register.
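If installing a distribution is not an option, the same kind of field extraction can be sketched with plain shifts and masks. Note that get_bits below is a hypothetical helper written for this example, not part of Bit::Manip, and it assumes the field is narrower than the native integer size of your perl.

```perl
use strict;
use warnings;

# Hypothetical helper: return the bits from $msb down to $lsb (inclusive).
sub get_bits {
    my ($value, $msb, $lsb) = @_;
    my $width = $msb - $lsb + 1;
    my $mask  = (1 << $width) - 1;      # $width one-bits
    return ($value >> $lsb) & $mask;
}

my $register = 0xFA1F;
my $msbyte   = get_bits($register, 15, 8);

printf "value: %d\nbin: %b\n", $msbyte, $msbyte;
```

For 0xFA1F this prints the same value (250) and binary string (11111010) as the Bit::Manip example above.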
I have an array of hex numbers that I'd like to convert to binary numbers, the problem is, in my code it removes the leading 0's for things like 0,1,2,3. I need these leading 0's to process in a future section of my code. Is there an easy way to convert Hex to Binary and keep my leading 0's in perl?
use strict;
use warnings;
my @binary;
my @hex = ('ABCD', '0132', '2211');
foreach my $h (@hex) {
my $bin = sprintf( "%b", hex($h));
push @binary, $bin;
}
foreach (@binary) {
print "$_\n";
}
running the code gives me
1010101111001101
100110010
10001000010001
Edit: Found a similar answer using pack and unpack, replaced
sprintf( "%b", hex($h));
with
unpack( 'B*', pack('H*', $h));
You can specify the width of the output in sprintf or printf by putting the number between the % and the format character like this.
printf "%16b\n",hex("0132");
and by preceding the number with 0, make it pad the result with 0s like this
printf "%016b\n",hex("0132");
the latter giving the result of
0000000100110010
But this is all covered in the documentation for those functions.
This solution uses the length of the hex representation to determine the length of the binary representation:
for my $num_hex (@nums_hex) {
my $num = hex($num_hex);
my $num_bin = sprintf('%0*b', length($num_hex)*4, $num);
...
}
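Putting the two approaches from this thread side by side on the array from the question, as a rough comparison (variable names are made up for the example):

```perl
use strict;
use warnings;

my @hex = ('ABCD', '0132', '2211');

for my $h (@hex) {
    # Four binary digits per hex digit, zero-padded on the left.
    my $via_sprintf = sprintf '%0*b', 4 * length($h), hex($h);

    # Pack the hex digits into bytes, then read the bytes back as bits.
    my $via_pack = unpack 'B*', pack 'H*', $h;

    print "$h: $via_sprintf $via_pack\n";
}
```

Both columns agree for these inputs (for example, 0132 gives 0000000100110010). They differ for hex strings with an odd number of digits: pack 'H*' fills the missing low nybble of the last byte with zero, so its result is always a multiple of 8 bits long, while the sprintf version stays at 4 bits per digit.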
My script generates some very very huge files, and I am trying to print/save the output in a binary format to reduce the file size as much as possible!
Each time that script generates five values, like:
$a1 = 1.64729
$a2 = 4.33329
$a3 = 3.55724
$a4 = 1.45759
$a5 = 7.474700
It prints in the output like:
A:1.64729,4.33329,3.55724,1.45759,7.474700
I am not sure whether this is the best way, but I thought to pack each row when it is printing to the output! I used pack/unpack built-in function in Perl!
I had a look at perldoc, but I did not understand which format specifiers were proper (???)!
#!/usr/bin/perl
...
@A = ($a1,$a2,$a3,$a4,$a5);
print pack("???", "A:", join(",", map { sprintf "%.1f", $_ } @A)), "\n";
If you compress the file (instead of trying to write binary bytes) you will get a small file. That's because your entire file will have mostly the ten digit characters, plus a decimal point, and a comma.
You can compress a file as you write it via IO::Zlib. This will use either the Zlib library, or the gzip command.
However, if you want to use pack, go ahead. Get the Camel Book which gives much clearer documentation than the standard Perldoc.
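A minimal sketch of the compression route, assuming IO::Zlib is installed (it ships with recent perls). The file name is hypothetical and the same sample row is simply written 1000 times to have something to compress:

```perl
use strict;
use warnings;
use IO::Zlib;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/values.gz";    # hypothetical output file

# Write gzip-compressed text; print works on the handle as usual.
my $out = IO::Zlib->new($file, 'wb') or die "Cannot open $file for writing";
print $out "A:1.64729,4.33329,3.55724,1.45759,7.474700\n" for 1 .. 1000;
$out->close;

# Read the first line back to show that the round trip is transparent.
my $in = IO::Zlib->new($file, 'rb') or die "Cannot open $file for reading";
my $first = <$in>;
$in->close;

printf "compressed size: %d bytes, first line: %s", (-s $file), $first;
```

The uncompressed text here would be 43000 bytes. Real data with varying digits will compress less well than this repeated row, so treat the size as an illustration only.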
It's not all that difficult:
my $output = "A:1.64729,4.33329,3.55724,1.45759,7.474700";
$output =~ s/^A://; # Remove the 'A:'
my @numbers = split /,/, $output; # Make into an array
my $packed = pack "d5", @numbers; # Pack five inputs as double-precision floating point numbers
say join ",", unpack "d5", $packed; # Unpack those five numbers again (say needs use feature 'say')
You'll probably have to use syswrite and sysread, since you aren't reading and writing strings. This is unbuffered reading and writing, and you have to specify the number of bytes you're reading or writing.
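Here is a rough sketch of writing one packed record with syswrite and reading it back with sysread. It uses a temporary file, and it assumes the record size is whatever length pack 'd5' produces on your build (40 bytes where a double is 8 bytes).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my @numbers     = (1.64729, 4.33329, 3.55724, 1.45759, 7.474700);
my $record      = pack 'd5', @numbers;
my $record_size = length $record;      # bytes per record

# Write the raw bytes; binmode avoids any newline translation.
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;
syswrite($fh, $record) == $record_size or die "Short write: $!";
close $fh;

# Read exactly one record back and unpack it.
open my $in, '<:raw', $filename or die "Cannot open $filename: $!";
sysread($in, my $buffer, $record_size) == $record_size or die "Short read: $!";
close $in;

my @restored = unpack 'd5', $buffer;
print join(',', @restored), "\n";
```

Because every record has the same size, record number N starts at byte N * $record_size, which makes random access easy. Note that the d format uses the machine's native byte order, so such a file is not portable between architectures.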
One more thing: If you know where the decimal point is in the number (that is, it's always a number between 1 and up to 10) you can convert the number into an integer which will allow you to pack the number into an even smaller number of bytes:
my $output = "A:1.64729,4.33329,3.55724,1.45759,7.474700";
$output =~ s/^A://; # Remove the 'A:'
$output =~ s/\.//g; # Remove all the decimal points
my @numbers = split /,/, $output; # Make into an array
my $packed = pack "L5", @numbers; # Pack five inputs as unsigned long numbers
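Removing the decimal point textually only works if every value has the same number of decimal places, which is not the case for 7.474700 in the sample row. A variation that scales numerically instead is sketched below; the scale factor of 100000 (five decimal places) is an assumption about the data.

```perl
use strict;
use warnings;

my $scale   = 100_000;   # assumption: five decimal places are enough
my @numbers = (1.64729, 4.33329, 3.55724, 1.45759, 7.474700);

# Round to the nearest integer rather than truncating, because
# 1.64729 * 100000 is not exactly 164729 in floating point.
my @scaled = map { sprintf '%.0f', $_ * $scale } @numbers;

my $packed = pack 'L5', @scaled;    # five unsigned 32-bit integers, 20 bytes

my @restored = map { $_ / $scale } unpack 'L5', $packed;
print length($packed), " bytes: ", join(',', @restored), "\n";
```

This halves the record size compared with pack 'd5', at the cost of fixing the precision up front.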
I have a string of ASCII characters. I convert this to hex string using the unpack function.
#! /usr/bin/perl
use strict;
use warnings;
my $str="hello";
my $value=unpack("H*",$str);
print $value,"\n";
**output:** 68656c6c6f
Now, let's say I want to use this output as a string of hex bytes, read one byte at a time, perform some computation on it and store the output in another variable.
For instance,
#! /usr/bin/perl
use strict;
use warnings;
my $str="hello";
my $value=unpack("H*",$str);
my $num=0x12;
my $i=0;
while($i<length($value))
{
my $result.=(substr($value,$i,2)^$num);
$i+=2;
}
print $result,"\n";
**output:**
Argument "6c" isn't numeric in bitwise xor (^) at test.pl line 13.
Argument "6c" isn't numeric in bitwise xor (^) at test.pl line 13.
Argument "6f" isn't numeric in bitwise xor (^) at test.pl line 13.
8683202020
The output is incorrect and also there are several warnings.
If we take the first hex byte of the string, "hello" as an example:
68 xor 12 = 7A
However, the output shows it as 86, and I am not sure how it got that.
What is the right way to do it?
If something is in hex, it is necessarily a string, since hex is a human-readable representation of a number. You don't want a string; you want a series of numbers, where each of those numbers is the numerical value of the char. You could use ord to get the number character by character, but unpack also provides the means:
my @bytes = unpack 'C*', $str;
Do the processing you want:
$_ ^= $num for @bytes;
And reconstitute the string:
$str = pack 'C*', @bytes;
The above three combined:
$str = pack 'C*', map $_ ^ $num, unpack 'C*', $str;
You can also do it as follows:
my $mask = chr($num) x length($str);
$str ^= $mask;
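As for where the 86 in the question came from: the substring "68" was treated as the decimal number 68, and 68 ^ 18 is 86. Substrings such as "6c" are not valid decimal numbers, hence the warnings. If you do want to stay with the hex-pair loop, each pair has to go through hex() first. A minimal sketch:

```perl
use strict;
use warnings;

my $value = unpack 'H*', 'hello';   # "68656c6c6f"
my $num   = 0x12;

my $wrong = '68' ^ $num;            # "68" taken as decimal 68, so 68 ^ 18 = 86

my $result = '';
for my $pair ($value =~ /(..)/g) {
    # hex() turns the two digits into the byte value before the xor,
    # and %02x turns the result back into two hex digits.
    $result .= sprintf '%02x', hex($pair) ^ $num;
}

print "$wrong $result\n";           # prints "86 7a777e7e7d"
```

The result is the hex representation of the same bytes that the pack/unpack versions above produce.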
I have a problem understanding and using the 'vec' keyword.
I am reading a logpacket in which values are stored in little endian hexadecimal. In my code, I have to unpack the different bytes into scalars using the unpack keyword.
Here's an example of my problem:
my @hexData1 = qw(50 65);
my $data = pack ('C*', @hexData1);
my $x = unpack("H4",$data); # At which point the hexadecimal number became a number
print $x."\n";
#my $foo = sprintf("%x", $foo);
print "$_-> " . vec("\x65\x50", $_, 1) . ", " for (0..15); # This works.
print "\n";
But I want to use the above statement in the way shown below. I don't want to pass a quoted string of hexadecimal escapes; I want to use the scalar $x instead, but it won't work. How do I convert my $x to a hexadecimal string? This is my requirement.
print "$_-> " . vec($x, $_, 1).", " for (0..15); # This doesn't work.
print "\n";
My final objective is to read the third bit from the right of the two byte hexadecimal number.
How do I use the 'vec' command for that?
You are making the mistake of unpacking $data into $x before using it in a call to vec. vec expects a string, so if you supply a number it will be converted to a string before being used. Here's your code
my @hexData1 = qw(50 65);
my $data= pack ('C*', @hexData1);
The C pack format uses each value in the source list as a character code. It is the same as calling chr on each value and concatenating them. Unfortunately your values look like decimal, so you are getting chr(50).chr(65) or "2A". Since your values are little-endian, what you want is chr(0x65).chr(0x50) or "\x65\x50", so you must write
my $data= pack ('(H2)*', reverse @hexData1);
which reverses the list of data (to account for it being little-endian) and packs it as if it was a list of two-digit hex strings (which, fortunately, it is).
Now you have done enough. As I say, vec expects a string so you can write
print join ' ', map vec($data, $_, 1), 0 .. 15;
print "\n";
and it will show you the bits you expect. To extract the 3rd bit from the right (assuming you mean bit 13, where the last bit is bit 15) you want
print vec $data, 13, 1;
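To see exactly which order vec walks the bits in, here is the corrected code as a small self-contained sketch (the variable names are changed slightly for the example). vec numbers the bits of each byte from the least significant end, which is the same order the b unpack format uses.

```perl
use strict;
use warnings;

my @hex_data = qw(50 65);
my $data     = pack '(H2)*', reverse @hex_data;   # "\x65\x50"

# Bits 0..15 in vec order: low bit of the first byte comes first.
my $bits = join '', map { vec($data, $_, 1) } 0 .. 15;
print "$bits\n";                    # prints "1010011000001010"

# The b format produces the same ascending bit order in one step.
print unpack('b*', $data), "\n";    # prints "1010011000001010"

print vec($data, 13, 1), "\n";      # prints "0"
```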
First, get the number the bytes represent.
If you start with "\x50\x65",
my $num = unpack('v', "\x50\x65");
If you start with "5065",
my $num = unpack('v', pack('H*', "5065"));
If you start with "50","65",
my $num = unpack('v', pack('H*', join('', "50","65")));
Then, extract the bit you want.
If you want bit 10,
my $bit = ($num >> 10) & 1;
If you want bit 2,
my $bit = ($num >> 2) & 1;
(I'm listing a few possibilities because it's not clear to me what you want.)
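A short sketch tying the two steps together, under the assumption that the two bytes arrive low byte first as "\x50\x65". It also shows that vec applied to those same little-endian bytes uses the same bit numbering as the shift-and-mask version, since vec counts bit N as bit N % 8 of byte N / 8.

```perl
use strict;
use warnings;

my $raw = "\x50\x65";            # bytes as they arrive, low byte first
my $num = unpack 'v', $raw;      # 16-bit little-endian, i.e. 0x6550

printf "0x%04X\n", $num;         # prints "0x6550"

for my $n (2, 10) {
    my $by_shift = ($num >> $n) & 1;
    my $by_vec   = vec($raw, $n, 1);
    print "bit $n: shift=$by_shift vec=$by_vec\n";
}
# prints "bit 2: shift=0 vec=0" and "bit 10: shift=1 vec=1"
```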