I have a string of ASCII characters. I convert it to a hex string using the unpack function.
#! /usr/bin/perl
use strict;
use warnings;
my $str="hello";
my $value=unpack("H*",$str);
print $value,"\n";
**output:** 68656c6c6f
Now, let's say I want to use this output as a string of hex bytes, read it one byte at a time, perform some computation on each byte, and store the output in another variable.
For instance,
#! /usr/bin/perl
use strict;
use warnings;
my $str="hello";
my $value=unpack("H*",$str);
my $num=0x12;
my $i=0;
while($i<length($value))
{
my $result.=(substr($value,$i,2)^$num);
$i+=2;
}
print $result,"\n";
**output:**
Argument "6c" isn't numeric in bitwise xor (^) at test.pl line 13.
Argument "6c" isn't numeric in bitwise xor (^) at test.pl line 13.
Argument "6f" isn't numeric in bitwise xor (^) at test.pl line 13.
8683202020
The output is incorrect and also there are several warnings.
If we take the first hex byte of the string, "hello" as an example:
68 xor 12 = 7A
However, the output shows it as 86, and I am not sure how it got an output of 86.
What is the right way to do it?
If something is in hex, it is necessarily a string, since hex is a human-readable representation of a number. You don't want a string; you want a series of numbers, where each of those numbers is the numerical value of the char. You could use ord to get the number character by character, but unpack also provides the means:
my @bytes = unpack 'C*', $str;
Do the processing you want:
$_ ^= $num for @bytes;
And reconstitute the string:
$str = pack 'C*', @bytes;
The above three combined:
$str = pack 'C*', map $_ ^ $num, unpack 'C*', $str;
You can also do it as follows:
my $mask = chr($num) x length($str);
$str ^= $mask;
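As a rough worked example of the approach above (the helper name xor_bytes is made up for illustration), the following sketch XORs each byte of "hello" with 0x12 and prints the result as hex. It also shows where the 86 in the question came from: the string "68" is treated as decimal 68 in numeric context, and 68 ^ 18 is 86, whereas hex("68") gives the intended byte value.

```perl
use strict;
use warnings;

# Hypothetical helper: XOR every byte of $str with the one-byte key $num.
sub xor_bytes {
    my ($str, $num) = @_;
    return pack 'C*', map { $_ ^ $num } unpack 'C*', $str;
}

my $xored = xor_bytes('hello', 0x12);
print unpack('H*', $xored), "\n";        # 7a777e7e7d
print xor_bytes($xored, 0x12), "\n";     # hello (XOR with the same key undoes it)

# Why the question's code printed 86 for the first byte:
print 68 ^ 0x12, "\n";                   # 86: "68" was read as decimal 68
printf "%02x\n", hex('68') ^ 0x12;       # 7a: hex() converts the two digits first
```

If the result is needed as printable text rather than raw bytes, unpack 'H*' on the XORed string is one way to get it.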
Related
I want to escape a string, per RFC 4515. So, the string "u1" would be transformed to "\75\31", that is, the ordinal value of each character, in hex, preceded by backslash.
It has to be done in Perl. I already know how to do it in Python, C++, Java, etc., but Perl is baffling.
Also, I cannot use Net::LDAP and I may not be able to add any new modules, so, I want to do it with basic Perl features.
Skimming through RFC 4515, this encoding escapes the individual octets of multi-byte UTF-8 characters, not codepoints. So, something that works with non-ASCII text too:
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw/say/;
sub valueencode ($) {
# Unpack format returns octets of UTF-8 encoded text
my @bytes = unpack "U0C*", $_[0];
sprintf '\%02x' x @bytes, @bytes;
}
say valueencode 'u1';
say valueencode "Lu\N{U+010D}i\N{U+0107}"; # Lučić, from the RFC 4515 examples
Example:
$ perl demo.pl
\75\31
\4c\75\c4\8d\69\c4\87
Or an alternative using the vector flag:
use Encode qw/encode/;
sub valueencode ($) {
sprintf '\%*vx', "\\", encode('UTF-8', $_[0]);
}
Finally, a smarter version that escapes ASCII characters only when it has to (and still escapes multi-byte characters, even though, on a closer read of the RFC, they don't actually need to be if they're valid UTF-8):
# Encode according to RFC 4515 valueencoding grammar rules:
#
# Text is UTF-8 encoded. Bytes can be escaped with the sequence
# \XX, where the X's are hex digits.
#
# The characters NUL, LPAREN, RPAREN, ASTERISK and BACKSLASH all MUST
# be escaped.
#
# Bytes > 0x7F that aren't part of a valid UTF-8 sequence MUST be
# escaped. This version assumes there are no such bytes and that input
# is an ASCII or Unicode string.
#
# Single bytes and valid multibyte UTF-8 sequences CAN be escaped,
# with each byte escaped separately. This version escapes multibyte
# sequences, to give ASCII results.
sub valueencode ($) {
my $encoded = "";
for my $byte (unpack 'U0C*', $_[0]) {
if (($byte >= 0x01 && $byte <= 0x27) ||
($byte >= 0x2B && $byte <= 0x5B) ||
($byte >= 0x5D && $byte <= 0x7F)) {
$encoded .= chr $byte;
} else {
$encoded .= sprintf '\%02x', $byte;
}
}
return $encoded;
}
This version returns the strings 'u1' and 'Lu\c4\8di\c4\87' from the above examples.
In short, one way is just as the question says: split the string into characters, get their ordinals then convert format to hex; then put it back together. I don't know how to get the \nn format so I'd make it 'by hand'. For instance
my $s = join '', map { sprintf '\%x', ord } split //, 'u1';
Or use vector flag %v to treat the string as a "vector" of integers
my $s = sprintf '\%*vx', '\\', 'u1';
With %v the string is broken up into numerical representation of characters, each is converted (%x), and they're joined back, with . between them. That (optional) * allows us to specify our string by which to join them instead, \ (escaped) here.
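A small sketch of the vector flag in isolation may help; the variable names are only for illustration. Note that plain %x does not zero-pad, so a character below 0x10 (a newline, say) comes out as a single digit. Adding the 0 flag and a width of 2 is one way to get two digits per character.

```perl
use strict;
use warnings;

# Default join string is "."
my $dotted = sprintf '%vd', 'u1';                 # "117.49"

# "*" takes the join string from the argument list, here a backslash
my $plain = sprintf '\%*vx', '\\', 'u1';          # "\75\31"

# Without padding, a newline (0x0a) becomes a single hex digit
my $unpadded = sprintf '\%*vx',   '\\', "a\nb";   # "\61\a\62"
my $padded   = sprintf '\%0*v2x', '\\', "a\nb";   # "\61\0a\62"

print "$_\n" for $dotted, $plain, $unpadded, $padded;
```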
This can also be done with pack + unpack, see the link below. Also see that page if there is a wide range of input characters.†
See ord and sprintf, and for more pages like this one.
† If the input is non-ASCII, you may need to encode it first so as to get octets, if it is the octets that are to be escaped (and not whole code points):
use Encode qw(encode);
my $s = sprintf '\%*vx', '\\', encode('UTF_8', $input);
See the linked page for more.
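For reference, here is one possible shape of the pack/unpack variant mentioned above. This is a sketch, not the code from the linked page; the name escape_octets is hypothetical, and it assumes every octet of the UTF-8 encoding should be escaped.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: escape every octet of the UTF-8 encoding as \xx.
sub escape_octets {
    my ($text) = @_;
    my $octets = encode('UTF-8', $text);
    # '(H2)*' yields one two-digit lowercase hex string per octet.
    return join '', map { "\\$_" } unpack '(H2)*', $octets;
}

print escape_octets('u1'), "\n";                     # \75\31
print escape_octets("Lu\x{10D}i\x{107}"), "\n";      # \4c\75\c4\8d\69\c4\87
```

Because H2 always produces two digits, small byte values are zero-padded without any extra format flags.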
I am new to Perl and I have difficulties using the different types.
I am trying to take a hexadecimal register value, transform it to binary, use it as a string, and get substrings from the binary string.
I have done a few searches and what I tried is :
my $hex = 0xFA1F;
print "$hex\n";
result was "64031" . First surprise : can't I print the hex value in Perl and not just the decimal value ?
$hex = hex($hex);
print "$hex\n";
Result was 409649. Second surprise : I would expect the result to be also 64031 since "hex" converts hexadecimal to decimal.
my $bin = printf("%b", $hex);
It prints the binary value. Is there a way to transform the hex to bin without printing it ?
Thanks,
SLP
Decimal, binary, and hexadecimal are all text representations of a number (i.e. ways of writing a number). Computers can't deal with these as numbers.
my $num = 0xFA1F; stores the specified number (sixty-four thousand and thirty-one) into $num. It's stored in a format the hardware understands, but that's not very important. What's important is that it's stored as a number, not text.
When print is asked to print a number, it prints it out in decimal (or scientific notation if large/small enough). It has no idea how the number was created (from a hex constant? from addition? etc.), so it can't determine how to output the number based on that.
To print a number as hex, you can use
my $hex = 'FA1F'; # $hex contains the hex representation of the number.
print $hex; # Prints the hex representation of the number.
or
my $num = 0xFA1F; # $num contains the number.
printf "%X", $num; # Prints the hex representation of the number.
You are assigning an integer value using hexadecimal format. print by default prints numbers in decimal format, so you are getting 64031.
You can verify this using the printf() by giving different formats.
$ perl -e ' my $num = 0xFA1F; printf("%d %X %b\n", ($num) x 3 ) '
64031 FA1F 1111101000011111
$ perl -e ' my $num = 64031; printf("%d %X %b\n", ($num) x 3 ) '
64031 FA1F 1111101000011111
$ perl -e ' my $num = 0b1111101000011111; printf("%d %X %b\n", ($num) x 3 ) '
64031 FA1F 1111101000011111
$
To get the binary format of 0xFA1F in string, you can use sprintf()
$ perl -e ' my $hex = 0xFA1F; my $bin=sprintf("%b",$hex) ; print "$bin\n" '
1111101000011111
$
Let's take each bit of confusion in order.
my $hex = 0xFA1F;
This stores a hex constant in $hex, but Perl doesn't have a hex data type so although you can write hex constants, and binary and octal constants for that matter, Perl converts them all to decimal. Note that there is a big difference between
my $hex = 0xFA1F;
and
my $hex = '0xFA1F';
The first stores a number in $hex, which prints as a decimal number. The second stores a string, which prints as 0xFA1F but can be passed to the hex() function to be converted to a number.
$hex = hex($hex);
The hex function interprets a string as a hex number and returns its numeric value. Because $hex has only ever been used as a number up to this point, Perl first stringifies it (giving "64031") and then passes that string to hex(), which reads it as 0x64031, i.e. 409649.
So, to the solution. You are almost there with printf(). There is a function called sprintf() which takes the same parameters as printf() but, instead of printing the formatted value, returns it as a string. So what you need is:
my $hex = 0xFA1F;
my $bin = sprintf("%b", $hex);
print $bin;
Technical note:
Yes, I know that Perl stores all its numbers internally as binary, but let's not go there for this answer, OK?
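Since the question also asks about taking substrings of the binary string, here is a minimal sketch of that step. It assumes a 16-bit register, hence the %016b width (which keeps leading zeros so bit positions stay fixed); the variable names are illustrative.

```perl
use strict;
use warnings;

my $register = 0xFA1F;

# Pad to 16 digits so that every bit has a fixed position in the string.
my $bin = sprintf '%016b', $register;    # "1111101000011111"

# substr counts from the left, so offset 0 is the most significant bit.
my $high = substr $bin, 0, 8;            # "11111010"
my $low  = substr $bin, 8, 8;            # "00011111"

# oct() understands a "0b" prefix and turns the bits back into a number.
my $high_value = oct "0b$high";          # 250

print "$bin\n$high\n$low\n$high_value\n";
```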
If you're ok with using a distribution, I wrote Bit::Manip to make my prototyping a bit easier when dealing with registers (There's also a Pure Perl version available if you have problems compiling the XS code).
Not only can it fetch out bits from a number, it can toggle, clear, set etc:
use warnings;
use strict;
use Bit::Manip qw(:all);
my $register = 0xFA1F;
# fetch the bits from register using msb, lsb
my $msbyte = bit_get($register, 15, 8);
print "value: $msbyte\n";
print "bin: " . bit_bin($msbyte) . "\n";
# or simply:
# printf "bin: %b\n", $msbyte;
Output:
value: 250
bin: 11111010
Here's a blog post I wrote that shows how to use some of the software's functionality with an example datasheet register.
I have to generate a text file which contains a lot of hexadecimal values. The hex values are in arithmetic progression, with a difference of 0x1000000.
The output in the file should be as:
sum(0x08000000, "text")
sum(0x09000000, "some other text")
sum(0x0A000000, "yet another text")
...
sum(0x10000000, "random something")
...
Is there any way I can run a loop to generate these values?
Thanks.
You need printf or sprintf for outputting hex.
Perl "understands" hex just fine, if you stick 0x in front of it, it's hex.
For output formatting though, it'll default to 'normal' numeric representation - so instead you want either printf or sprintf (they do the same thing, but the latter 'prints' to a string).
#!/usr/bin/env perl
use strict;
use warnings;
print 0x10,"\n";
my $value = 0x20;
$value += 0x1F;
print $value,"\n";
printf ("%X\n", $value);
The format string is either %X for upper case hex, or %x to use lowercase.
So for your example:
for ( my $i = 0x8000000; $i <= 0x10000000; $i += 0x1000000 ) {
printf( "sum 0x%x\n", $i );
}
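To match the exact layout asked for in the question (eight upper-case hex digits plus a quoted text), something like the following sketch could work. The @labels list and the sum_lines helper are placeholders, since the question does not say where the text comes from.

```perl
use strict;
use warnings;

# Hypothetical labels; substitute the real source of the text.
my @labels = ('text', 'some other text', 'yet another text');

# Build one line per label, advancing the value by $step each time.
sub sum_lines {
    my ($start, $step, @texts) = @_;
    my @lines;
    my $value = $start;
    for my $text (@texts) {
        # %08X: upper-case hex, zero-padded to eight digits
        push @lines, sprintf('sum(0x%08X, "%s")', $value, $text);
        $value += $step;
    }
    return @lines;
}

print "$_\n" for sum_lines(0x08000000, 0x1000000, @labels);
# sum(0x08000000, "text")
# sum(0x09000000, "some other text")
# sum(0x0A000000, "yet another text")
```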
I want to print the 95 printable ASCII characters unchanged, but print all other characters as their codes.
How can I do this in pure Perl? With the unpack function? With a module?
print BackSlashed('test folder'); # expected test\040folder
print BackSlashed('test тестовая folder');
# expected test\040\321\202\320\265\321\201\321\202\320\276\320\262\320\260\321\217\040folder
print BackSlashed('НОВАЯ ПАПКА');
# expected \320\235\320\236\320\222\320\220\320\257\040\320\237\320\220\320\237\320\232\320\220
sub BackSlashed() {
my $str = shift;
.. backslashed code here...
return $str
}
You can use a regular expression substitution with an evaled substitution part. In there, you need to convert each character to its numeric value first, and then output it in octal notation. There's a good explanation for it in this answer. Attach an escaped backslash \ to get it to show up in the output.
$str =~ s/([^a-zA-Z0-9])/sprintf "\\%03o", ord($1)/eg;
I limited the capture group to basic ASCII letters and numbers. If you want something else, just change the character group.
Since your sample output has octets but you said your code has the use utf8 pragma, you need to convert Perl's representation of the string to the corresponding octet sequence before you run the substitution.
use utf8;
my $str = 'НОВАЯ ПАПКА';
print foo($str);
sub foo { # note that there are no () here!
my $str = shift;
utf8::encode($str);
$str =~ s/([^a-zA-Z0-9])/sprintf "\\%03o", ord($1)/eg;
return $str;
}
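A variant closer to the question's expected output might keep all printable ASCII except the space. That choice is an assumption read off the examples (they escape the space as \040); the backslash is also escaped here so the output stays unambiguous, which the question does not ask for.

```perl
use strict;
use warnings;

sub BackSlashed {
    my $str = shift;
    utf8::encode($str);    # work on UTF-8 octets, not code points
    # Assumption: keep 0x21-0x7E except the backslash (0x5C); escape the rest.
    $str =~ s/([^\x21-\x5B\x5D-\x7E])/sprintf '\\%03o', ord $1/eg;
    return $str;
}

print BackSlashed('test folder'), "\n";          # test\040folder
print BackSlashed("\x{41D}\x{41E}"), "\n";       # \320\235\320\236
```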
I'm trying to write a script to find hex strings in a text file and convert them to their reverse byte order. The trouble I'm having is that some of the hex strings are 16 bit and some are 64 bits. I've used Perl's pack to pack and unpack the 16 bit hex numbers and that works fine, but the 64 bit does not.
print unpack("H*", (pack('I!', 0x20202032))). "\n"; #This works, gives 32202020
#This does not
print unpack("H*", (pack('I!', 0x4f423230313430343239303030636334))). "\n";
I've tried the second with the q and Q (where I get ffffffffffffffff). Am I approaching this all wrong?
As a bit of background, I've got a multi-gigabyte pipe-delimited text file that has hex strings in reverse byte order as explained above. Also, the columns of the file are not standard; sometimes the hex strings appear in one column, and sometimes in another. I need to convert the hex strings to their reverse byte order.
Always use warnings;. If you do, you'll get the following message:
Integer overflow in hexadecimal number at scratch.pl line 8.
Hexadecimal number > 0xffffffff non-portable at scratch.pl line 8.
These can be resolved by use bigint; and by changing your second number declaration to hex('0x4f423230313430343239303030636334').
However, that number is still too large for pack 'I' to be able to handle.
Perhaps this can be done using simple string manipulation:
use strict;
use warnings;
my @nums = qw(
0x20202032
0x4f423230313430343239303030636334
);
for (@nums) {
my $rev = join '', reverse m/([[:xdigit:]]{2})/g;
print "$_ -> 0x$rev\n"
}
__END__
Outputs:
0x20202032 -> 0x32202020
0x4f423230313430343239303030636334 -> 0x3463633030303932343034313032424f
Or, to handle hex strings with an odd number of digits:
my $rev = $_;
$rev =~ s{0x\K([[:xdigit:]]*)}{
my $hex = $1;
$hex = "0$hex" if length($hex) % 2;
join '', reverse $hex =~ m/(..)/g;
}e;
print "$_ -> $rev\n"
To be pedantic, the hex numbers in your example are 32 and 128 bits long, not 16 and 64. If the longest one were only 64 bits long, you could successfully use the Q pack template as you supposed (provided that your perl has been compiled to support 64-bit integers).
The pack/unpack solution can be used anyway, with the addition of a reverse. You also have to remove the leading 0x from the hex strings (or trim the last two characters from the results):
print unpack "H*", reverse pack "H*", $hex_string;
Example with your values:
perl -le 'print unpack "H*", reverse pack "H*", "4f423230313430343239303030636334"'
3463633030303932343034313032424f
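Wrapped up as a reusable function, the same idea could look like the sketch below. The name reverse_hex_bytes is made up, and stripping an optional 0x prefix and left-padding odd-length input with a zero are assumptions about what the input file needs.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse the byte order of a hex string of any length.
sub reverse_hex_bytes {
    my ($hex) = @_;
    (my $digits = $hex) =~ s/^0x//i;                # drop an optional 0x prefix
    $digits = "0$digits" if length($digits) % 2;    # pad to whole bytes
    my $bytes = pack 'H*', $digits;                 # hex digits -> raw bytes
    my ($reversed) = unpack 'H*', scalar reverse $bytes;
    return $reversed;
}

print reverse_hex_bytes('0x20202032'), "\n";
# 32202020
print reverse_hex_bytes('4f423230313430343239303030636334'), "\n";
# 3463633030303932343034313032424f
```

No integer is ever created here, so the size limits of the I and Q templates do not apply.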