Efficient pre-perl-5.10 equivalent of pack("Q>") - perl

Update: Salva correctly points out that I was wrong about the introduction of the "Q" pack template. It's the ">" modifier that doesn't go back to 5.8.
Perl 5.10 introduced the pack() modifier ">", which, for my use case with "Q", packs an unsigned quad (64-bit) value in big-endian byte order.
Now, I'm looking for an efficient equivalent for
pack("Q>2", @ints)
where @ints contains two 64bit unsigned ints. "Q>2" means "pack two unsigned quads in big-endian byte order". Obviously, I want this because I am (at least temporarily) tied to a pre-5.10 Perl.
Update2: Actually, on further reflection, something as simple as the following should do:
pack("N4", $ints[0] >> 32, $ints[0], $ints[1] >> 32, $ints[1])
Appears to work on my 64bit x86-64 Linux. Any reason why this might not be exactly the same as pack("Q>2", @ints)? Any platform-specific matters?
What's the reverse (i.e. the equivalent of unpack("Q>2", @ints))?

The Q pattern was introduced in perl 5.6. Your real problem may be that you are trying to use it in a perl compiled without 64bit support.
Anyway, you can use Math::Int64.
Update, an example:
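use Math::Int64 qw(uint64_to_net);
# uint64_to_net() emits 8 bytes in network (big-endian) order; int64_to_native() would use the host's byte order.
my $packed = join '', map uint64_to_net($_), @ints;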
Another option, if you are on a 64bit perl supporting Q but not Q>, is to reorder the bytes yourself:
pack 'C*', reverse unpack 'C*', pack 'Q', $int;
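A minimal sketch of the "N4" approach from the question together with its reverse, assuming a perl built with 64-bit integers. The helper names pack_quads_be and unpack_quads_be are made up for illustration and do not come from any module:

```perl
use strict;
use warnings;

# Hypothetical helpers; both assume $Config{ivsize} >= 8.

# Pack unsigned 64-bit integers as big-endian quads, two 32-bit halves each.
sub pack_quads_be {
    return pack 'N*', map { ($_ >> 32, $_ & 0xFFFF_FFFF) } @_;
}

# Reverse: rebuild each quad from its high and low 32-bit halves.
sub unpack_quads_be {
    my @halves = unpack 'N*', $_[0];
    my @quads;
    while (my ($hi, $lo) = splice @halves, 0, 2) {
        push @quads, ($hi << 32) | $lo;
    }
    return @quads;
}

# Build the test values from 32-bit pieces to avoid "non-portable" literal warnings.
my @ints   = ((0x01020304 << 32) | 0x05060708, ~0 - 1);
my $packed = pack_quads_be(@ints);

print unpack('H*', $packed), "\n";    # 0102030405060708fffffffffffffffe
print join(', ', unpack_quads_be($packed)), "\n";
```

Masking the low half with 0xFFFF_FFFF makes the truncation explicit rather than relying on how "N" handles oversized values.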

Related

Perl: How to make 64 bit perl compiler requirement mandatory to run the program

I'm interested to know if there is a way to force a program to execute only in a 64-bit perl compiler. If the program runs in a 32-bit compiler, it should throw an error and exit.
Something similar to require 5.10.0.
I have a program that has a lot of 64-bit integer processing to do. All of the values are in string format, and a hex value such as "0xXXXXXXXXXXXXXXXX" does not get processed by a 32-bit compiler (I heard somewhere that only up to 53 bits are supported). I do know that we can use Math::BigInt, but I'm looking to avoid libraries since the script will run on other systems that may not have this library installed.
Despite all the talk about compilers, it sounds like you actually want to check that Perl's integers are (at least) 64 bits in size. For that, you could use the following:
use Config qw( %Config );
BEGIN { die("64-bit ints required.\n") if $Config{ivsize} < 8; }
or
BEGIN { die("64-bit ints required.\n") if length(pack('j', 0)) < 8; }
or
BEGIN { die("64-bit ints required.\n") if ~0 <= 0xFFFF_FFFF; }
I placed the check in a BEGIN block so you don't have any problems if you have large constants in your program.
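As a rough sketch of how the first check might sit in front of the hex-string handling described in the question: parse_hex64 is a hypothetical helper name, and the 16-digit limit is an assumption based on the "0xXXXXXXXXXXXXXXXX" format mentioned above.

```perl
use strict;
use warnings;
use Config qw( %Config );

# Refuse to run on a perl whose integers are narrower than 64 bits.
BEGIN { die("64-bit ints required.\n") if $Config{ivsize} < 8; }

# Hypothetical helper: convert a "0x..." string of up to 16 hex digits.
sub parse_hex64 {
    my ($str) = @_;
    $str =~ /\A0x([0-9A-Fa-f]{1,16})\z/
        or die "bad hex string: $str\n";
    no warnings 'portable';    # hex() warns about values above 0xFFFFFFFF
    return hex $1;
}

my $value = parse_hex64('0xFFFFFFFFFFFFFFFF');
print $value == ~0 ? "parsed all 64 bits\n" : "value was truncated\n";
```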

Perl 5.6.1 vs. Perl 5.14 - converting dec to hex

I found something strange.
Different behaviors for different versions of perl.
The code is:
$x = -806;
$x = sprintf "0x%x" , $x;
print "$x";
In 5.6.1 I get:
0xfffffcda
In 5.14 I get:
0xfffffffffffffcda
How can I get 32-bit output in 5.14 as well?
Thanks!
The thing with negative numbers is they're represented via 2s complement binary. What you're seeing is the result of the word size being larger.
I'm not entirely sure precisely why it would have changed (aside from 14 years and a general move to 64bit), but it's not easy to fix without recompiling perl. I'd suggest that's not a good idea since what you're really trying to get is a stringification.
A simpler solution would be a bitwise AND with the appropriate length bitmask:
$x = -806;
$x = sprintf ("0x%x" , $x & 0xffffffff);
print "$x";
Some addition to the answer above:
The number of digits Perl produces when its sprintf converts to hex depends on the size of the native C data type Perl uses internally to store unsigned integer values. What type that is is determined by Perl's Configure script when it sets things up to compile the Perl interpreter, so it's not exactly something that can be changed at run time. It can also vary from operating system to operating system and machine to machine, so if you run your script in different environments you can't be sure how many hex digits will be produced (a point strongly in favor of Sobrique's suggestion). It's also quite likely that the default native type was changed from a 32-bit one to a 64-bit one at some point during the 14 years since 5.6.1 was released.
If you want to know what type is used in a particular perl installation, perl -MConfig -E 'say $Config{uvtype}' will tell you (modify as needed for pre-5.10 perls).
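The masking approach can be wrapped in a small function so the result no longer depends on the native integer width. This is only a sketch: hex32 is a made-up name, and padding to eight digits with %08x is my own choice rather than something from the answers above.

```perl
use strict;
use warnings;

# Hypothetical helper: format the low 32 bits of a number as hex.
sub hex32 {
    my ($n) = @_;
    # The bitwise AND treats $n as unsigned and keeps only the low 32 bits.
    return sprintf '0x%08x', $n & 0xFFFF_FFFF;
}

print hex32(-806), "\n";    # 0xfffffcda
print hex32(255),  "\n";    # 0x000000ff
```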

How to tokenize Perl source code?

I have some reasonable (not obfuscated) Perl source files, and I need a tokenizer, which will split it to tokens, and return the token type of each of them, e.g. for the script
print "Hello, World!\n";
it would return something like this:
keyword 5 bytes
whitespace 1 byte
double-quoted-string 17 bytes
semicolon 1 byte
whitespace 1 byte
Which is the best library (preferably written in Perl) for this? It has to be reasonably correct, i.e. it should be able to parse syntactic constructs like qq{{\}}}, but it doesn't have to know about special parsers like Lingua::Romana::Perligata. I know that parsing Perl is Turing-complete, and only Perl itself can do it right, but I don't need absolute correctness: the tokenizer can fail or be incompatible or assume some default in some very rare corner cases, but it should work correctly most of the time. It must be better than the syntax highlighting built into an average text editor.
FYI I tried the PerlLexer in pygments, which works reasonably well for most constructs, except that it cannot find the 2nd print keyword in this one:
print length(<<"END"); print "\n";
String
END
PPI
use PPI;
Yes, only perl can parse Perl, however PPI is the 95% correct solution.
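A short sketch of what that could look like for the one-line script in the question, assuming PPI is installed from CPAN. It prints PPI's own token class names (with the PPI::Token:: prefix stripped), which will not match the labels in the question word for word:

```perl
use strict;
use warnings;
use PPI;

# The one-line script from the question, including its trailing newline.
my $source = qq{print "Hello, World!\\n";\n};

my $doc = PPI::Document->new(\$source)
    or die "PPI could not parse the source\n";

# tokens() returns the flat list of PPI::Token objects in document order.
for my $token ($doc->tokens) {
    (my $type = ref $token) =~ s/\APPI::Token:://;
    printf "%-16s %d bytes\n", $type, length $token->content;
}
```

Concatenating the content of all tokens gives back the original source, which makes it easy to check that nothing was dropped.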

Finding if the system is little endian or big endian with perl

Is there an option to find if my system is little endian byte order or big endian byte order using Perl?
perl -MConfig -e 'print "$Config{byteorder}\n";'
See Perl documentation.
If the first byte of the output string is 1, you can assume (with moderate safety) that it is little-endian. If it is 4 or 8, you can assume big-endian.
I guess you could do:
$big_endian = pack("L", 1) eq pack("N", 1);
This might fail if your system has a nonstandard (neither big-endian nor little-endian) byte ordering (eg PDP-11).
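Both suggestions can be combined into one small script that reports what each method sees. A sketch only: the variable names are mine, and anything that is neither plainly little-endian nor plainly big-endian is reported as "unknown".

```perl
use strict;
use warnings;
use Config qw( %Config );

# Method 1: the byteorder string recorded when perl was built,
# e.g. "1234" or "12345678" (little) versus "4321" or "87654321" (big).
my $order = $Config{byteorder};
my $from_config = $order =~ /\A1/    ? 'little'
                : $order =~ /\A[48]/ ? 'big'
                :                      'unknown';

# Method 2: compare a native 32-bit pack with the fixed-order templates.
my $native = pack 'L', 1;
my $from_pack = $native eq pack('N', 1) ? 'big'
              : $native eq pack('V', 1) ? 'little'
              :                           'unknown';

print "Config says $from_config-endian, pack says $from_pack-endian\n";
```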

How can I sprintf a big number in Perl?

On a Windows 32-bit platform I have to read some numbers that, unexpectedly, can have values as big as 99,999,999,999, but no more. Trying to sprintf("%011d", $myNum) them overflows and outputs -2147483648.
I cannot use the BigInt module because that would require deep changes to the code. I cannot handle the value as a string with sprintf("%011s", $numero), because the minus sign is handled incorrectly.
How can I manage this? Could pack/unpack be of some help?
Try formatting it as a float with no fraction part:
$ perl -v
This is perl, v5.6.1 built for sun4-solaris
...
$ perl -e 'printf "%011d\n", 99999999999'
-0000000001
$ perl -e 'printf "%011.0f\n", 99999999999'
99999999999
Yes, one of Perl's numeric blind spots is formatting; Perl automatically handles representing numbers as integers or floats pretty well, but then coerces them into one or the other when the printf numeric formats are used, even when that isn't appropriate. And printf doesn't really handle BigInts at all (except by treating them as strings and converting that to a number, with loss of precision).
Using %s instead of %d with any number you aren't sure will be in an appropriate range is a good workaround, except as you note for negative numbers. To handle those, you are going to have to write some Perl code.
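One way that Perl code might look, as a sketch: treat the value as a string of digits, peel off the sign, and zero-pad by hand. pad_signed is a hypothetical name, and it assumes the value stringifies as plain digits with no exponent.

```perl
use strict;
use warnings;

# Hypothetical helper: zero-pad an integer to $width characters, sign included.
sub pad_signed {
    my ($num, $width) = @_;
    my ($sign, $digits) = $num =~ /\A(-?)(\d+)\z/
        or die "not an integer: $num\n";
    my $pad = $width - length($sign) - length($digits);
    $pad = 0 if $pad < 0;
    return $sign . ('0' x $pad) . $digits;
}

print pad_signed('99999999999', 11), "\n";    # 99999999999
print pad_signed(-42, 11), "\n";              # -0000000042
```

Because the digits are never passed through %d, the width of the native integer type does not matter here.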
Floats can work, up to a point.
perl -e "printf qq{%.0f\n}, 999999999999999"
999999999999999
But only up to a point
perl -e "printf qq{%.0f\n}, 9999999999999999999999999999999999999999999999"
9999999999999998663747590131240811450955988992
Bignum doesn't help here.
perl -e "use bignum ; printf qq{%.0f\n}, 9999999999999999999999999999999999999999999999"
9999999999999999931398190359470212947659194368
The problem is printf. (Do you really need printf?)
Could print work?
perl -e "use bignum;print 9999999999999999999999999999999999999999999999"
9999999999999999999999999999999999999999999999
Having said all of that, the nice thing about perl is it's always an option to roll your own.
e.g.
my $in = ...;
my $out = "";
while ($in) {
    my $chunk = $in & 0xf;
    $in >>= 4;
    $out = sprintf("%x", $chunk) . $out;
}
print "0x$out\n";
I'm no Perl expert, and maybe I'm missing some sort of automatic handling of bignums here, but isn't this simply a case of integer overflow? A 32-bit integer can't hold numbers that are as big as 99,999,999,999.
Anyway, I get the same result with Perl v5.8.8 on my 32-bit Linux machine, and it seems that printf with "%d" doesn't handle larger numbers.
I think your copy of Perl must be broken; this is from Cygwin's version (5.10):
pax$ perl -e 'printf("%011d\n", 99999999999);'
99999999999
pax$ perl -v
This is perl, v5.10.0 built for cygwin-thread-multi-64int
(with 6 registered patches, see perl -V for more detail)
Copyright 1987-2007, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
What version are you running (output of perl -v)?
You may have to get a 64-bit enabled version of Perl [and possibly a new 64-bit production machine] (note the "cygwin-thread-multi-64int" in my output). That will at least avoid the need for changing the code.
I'm stating this on the basis that you don't want to change the code greatly (i.e., you fear breaking things). The solution of new hardware, whilst a little expensive, will almost certainly not require you to change the software at all. It depends on your priorities.
Another possibility is that Perl itself may be storing the number correctly but just displaying it wrong due to a printf() foible. In that case, you may want to try:
$million = 1000000;
$bignum = 99999999999;
$firstbit = int($bignum / $million);
$secondbit = $bignum - $firstbit * $million;
printf ("%d%06d\n",$firstbit,$secondbit);
Put that in a function and call the function to return a string, such as:
sub big_honkin_number($) {
    my $million = 1_000_000;
    my $bignum = shift;
    my $firstbit = int($bignum / $million);
    my $secondbit = $bignum - $firstbit * $million;
    return sprintf("%d%06d\n", $firstbit, $secondbit);
}
printf ("%s", big_honkin_number (99_999_999_999));
Note that I tested this only on the 64-bit platform - you'll need to do your own test on 32-bit, but you can use whatever scaling factor you want (including more than two segments if need be).
Update: That big_honkin_number() trick works fine on a 32-bit Perl so it looks like it is just the printf() functions that are stuffing you up:
pax@pax-desktop:~$ perl -v
This is perl, v5.8.8 built for i486-linux-gnu-thread-multi
Copyright 1987-2006, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
pax@pax-desktop:~$ perl qq.pl
99999999999