How to unpack (64-bit) unsigned long in 64-bit Perl? - perl

I'm trying to unpack an unsigned long value that is passed from a C program to a Perl script via SysV::IPC.
It is known that the value is correct (I made a test which sends the same value into two queues, one read by Perl and the other by the C application), and all preceding values are read correctly (I used q instead of i! to work with 64-bit integers).
It is also known that PHP had a similar issue in its bug tracker (search for "unsigned long on 64 bit machines"); this question seems related:
Pack / unpack a 64-bit int on 64-bit architecture in PHP
Arguments tested so far:
..Q ( = some value that is larger than expected)
..L ( = 0)
..L! ( = large value)
..l ( = 0)
..l! ( = large value)
..lN! ( = 0)
..N, ..N! ( = 0)
use bigint; use bignum; -- no effect.
Details:
sizeof(unsigned long) = 8;
Data::Dumper->new([$thatstring])->Useqq(1)->Dump(); shows a lot of null bytes along with some meaningful ones.
byteorder='12345678';
Solution:
- x4Q, skipping four bytes of padding before the value.

Unpacking using Q in the template works out of the box if you have 64-bit Perl:
The TEMPLATE is a sequence of characters that give the order
and type of values, as follows:
...
q A signed quad (64-bit) value.
Q An unsigned quad value.
(Quads are available only if your system supports 64-bit
integer values _and_ if Perl has been compiled to support those.
Causes a fatal error otherwise.)
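For example, a quick sanity check (a minimal sketch, not from the original answer) that Q round-trips a 64-bit value on a Perl built with 64-bit integer support:
use strict;
use warnings;
# Pack a value larger than 32 bits and read it back with Q (unsigned quad)
my $packed = pack 'Q', 2**40 + 123;
my ($n) = unpack 'Q', $packed;
print "$n\n";   # prints 1099511627899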
For a more robust solution, unpack the value into an 8-byte string and use the Math::Int64 module to convert it to an integer:
use Math::Int64 qw( :native_if_available int64 );
...
$string_value = unpack("a8", $longint_from_the_C_program);  # lowercase "a" keeps the raw bytes; "A" would strip trailing NULs
# one of these two functions will work, depending on your system's endianness
$int_value = Math::Int64::native_to_int64($string_value);
$int_value = Math::Int64::net_to_int64($string_value);

The solution was simple: add x4Q to skip four bytes of padding before the actual value; I need to think more carefully about padding/alignment.
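For context, a minimal sketch of the kind of layout that makes x4Q necessary (the C struct below is hypothetical, not the OP's actual message format): a 4-byte int followed by an 8-byte unsigned long leaves 4 bytes of alignment padding between them, which the unpack template has to skip.
use strict;
use warnings;
# Hypothetical C payload:
#   struct payload {
#       int           id;     /* 4 bytes                                   */
#       /* 4 bytes of alignment padding inserted by the compiler           */
#       unsigned long value;  /* 8 bytes                                   */
#   };
my $buf = pack 'l x4 Q', 42, 2**33 + 7;      # simulate what the C side would send
my ($id, $value) = unpack 'l x4 Q', $buf;    # x4 skips the padding before the quad
print "$id $value\n";                        # 42 8589934599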

Related

Perl variable assignment side effects

I'll be the first to admit that Perl is not my strong suit. But today I ran across this bit of code:
my $scaledWidth = int($width1x * $scalingFactor);
my $scaledHeight = int($height1x * $scalingFactor);
my $scaledSrc = $Media->prependStyleCodes($src, 'SX' . $scaledWidth);
# String concatenation makes this variable into a
# string, so we need to make it an integer again.
$scaledWidth = 0 + $scaledWidth;
I could be missing something obvious here, but I don't see anything in that code that could make $scaledWidth turn into a string. Unless somehow the concatenation in the third line causes Perl to permanently change the type of $scaledWidth. That seems ... wonky.
I searched a bit for "perl assignment side effects" and similar terms, and didn't come up with anything.
Can any of you Perl gurus tell me if that commented line of code actually does anything useful? Does using an integer variable in a concatenation expression really change the type of that variable?
It is only a little bit useful.
Perl can store a scalar value as a number or a string or both, depending on what it needs.
use Devel::Peek;
Dump($x = 42);
Dump($x = "42");
Outputs:
SV = PVIV(0x139a808) at 0x178a0b8
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 42
PV = 0x178d9e0 "0"\0
CUR = 1
LEN = 16
SV = PVIV(0x139a808) at 0x178a0b8
REFCNT = 1
FLAGS = (POK,pPOK)
IV = 42
PV = 0x178d9e0 "42"\0
CUR = 2
LEN = 16
The IV and IOK tokens refer to how the value is stored as a number and whether the current integer representation is valid, while PV and POK indicate the string representation and whether it is valid. Using a numeric scalar in a string context can change the internal representation.
use Devel::Peek;
$x = 42;
Dump($x);
$y = "X" . $x;
Dump($x);
SV = IV(0x17969d0) at 0x17969e0
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 42
SV = PVIV(0x139aaa8) at 0x17969e0
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 42
PV = 0x162fc00 "42"\0
CUR = 2
LEN = 16
Perl will seamlessly convert one to the other as needed, and there is rarely a need for the Perl programmer to worry about the internal representation.
I say rarely because there are some known situations where the internal representation matters.
Perl variables are not typed. Any scalar can be either a number or a string depending how you use it. There are a few exceptions where an operation is dependent on whether a value seems more like a number or string, but most of them have been either deprecated or considered bad ideas. The big exception is when these values must be serialized to a format that explicitly stores numbers and strings differently (commonly JSON), so you need to know which it is "supposed" to be.
The internal details are that a SV (scalar value) contains any of the values that have been relevant to its usage during its lifetime. So your $scaledWidth first contains only an IV (integer value) as the result of the int function. When it is concatenated, that uses it as a string, so it generates a PV (pointer value, used for strings). That variable contains both, it is not one type or the other. So when something like JSON encoders need to determine whether it's supposed to be a number or a string, they see both in the internal state.
There have been three strategies that JSON encoders have taken to resolve this situation. Originally, JSON::PP and JSON::XS would simply consider it a string if it contains a PV, or in other words, if it's ever been used as a string; and as a number if it only has an IV or NV (double). As you alluded to, this leads to an inordinate number of false positives.
Cpanel::JSON::XS, a fork of JSON::XS that fixes a large number of issues, along with more recent versions of JSON::PP, use a different heuristic. Essentially, a value will still be considered a number if it has a PV but the PV matches the IV or NV it contains. This, of course, still results in false positives (example: you have the string '5', and use it in a numerical operation), but in practice it is much more often what you want.
The third strategy is the most useful if you need to be sure what types you have: be explicit. You can do this by reassigning every value to explicitly be a number or string as in the code you found. This assigns a new SV to $scaledWidth that contains only an IV (the result of the addition operation), so there is no ambiguity. Another method of being explicit is using an encoding method that allows specifying the types you want, like Cpanel::JSON::XS::Type.
The details of course vary if you're not talking about the JSON format, but that is where this issue has been most deliberated. This distinction is invisible in most Perl code, where the operation, not the values, determines the type.
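To make this concrete, here is a small sketch of the 0 + idiom from the question (the variable names are made up, and the first line of output depends on your JSON::PP version and its number/string heuristic):
use strict;
use warnings;
use JSON::PP ();
my $width = int(10 * 1.5);     # the SV holds only an IV (15)
my $src   = 'img' . $width;    # concatenation also caches a PV ("15") in $width
print JSON::PP->new->encode({ w => $width }), "\n";   # {"w":"15"} or {"w":15}, depending on version
$width = 0 + $width;           # fresh numeric value, no string slot
print JSON::PP->new->encode({ w => $width }), "\n";   # {"w":15}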

Boolean size in Ada

In my Ada project I have 2 different libraries with base types. I found two different definitions for a boolean:
Library A :
type Bool_Type is new Boolean;
Library B :
type T_BOOL8 is new Boolean;
for T_BOOL8'Size use 8;
So I have a question: what is the size used for Bool_Type?
Bool_Type will inherit the 'Size of Boolean, which is required to be 1,
see RM 13.3(49)
Compile with switch -gnatR2 to see its representation clause. For example:
main.adb
with Ada.Text_IO; use Ada.Text_IO;
procedure Main is
type Bool_Type is new Boolean;
type T_BOOL8 is new Boolean;
for T_BOOL8'Size use 8;
begin
Put_Line ("Bool_Type'Object_Size = " & Integer'Image (Bool_Type'Object_Size));
Put_Line ("Bool_Type'Value_Size = " & Integer'Image (Bool_Type'Value_Size));
Put_Line ("Bool_Type'Size = " & Integer'Image (Bool_Type'Size));
New_Line;
Put_Line ("T_BOOL8'Object_Size = " & Integer'Image (T_BOOL8'Object_Size));
Put_Line ("T_BOOL8'Value_Size = " & Integer'Image (T_BOOL8'Value_Size));
Put_Line ("T_BOOL8'Size = " & Integer'Image (T_BOOL8'Size));
New_Line;
end Main;
compiler output (partial):
Representation information for unit Main (body)
-----------------------------------------------
for Bool_Type'Object_Size use 8;
for Bool_Type'Value_Size use 1;
for Bool_Type'Alignment use 1;
for T_Bool8'Size use 8;
for T_Bool8'Alignment use 1;
program output
Bool_Type'Object_Size = 8
Bool_Type'Value_Size = 1
Bool_Type'Size = 1
T_BOOL8'Object_Size = 8
T_BOOL8'Value_Size = 8
T_BOOL8'Size = 8
As can be seen, the number returned by the 'Size / 'Value_Size attribute for Bool_Type is equal to 1 (as required by the RM; see egilhh's answer). The attribute 'Size / 'Value_Size states the number of bits used to represent a value of the type. The 'Object_Size attribute, on the other hand, equals 8 bits (1 byte) and states the amount of bits used to store a value of the given type in memory (see Simon Wright's comment). See here and here for details.
Note that the number of bits indicated by 'Size / 'Value_Size must be sufficient to uniquely represent all possible values within the (discrete) type. For Boolean derived types, at least 1 bit is required, for an enumeration type with 3 values, for example, you need at least 2 bits.
An effect of explicitly setting the 'Size / 'Value_Size attribute can be observed when defining a packed array (as mentioned in G_Zeus’ answer):
type Bool_Array_Type is
array (Natural range 0 .. 7) of Bool_Type with Pack;
type T_BOOL8_ARRAY is
array (Natural range 0 .. 7) of T_BOOL8 with Pack;
compiler output (partial):
Representation information for unit Main (body)
-------------------------------------------------
[...]
for Bool_Array_Type'Size use 8;
for Bool_Array_Type'Alignment use 1;
for Bool_Array_Type'Component_Size use 1;
[...]
for T_Bool8_Array'Size use 64;
for T_Bool8_Array'Alignment use 1;
for T_Bool8_Array'Component_Size use 8;
Because the number of bits used to represent a value of type T_BOOL8 is forced to be 8, the size of a single component of a packed array of T_BOOL8s will also be 8, and the total size of T_BOOL8_ARRAY will be 64 bits (8 bytes). Compare this to the total length of 8 bits (1 byte) for Bool_Array_Type.
You should find your answer (or enough information to find the answer to your specific question) in the Ada wikibooks entry for 'Size attribute.
Most likely Bool_Type has the same size as Boolean, or 1 bit for the type (meaning you can pack Bool_Type elements in an array, for example) and 8 bits for instances (rounded up to a full byte).
Whatever size the compiler wants, unless you override it as in library B. Probably 8 bits, but on some 32-bit RISC targets, 32 bits may be faster than 8. On a tiny microcontroller, 1 bit may save space.
The other answers let you find out for the specific target you compile for.
As your booleans are separate types, you need type conversions between them, providing hooks for the compiler to handle any format or size conversion without any further ado.

Unable to pass packed array of handles to Win32::API call in Perl

I am trying to call a Win32 API function from Perl using Win32::API, and pass it an array of handles. The particular function is WaitForMultipleObjects and it doesn't like the way I feed parameters to it. Here's how it's defined in Perl:
# DWORD WaitForMultipleObjects(DWORD nCount, HANDLE* handles, BOOL, DWORD)
$WaitForMultipleObjects = new Win32::API::More('kernel32',
'WaitForMultipleObjects', 'NPNN', 'N');
Then there's array of handles. All of them are confirmed valid and they all work when passed individually to WaitForSingleObject.
Here's how I pack the parameters:
my @handles;
...
my $n = scalar(@handles);
my $handlePack = pack "L*", @handles; # also tried 'L1', 'L2', etc.
$rc = $WaitForMultipleObjects->Call($n, $handlePack, 0, 0xffffffff); # fails
This fails and GetLastError() reports error 6 (The handle is invalid).
However, if I pass only one handle, it works:
my $handlePack = pack "L", $handles[0];
$rc = $WaitForMultipleObjects->Call(1, $handlePack, 0, 0xffffffff); # works
Obviously Win32::API is not able to pass the array of handles correctly in the second parameter, but as far as I understand the documentation (https://metacpan.org/pod/Win32::API), that's how it should be done. Or is my usage of pack() wrong? I am on 64-bit Perl, if that matters.
The problem is 64 bits. On 64-bit Windows (and in 64-bit Perl), sizeof(HANDLE) = 8 bytes. So if the program runs in 64-bit Perl, it loads 64-bit DLLs, and you have to pack handles using Q (i.e. 64-bit integers). Using L won't work because it packs 32-bit ints. The following fixes the problem:
use Config qw( %Config );
my $ptr_size = $Config{ptrsize};
my $ptr_format =
$ptr_size == 4 ? "L" :
$ptr_size == 8 ? "Q" :
die("Unsupported pointer size $ptr_size\n");
my $handlePack = pack $ptr_format."*", @handles;
$rc = $WaitForMultipleObjects->Call($n, $handlePack, 0, 0xffffffff);
Note that even in 64-bit Perl, pack('I') can still produce 32 bits (depending on the compiler). pack('J') (Perl's internal int) is also unsuitable because while it's at least as large as a pointer, it could be larger (e.g. a 32-bit Perl built using -Duse64bitint).
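A quick way to see why L truncates HANDLEs while Q does not is to print the relevant sizes from %Config. This is a small sketch; the values in the comments are what a typical 64-bit Windows (LLP64) Perl reports, not output from the system in the question:
use strict;
use warnings;
use Config qw( %Config );
# Typical 64-bit Windows Perl: ptrsize=8, ivsize=8, longsize=4,
# so a HANDLE (a pointer-sized value) needs Q, not L.
printf "ptrsize=%d ivsize=%d longsize=%d\n",
    $Config{ptrsize}, $Config{ivsize}, $Config{longsize};
# L packs 4 bytes per element, Q packs 8
printf "pack 'L*' -> %d bytes, pack 'Q*' -> %d bytes\n",
    length(pack 'L*', 1, 2, 3), length(pack 'Q*', 1, 2, 3);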

How to use Perl's 'Digest' module to calculate CRC?

I need to implement CRC-32 (with a custom polynomial) in Perl. I have seen that there is a module called Digest::CRC that does it. However, when I compare the result to an online calculator, I don't get the same CRC code.
My polynomial is "101101" (bin) or "2d" (hex)
My data is "1e5"
The online calculator is https://ghsi.de/CRC/index.php?Polynom=101101&Message=1e5. The result that I get from the calculator is "1010" (bin) or "A" (hex).
This is the Perl code that I have used (found somewhere online):
use strict;
use warnings;
use Digest::CRC;
my $string = 0x01e5;
my $ctx = Digest::CRC->new(type => "crc32", poly => 0x2D);
$ctx->add($string);
print "CRC for '$string' is 0x" . $ctx->hexdigest . "\n";
This is the output of this Perl code:
CRC for '485' is 0x9d0fec86
I'm pretty sure that the online calculator is correct.
What is wrong with my Perl code?
Your program is, as it says, calculating the CRC for the string 485 (bytes 34 38 35), which is the decimal string representation for the number 0x1E5. Meanwhile the web site is calculating the CRC for the bytes 01 e5. I can't tell which one, if either, you want.
What is definitely true is that the web site isn't calculating any sort of CRC32, because its results aren't 32 bits long and seem to depend on the size of the polynomial you specify.
Also, if you use Digest::CRC specifying type => 'crc32' it will ignore all the other parameters and simply calculate a standard CRC32.
If you want a 32-bit CRC with a polynomial of 0x2D then you can try
my $ctx = Digest::CRC->new(width => 32, poly => 0x2D);
but there are several other things you need to define to specify a CRC, including (but not limited to) bit and byte order, initial value and end xor value, and there is no way of telling whether this will give you the correct checksum without seeing the full specification.
Surely you have a document that says something more than "CRC32, polynomial 0x2d"?
Update
How can I use the Digest::CRC to treat the data as hex bytes and not as a string?
Digest::CRC only processes strings and you need to pack your data that way. In this case you probably want my $string = "\x01\xe5"
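For example (a tiny sketch added for illustration, not from the original exchange), the difference between the number 0x01e5 and the two raw bytes 01 e5:
my $bytes = "\x01\xe5";                 # the two bytes 01 e5
printf "%d byte(s): %s\n", length($bytes), unpack('H*', $bytes);   # 2 byte(s): 01e5
my $number = 0x01e5;                    # just the number 485
print "$number\n";                      # 485, which stringifies to the three ASCII bytes "485"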
In addition, what is the "end xor value"?
The end xor value is simply a bit pattern that is XORed with the result as the last step to get the final CRC.
In addition, if I understood you correctly, the following 2 methods should give the same result:
my $ctx1 = Digest::CRC->new(type => "crc32");
my $rr1 = $ctx1->add(pack 'H*', '1e5')->hexdigest;
print "a1=$rr1=\n";
my $ctx2 = Digest::CRC->new(width => 32, poly => 0x04c11db7);
my $rr2 = $ctx2->add(pack 'H*', '1e5')->hexdigest;
print "a2=$rr2=\n";
However I get different results:
a1=fef37cd4= a2=758cce0=
Can you tell me where my mistake is?
As I said, there are many specifiers for a CRC. That is why you must establish the full specification of the CRC that you need, including more than just the width and the polynomial. To explicitly produce a CRC32 checksum you would need this
my $ctx = Digest::CRC->new(width => 32, poly => 0x04c11db7, init => 0xFFFFFFFF, xorout => 0xFFFFFFFF, refin => 1, refout => 1);
This applies initial and final values of 0xFFFFFFFF and sets refin and refout to true. This reverses the bit order (ref is short for reflect) both before and after processing, and is the difference between MSB first and LSB first.
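As a sanity check (a sketch added here, not part of the original answer), that explicit parameter set should reproduce the built-in crc32 type; both should yield cbf43926, the standard CRC-32 check value for the ASCII string '123456789':
use strict;
use warnings;
use Digest::CRC;
# Built-in CRC-32
my $std = Digest::CRC->new(type => 'crc32');
$std->add('123456789');
printf "type => crc32   : %s\n", $std->hexdigest;       # cbf43926
# The same CRC-32 spelled out explicitly
my $explicit = Digest::CRC->new(
    width  => 32,
    poly   => 0x04C11DB7,
    init   => 0xFFFFFFFF,
    xorout => 0xFFFFFFFF,
    refin  => 1,
    refout => 1,
);
$explicit->add('123456789');
printf "explicit params : %s\n", $explicit->hexdigest;  # should match the line above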

Objective-C : Fowler–Noll–Vo (FNV) Hash implementation

I have an HTTP connector in my iPhone project and queries must have a parameter set from the username using the Fowler–Noll–Vo (FNV) hash.
I have a Java implementation working at this time; this is the code:
long fnv_prime = 0x811C9DC5;
long hash = 0;
for(int i = 0; i < str.length(); i++)
{
hash *= fnv_prime;
hash ^= str.charAt(i);
}
Now on the iPhone side, I did this :
int64_t fnv_prime = 0x811C9DC5;
int64_t hash = 0;
for (int i=0; i < [myString length]; i++)
{
hash *= fnv_prime;
hash ^= [myString characterAtIndex:i];
}
This script doesn't give me the same result as the Java one.
In the first loop iterations, I get this:
hash = 0
hash = 100 (first letter is "d")
hash = 1865261300 (for hash = 100 and fnv_prime = -2128831035 like in Java)
Does someone see something I'm missing?
Thanks in advance for the help !
In Java, this line:
long fnv_prime = 0x811C9DC5;
will yield in fnv_prime the numerical value -2128831035, because the constant is interpreted as an int, which is a 32-bit signed value in Java. That value is then sign-extended when written into a long.
Conversely, in the Objective-C code:
int64_t fnv_prime = 0x811C9DC5;
the 0x811C9DC5 is interpreted as an unsigned int constant (because it does not fit in a signed 32-bit int), with numerical value 2166136261. That value is then written into fnv_prime, and there is no sign to extend since, as far as the C compiler is concerned, the value is positive.
Thus you end up with distinct values for fnv_prime, which explains your distinct results.
This can be corrected in Java by adding a "L" suffix, like this:
long fnv_prime = 0x811C9DC5L;
which forces the Java compiler to interpret the constant as a long, with the same numerical value as what you get with the Objective-C code.
Incidentally, 0x811C9DC5 is not a FNV prime (it is not even prime); it is the 32 bit FNV "offset basis". You will get incorrect hash values if you use this value (and more hash collisions). The correct value for the 32 bit FNV prime is 0x1000193. See http://www.isthe.com/chongo/tech/comp/fnv/index.html
It is a difference in sign extension when assigning the 32-bit value 0x811C9DC5 to a 64-bit variable.
Are the characters in Java and Objective-C the same? NSString will give you unichars.