How to use Perl's 'Digest' module to calculate CRC?


I need to implement CRC-32 (with a custom polynomial) in Perl. I have seen that there is a module called Digest::CRC that does it. However, when I compare the result to an online calculator, I don't get the same CRC code.
My polynomial is "101101" (bin) or "2d" (hex)
My data is "1e5"
The online calculator is https://ghsi.de/CRC/index.php?Polynom=101101&Message=1e5. The result that I get from the calculator is "1010" (bin) or "A" (hex).
This is the Perl code that I have used (found somewhere online):
use strict;
use warnings;
use Digest::CRC;
my $string = 0x01e5;
my $ctx = Digest::CRC->new(type => "crc32", poly => 0x2D);
$ctx->add($string);
print "CRC for '$string' is 0x" . $ctx->hexdigest . "\n";
This is the output of this Perl code:
CRC for '485' is 0x9d0fec86
I'm pretty sure that the online calculator is correct.
What is wrong with my Perl code?

Your program is, as it says, calculating the CRC for the string 485 (bytes 34 38 35), which is the decimal string representation for the number 0x1E5. Meanwhile the web site is calculating the CRC for the bytes 01 e5. I can't tell which one, if either, you want.
What is definitely true is that the web site isn't calculating any sort of CRC32, because its results aren't 32 bits long and seem to depend on the size of the polynomial you specify.
Also, if you use Digest::CRC specifying type => 'crc32' it will ignore all the other parameters and simply calculate a standard CRC32.
If you want a 32-bit CRC with a polynomial of 0x2D then you can try
my $ctx = Digest::CRC->new(width => 32, poly => 0x2D);
but there are several other things you need to define to specify a CRC, including (but not limited to) bit and byte order, initial value and end xor value, and there is no way of telling whether this will give you the correct checksum without seeing the full specification.
Surely you have a document that says something more than "CRC32, polynomial 0x2d"?
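For what it is worth, the calculator's answer can be reproduced by plain modulo-2 long division, with a CRC width equal to the degree of the polynomial (5 here) and no initial value, reflection or final XOR. The following is only a sketch under that assumption about the web site; crc_long_division is a hypothetical helper, not part of Digest::CRC, and it treats the message as the 16-bit value 0x01E5.

```perl
use strict;
use warnings;

# Hypothetical helper: plain modulo-2 long division (assumed to be what the
# online calculator does). No initial value, no reflection, no final XOR.
#   $message      - the message as an integer
#   $message_bits - how many bits of $message to process
#   $poly         - the polynomial including its leading 1 bit
#   $width        - the degree of the polynomial (= width of the remainder)
sub crc_long_division {
    my ($message, $message_bits, $poly, $width) = @_;
    my $reg = $message << $width;    # append $width zero bits
    for my $shift (reverse 0 .. $message_bits - 1) {
        # When the current leading bit is set, XOR in the aligned divisor.
        $reg ^= $poly << $shift if $reg & (1 << ($shift + $width));
    }
    return $reg;                     # the remainder, which fits in $width bits
}

printf "%05b\n", crc_long_division(0x1E5, 16, 0b101101, 5);   # prints 01010
```

The remainder 01010 is the "1010" (hex A) that the calculator reports, which supports the point that the site is not computing a 32-bit CRC.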
Update
How can I use the Digest::CRC to treat the data as hex bytes and not as a string?
Digest::CRC only processes strings and you need to pack your data that way. In this case you probably want my $string = "\x01\xe5"
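To see which bytes actually end up in the string, it can help to dump them with unpack 'H*'. This is a small illustration using only core pack and unpack; the variable name $number is mine.

```perl
use strict;
use warnings;

my $number = 0x01e5;                            # the integer 485

# A number used as a string is stringified to its decimal digits.
print unpack('H*', "$number"), "\n";            # 343835 (the characters '4', '8', '5')

# Two explicit bytes, written as a string literal or packed from the integer.
print unpack('H*', "\x01\xe5"), "\n";           # 01e5
print unpack('H*', pack('n', $number)), "\n";   # 01e5 ('n' is 16-bit big-endian)

# pack 'H*' reads hex digits left to right and pads an odd count on the right.
print unpack('H*', pack('H*', '1e5')), "\n";    # 1e50
print unpack('H*', pack('H*', '01e5')), "\n";   # 01e5
```

Note that pack 'H*', '1e5' produces the bytes 1e 50 rather than 01 e5, so an odd number of hex digits needs a leading zero if the latter is what you mean.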
In addition, what is the "end xor value"?
The end xor value is simply a bit pattern that is XORed with the result as the last step to get the final CRC.
In addition, if I understood you correctly, the following two methods should give the same result:
my $ctx1 = Digest::CRC->new(type => "crc32");
my $rr1 = $ctx1->add(pack 'H*', '1e5')->hexdigest;
print "a1=$rr1=\n";
my $ctx2 = Digest::CRC->new(width => 32, poly => 0x04c11db7);
my $rr2 = $ctx2->add(pack 'H*', '1e5')->hexdigest;
print "a2=$rr2=\n";
However I get different results:
a1=fef37cd4=
a2=758cce0=
Can you tell me where is my mistake?
As I said, there are many specifiers for a CRC. That is why you must establish the full specification of the CRC that you need, including more than just the width and the polynomial. To explicitly produce a CRC32 checksum you would need this
my $ctx = Digest::CRC->new(width => 32, poly => 0x04c11db7, init => 0xFFFFFFFF, xorout => 0xFFFFFFFF, refin => 1, refout => 1);
This applies an initial value and a final XOR value of 0xFFFFFFFF and sets refin and refout to true (ref is short for reflect). Reflection reverses the bit order, of each input byte for refin and of the final result for refout, and is the difference between MSB first and LSB first.
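To make the effect of each parameter concrete, here is a bit-at-a-time sketch in plain Perl that takes the same kind of parameters. It is not how Digest::CRC is implemented; crc_bitwise and reflect are hypothetical helpers, and the code assumes a 64-bit Perl and a width of at least 8. The string '123456789' is the conventional test input, for which standard CRC-32 gives cbf43926.

```perl
use strict;
use warnings;

# Reverse the lowest $width bits of $value.
sub reflect {
    my ($value, $width) = @_;
    my $out = 0;
    for (1 .. $width) {
        $out = ($out << 1) | ($value & 1);
        $value >>= 1;
    }
    return $out;
}

# Hypothetical bit-at-a-time CRC driven by width/poly/init/xorout/refin/refout.
sub crc_bitwise {
    my ($data, %p) = @_;
    my $top  = 1 << ($p{width} - 1);     # the most significant bit of the register
    my $mask = ($top << 1) - 1;          # keeps the register to $p{width} bits
    my $crc  = $p{init};
    for my $byte (unpack 'C*', $data) {
        my $in = $p{refin} ? reflect($byte, 8) : $byte;
        $crc ^= $in << ($p{width} - 8);
        for (1 .. 8) {
            $crc = ($crc & $top) ? (($crc << 1) ^ $p{poly}) : ($crc << 1);
            $crc &= $mask;
        }
    }
    $crc = reflect($crc, $p{width}) if $p{refout};
    return ($crc ^ $p{xorout}) & $mask;
}

my %crc32 = (width => 32, poly => 0x04c11db7, init => 0xFFFFFFFF,
             xorout => 0xFFFFFFFF, refin => 1, refout => 1);
my %bare  = (width => 32, poly => 0x04c11db7, init => 0,
             xorout => 0, refin => 0, refout => 0);

printf "%08x\n", crc_bitwise('123456789', %crc32);   # cbf43926
printf "%08x\n", crc_bitwise('123456789', %bare);    # a different value
```

The same width and polynomial give different checksums as soon as any of the other four parameters change, which is why the full specification matters.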

Related

Perl variable assignment side effects

I'll be the first to admit that Perl is not my strong suit. But today I ran across this bit of code:
my $scaledWidth = int($width1x * $scalingFactor);
my $scaledHeight = int($height1x * $scalingFactor);
my $scaledSrc = $Media->prependStyleCodes($src, 'SX' . $scaledWidth);
# String concatenation makes this variable into a
# string, so we need to make it an integer again.
$scaledWidth = 0 + $scaledWidth;
I could be missing something obvious here, but I don't see anything in that code that could make $scaledWidth turn into a string. Unless somehow the concatenation in the third line causes Perl to permanently change the type of $scaledWidth. That seems ... wonky.
I searched a bit for "perl assignment side effects" and similar terms, and didn't come up with anything.
Can any of you Perl gurus tell me if that commented line of code actually does anything useful? Does using an integer variable in a concatenation expression really change the type of that variable?

It is only a little bit useful.
Perl can store a scalar value as a number or a string or both, depending on what it needs.
use Devel::Peek;
Dump($x = 42);
Dump($x = "42");
Outputs:
SV = PVIV(0x139a808) at 0x178a0b8
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 42
PV = 0x178d9e0 "0"\0
CUR = 1
LEN = 16
SV = PVIV(0x139a808) at 0x178a0b8
REFCNT = 1
FLAGS = (POK,pPOK)
IV = 42
PV = 0x178d9e0 "42"\0
CUR = 2
LEN = 16
The IV and IOK tokens refer to how the value is stored as a number and whether the current integer representation is valid, while PV and POK indicate the string representation and whether it is valid. Using a numeric scalar in a string context can change the internal representation.
use Devel::Peek;
$x = 42;
Dump($x);
$y = "X" . $x;
Dump($x);
SV = IV(0x17969d0) at 0x17969e0
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 42
SV = PVIV(0x139aaa8) at 0x17969e0
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 42
PV = 0x162fc00 "42"\0
CUR = 2
LEN = 16
Perl will seamlessly convert one to the other as needed, and there is rarely a need for the Perl programmer to worry about the internal representation.
I say rarely because there are some known situations where the internal representation matters.

Perl variables are not typed. Any scalar can be either a number or a string depending how you use it. There are a few exceptions where an operation is dependent on whether a value seems more like a number or string, but most of them have been either deprecated or considered bad ideas. The big exception is when these values must be serialized to a format that explicitly stores numbers and strings differently (commonly JSON), so you need to know which it is "supposed" to be.
The internal details are that a SV (scalar value) contains any of the values that have been relevant to its usage during its lifetime. So your $scaledWidth first contains only an IV (integer value) as the result of the int function. When it is concatenated, that uses it as a string, so it generates a PV (pointer value, used for strings). That variable contains both, it is not one type or the other. So when something like JSON encoders need to determine whether it's supposed to be a number or a string, they see both in the internal state.
There have been three strategies that JSON encoders have taken to resolve this situation. Originally, JSON::PP and JSON::XS would simply consider it a string if it contains a PV, or in other words, if it's ever been used as a string; and as a number if it only has an IV or NV (double). As you alluded to, this leads to an inordinate amount of false positives.
Cpanel::JSON::XS, a fork of JSON::XS that fixes a large number of issues, along with more recent versions of JSON::PP, use a different heuristic. Essentially, a value will still be considered a number if it has a PV but the PV matches the IV or NV it contains. This, of course, still results in false positives (example: you have the string '5', and use it in a numerical operation), but in practice it is much more often what you want.
The third strategy is the most useful if you need to be sure what types you have: be explicit. You can do this by reassigning every value to explicitly be a number or string as in the code you found. This assigns a new SV to $scaledWidth that contains only an IV (the result of the addition operation), so there is no ambiguity. Another method of being explicit is using an encoding method that allows specifying the types you want, like Cpanel::JSON::XS::Type.
The details of course vary if you're not talking about the JSON format, but that is where this issue has been most deliberated. This distinction is invisible in most Perl code, where the operation, not the value, determines the type.
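If you want to watch this happen without reading a full Devel::Peek dump, the core B module can report the flags on a scalar. This is a rough sketch; has_string_form is a hypothetical helper of mine, and it checks the private POK flag, which I would expect to be set once the number has been used as a string. The exact set of flags varies between Perl versions.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: does this scalar currently carry a cached string form?
sub has_string_form {
    my ($ref) = @_;
    return (B::svref_2object($ref)->FLAGS & B::SVp_POK) ? 1 : 0;
}

my $scaled_width = int(12.5 * 2);
print 'after int():         ', has_string_form(\$scaled_width), "\n";

my $code = 'SX' . $scaled_width;     # string context caches "25" in the scalar
print 'after concatenation: ', has_string_form(\$scaled_width), "\n";

$scaled_width = 0 + $scaled_width;   # reassign the numeric value only
print 'after 0 + $var:      ', has_string_form(\$scaled_width), "\n";
```

The reassignment in the last step is the same trick as the commented line in the question: it leaves the variable holding a number with no string form attached.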

What is the point of writing integer in hexadecimal, octal and binary?

I am well aware that one is able to assign a value to a variable or constant in Swift and have that value written in different formats.
For Integer: one can write the literal in decimal, binary, octal or hexadecimal.
For Float or Double: one can write the literal in either decimal or hexadecimal, and can make use of an exponent too.
For instance:
var decInt = 17
var binInt = 0b10001
var octInt = 0o21
var hexInt = 0x11
All of the above variables give the same result, which is 17.
But what's the catch? Why bother using those other than decimal?

There are some notations that can be way easier to understand for people even if the result in the end is the same. You can for example think in cases like colour notation (hexadecimal) or file permission notation (octal).
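The same idea carries over to other languages. As a small illustration in Perl rather than Swift (Perl spells octal with a leading 0), here is a colour and a permission mask where the non-decimal notation lines up with the structure of the value. The variable names are made up for the example.

```perl
use strict;
use warnings;

# The same value written four ways.
my @seventeen = (17, 0b10001, 021, 0x11);
print "@seventeen\n";                 # 17 17 17 17

# Hexadecimal lines up with the bytes of a colour: two digits per channel.
my $orange = 0xFF8800;
my ($red, $green, $blue) = map { ($orange >> $_) & 0xFF } (16, 8, 0);
print "$red $green $blue\n";          # 255 136 0

# Octal lines up with Unix permission bits: one digit each for owner, group, other.
my $mode = 0644;
print "owner may write\n" if $mode & 0200;
```

Written in decimal, the colour would be 16746496 and the mode 420, and neither tells the reader much.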

Code is best written in the most meaningful way.
Using the number format that best matches the domain of your program is just one example. You don't want to obscure domain-specific details, and you want to minimize the mental effort for the reader of your code.
Two other examples:
Do not simplify calculations. For example: To convert a scaled integer value in 1/10000 arc minutes to a floating point in degrees, do not write the conversion factor as 600000.0, but instead write 10000.0 * 60.0.
Choose a code structure that matches the nature of your data. For example: If you have a function with two return values, determine if it's a symmetrical or asymmetrical situation. For a symmetrical situation always write a full if (condition) { return A; } else { return B; }. It's a common mistake to write if (condition) { return A; } return B; (simply because 'it works').
Meaning matters!

Perl booleans, negation (and how to explain it)?

I'm new here. After reading through how to ask and format, I hope this will be an OK question. I'm not very skilled in Perl, but it is the programming language that I know best.
I am trying to apply Perl to real life, but I haven't met with much understanding, especially not from my wife. I tell her that:
if she doesn't bring me 3 beers in the evening, that means I get zero (or nothing) beers.
As you probably guessed, without much success. :(
Now factually. From perlop:
Unary "!" performs logical negation, that is, "not".
Languages that have boolean types (which can have only two "values") are OK:
if it is not the one value -> must be the another one.
so naturally:
!true -> false
!false -> true
But Perl doesn't have boolean variables; it has only a truth system, where everything that is not 0, '0', undef or '' is TRUE. The problem comes when applying logical negation to a non-logical value, e.g. a number.
E.g. if some number IS NOT 3, that means it IS ZERO or empty, instead of the real-life meaning, where if something is NOT 3, it can be anything but 3 (zero included).
So the next code:
use 5.014;
use Strictures;
my $not_3beers = !3;
say defined($not_3beers) ? "defined, value>$not_3beers<" : "undefined";
say $not_3beers ? "TRUE" : "FALSE";
my $not_4beers = !4;
printf qq{What is not 3 nor 4 mean: They're same value: %d!\n}, $not_3beers if( $not_3beers == $not_4beers );
say qq(What is not 3 nor 4 mean: @{[ $not_3beers ? "some bears" : "no bears" ]}!) if( $not_3beers eq $not_4beers );
say ' $not_3beers>', $not_3beers, "<";
say '-$not_3beers>', -$not_3beers, "<";
say '+$not_3beers>', -$not_3beers, "<";
prints:
defined, value><
FALSE
What is not 3 nor 4 mean: They're same value: 0!
What is not 3 nor 4 mean: no bears!
$not_3beers><
-$not_3beers>0<
+$not_3beers>0<
Moreover:
perl -E 'say !!4'
so not not 4 IS 1, instead of 4!
The above statements to my wife are "false" (meaning 0) :), but I really am trying to teach my son Perl, and after a while he asked my wife: why, if something is not 3, does that mean it is 0?
So the questions are:
how to explain this to my son
why perl has this design, so why !0 is everytime 1
Is here something "behind" what requires than !0 is not any random number, but 0.
as I already said, I don't know well other languages - in every language is !3 == 0?

I think you are focusing too much on negation and too little on what Perl booleans mean.
Historical/Implementation Perspective
What is truth? The detection of a voltage higher than x volts.
On a higher abstraction level: If this bit here is set.
The abstraction of a sequence of bits can be considered an integer. Is this integer false? Yes, if no bit is set, i.e. the integer is zero.
A hardware-oriented language will likely use this definition of truth, e.g. C, and all C descendants including Perl.
The negation of 0 could be bitwise negation—all bits are flipped to 1—, or we just set the last bit to 1. The results would usually be decoded as integers -1 and 1 respectively, but the latter is more energy efficient.
Pragmatic Perspective
It is convenient to think of all numbers but zero as true when we deal with counts:
my $wordcount = ...;
if ($wordcount) {
say "We found $wordcount words";
} else {
say "There were no words";
}
or
say "The array is empty" unless @array; # notice scalar context
A pragmatic language like Perl will likely consider zero to be false.
Mathematical Perspective
There is no reason for any number to be false, every number is a well-defined entity. Truth or falseness emerges solely through predicates, expressions which can be true or false. Only this truth value can be negated. E.g.
¬(x ≤ y) where x = 2, y = 3
is false. Many languages which have a strong foundation in maths won't consider anything false but a special false value. In Lisps, '() or nil is usually false, but 0 will usually be true. That is, a value is only true if it is not nil!
In such mathematical languages, !3 == 0 is likely a type error.
Re: Beers
Beers are good. Any number of beers are good, as long as you have one:
my $beers = ...;
if (not $beers) {
say "Another one!";
} else {
say "Aaah, this is good.";
}
Boolification of a beer-counting variable just tells us if you have any beers. Consider !! to be a boolification operator:
my $enough_beer = !! $beers;
The boolification doesn't concern itself with the exact amount. But maybe any number ≥ 3 is good. Then:
my $enough_beer = ($beers >= 3);
The negation is not enough beer:
my $not_enough_beer = not($beers >= 3);
or
my $not_enough_beer = not $beers;
fetch_beer() if $not_enough_beer;
Sets
A Perl scalar does not symbolize a whole universe of things. Especially, not 3 is not the set of all entities that are not three. Is the expression 3 a truthy value? Yes. Therefore, not 3 is a falsey value.
The suggested behaviour of 4 == not 3 to be true is likely undesirable: 4 and “all things that are not three” are not equal, the four is just one of many things that are not three. We should write it correctly:
4 != 3 # four is not equal to three
or
not( 4 == 3 ) # the same
It might help to think of ! and not as logical-negation-of, but not as except.
How to teach
It might be worth introducing mathematical predicates: expressions which can be true or false. If we only ever “create” truthness by explicit tests, e.g. length($str) > 0, then your issues don't arise. We can name the results: my $predicate = (1 < 2), but we can decide to never print them out, instead: print $predicate ? "True" : "False". This sidesteps the problem of considering special representations of true or false.
Considering values to be true/false directly would then only be a shortcut, e.g. foo if $x can be considered to be roughly a shortcut for
foo if defined $x and length($x) > 0 and $x != 0;
Perl is all about shortcuts.
Teaching these shortcuts, and the various contexts of perl and where they turn up (numeric/string/boolean operators) could be helpful.
List Context
Even-sized List Context
Scalar Context
Numeric Context
String Context
Boolean Context
Void Context
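As a quick illustration of several of these contexts in one place, the sketch below uses the same array and the same count in different ways. The variable names are invented for the example.

```perl
use strict;
use warnings;

my @words = qw(alpha beta gamma);

my ($first) = @words;                      # list context: the first element
my $count   = @words;                      # scalar context: the number of elements
my $total   = $count + 1;                  # numeric context
my $label   = 'count=' . $count;           # string context
my $status  = @words ? 'some' : 'none';    # boolean context

print "$first $count $total $label $status\n";   # alpha 3 4 count=3 some
```

In each line it is the operator or the left-hand side that decides how the value is read, not the value itself.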

as I already said, I don't know well other languages - in every language is !3 == 0?
Yes. In C (and thus C++), it's the same.
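#include <stdio.h>

int main(void) {
    int i = 3;
    int n = !i;    /* logical negation of a non-zero value is 0 */
    int nn = !n;   /* logical negation of 0 is 1 */
    printf("!3=%i ; !!3=%i\n", n, nn);
    return 0;
}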
Prints (see http://codepad.org/vOkOWcbU )
!3=0 ; !!3=1
how to explain this to my son
Very simple. !3 means "opposite of some non-false value, which is of course false". This is called "context" - in a Boolean context imposed by negation operator, "3" is NOT a number, it's a statement of true/false.
The result is also not a "zero" but merely something that's convenient Perl representation of false - which turns into a zero if used in a numeric context (but an empty string if used in a string context - see the difference between 0 + !3 and !3 . "a")
The Boolean context is just a special kind of scalar context where no conversion to a string or a number is ever performed. (perldoc perldata)
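A few lines of Perl make the difference between the two contexts visible. This is only an illustration; the variable names are mine.

```perl
use strict;
use warnings;

my $not_three = !3;

print 'numeric: ', 0 + $not_three, "\n";       # numeric: 0
print 'string:  [', $not_three . '', "]\n";    # string:  []
print 'double negation: ', !!3, "\n";          # double negation: 1

# "Is not three" is a comparison, not a negation of the number itself.
my $beers = 4;
print "four is not three\n" if $beers != 3;
```

The false value behaves as 0 when used as a number and as the empty string when used as a string, and the last line shows the comparison that the real-life sentence actually calls for.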
why perl has this design, so why !0 is everytime 1
See above. Among other likely reasons (though I don't know if that was Larry's main reason), C has the same logic and Perl took a lot of its syntax and ideas from C.
For a VERY good underlying technical detail, see the answers here: " What do Perl functions that return Boolean actually return " and here: " Why does Perl use the empty string to represent the boolean false value? "
Is here something "behind" what requires than !0 is not any random number, but 0.
Nothing aside from simplicity of implementation. It's easier to produce a "1" than a random number.
if you're asking a different question of "why is it 1 instead of the original # that was negated to get 0", the answer to that is simple - by the time Perl interpreter gets to negate that zero, it no longer knows/remembers that zero was a result of "!3" as opposed to some other expression that resulted in a value of zero/false.

If you want to test that a number is not 3, then use this:
my_variable != 3;
Using the syntax !3, since ! is a boolean operator, first converts 3 into a boolean (even though perl may not have an official boolean type, it still works this way), which, since it is non-zero, means it gets converted to the equivalent of true. Then, !true yields false, which, when converted back to an integer context, gives 0. Continuing with that logic shows how !!3 converts 3 to true, which then is inverted to false, inverted again back to true, and if this value is used in an integer context, gets converted to 1. This is true of most modern programming languages (although maybe not some of the more logic-centered ones), although the exact syntax may vary some depending on the language...

Logically negating a false value requires some value be chosen to represent the resulting true value. "1" is as good a choice as any. I would say it is not important which value is returned (or conversely, it is important that you not rely on any particular true value being returned).

What is the best way to store 1key - 3 value in Perl?

I have a situation where I have 3 different values for each key. I have to print the data like this:
K1 V1 V2 V3
K2 V1 V2 V3
…
Kn V1 V2 V3
Is there any alternative that is more efficient and easier than those listed below? I am thinking of 2 approaches:
Maintain 3 hashes for 3 different values for each key.
Iterate through one hash based on the key and get the values from other 2 hashes
and print it.
Hash 1 - K1-->V1 ...
Hash 2 - K1-->V2 ...
Hash 3 - K1-->V3 ...
Maintain a single hash with key to reference to array of values.
Here I need to iterate and read only 1 hash.
K1 --> Ref{V1,V2,V3}
EDIT1:
The main challenge is that, the values V1, V2, V3 are derived at different places and cannot be pushed together as the array. So if I make the hash value as a reference to array, I have to dereference it every time I want to add the next value.
E.g., I am in subroutine1 - I populated Hash1 - K1-->[V1]
I am in subroutine2 - I have to de-reference [V1], then push V2. So now the hash
becomes K1-->[V1 V2], V3 is added in another routine. K1-->[V1 V2 V3]
EDIT2:
Now I am facing another challenge. I have to sort the hash based on the V3.
Still is it feasible to store the hash with key and list reference?
K1-->[V1 V2 V3]

It really depends on what you want to do with your data, although I can't imagine your option 1 being convenient for anything.
Use a hash of arrays if you are happy referring to your V1, V2, V3 using indexes 0, 1, 2 or if you never really want to handle their values separately.
my %data;
$data{K1}[0] = V1;
$data{K1}[1] = V2;
$data{K1}[2] = V3;
or, of course
$data{K1} = [V1, V2, V3];
As an additional option, if your values mean something nameable you could use a hash of hashes, so
my %data;
$data{K1}{name} = V1;
$data{K1}{age} = V2;
$data{K1}{height} = V3;
or
@{ $data{K1} }{qw/ name age height /} = (V1, V2, V3);
Finally, if you never need access to the individual values, it would be fine to leave them as they are in the file, like this
my %data;
$data{K1} = "V1 V2 V3";
But as I said, the internal storage is mostly dependent on how you want to access your data, and you haven't told us about that.
Edit
Now that you say
The main challenge is that, the values V1, V2, V3 are derived at
different places and cannot be pushed together as the array
I think perhaps the hash of hashes is more appropriate, but I wouldn't worry at all about dereferencing as it is an insignificant operation as far as execution time is concerned. But I wouldn't use push as that restricts you to adding the data in the correct order.
Depending which you prefer, you have the alternatives of
$data{K1}[2] = V3;
or
$data{K1}{height} = V3;
and clearly the latter is more readable.
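To show that the values really can arrive from different places and in any order, here is a small sketch. The three record_* subroutines are hypothetical stand-ins for wherever each value is derived in your program.

```perl
use strict;
use warnings;

my %data;

# Hypothetical routines standing in for the places where each value is derived.
sub record_name   { my ($key, $name)   = @_; $data{$key}{name}   = $name }
sub record_age    { my ($key, $age)    = @_; $data{$key}{age}    = $age }
sub record_height { my ($key, $height) = @_; $data{$key}{height} = $height }

# The calls can happen in any order; the inner hash is created on first use.
record_height(K1 => 64);
record_name(K1 => 'ABC');
record_age(K1 => 99);

for my $key (sort keys %data) {
    print join(' ', $key, @{ $data{$key} }{qw/ name age height /}), "\n";   # K1 ABC 99 64
}
```

No explicit dereferencing step is needed when adding a value, because Perl autovivifies the inner hash the first time a key is assigned.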
Edit 2
As requested, to sort a hash of hashes by the third value (height in my example) you would write
use strict;
use warnings;
my %data = (
K1 => { name => 'ABC', age => 99, height => 64 },
K2 => { name => 'DEF', age => 12, height => 32 },
K3 => { name => 'GHI', age => 56, height => 9 },
);
for (sort { $data{$a}{height} <=> $data{$b}{height} } keys %data) {
printf "%s => %s %s %s\n", $_, @{$data{$_}}{qw/ name age height /};
}
or, if the data was stored as a hash of arrays
use strict;
use warnings;
my %data = (
K1 => [ 'ABC', 99, 64 ],
K2 => [ 'DEF', 12, 32 ],
K3 => [ 'GHI', 56, 9 ],
);
for (sort { $data{$a}[2] <=> $data{$b}[2] } keys %data) {
printf "%s => %s %s %s\n", $_, @{$data{$_}};
}
The output for both scripts is identical
K3 => GHI 56 9
K2 => DEF 12 32
K1 => ABC 99 64

In terms of readability/maintainability the second seems superior to me. The danger with the first is that you could end up with keys present in one hash but not the others. Also, if I came across the first approach, I'd have to think about it for a while, whereas the second seems "natural" (or a more common idiom, or more practical, or something else which means I'd understand it more readily).

The second approach (one array reference for each key) is:
In my experience, far more common,
Easier to maintain, since you only have one data structure floating around instead of three, and
More in line with the DRY principle: "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system." Represent a key once, not three times.

Sure, it's better to maintain only one data structure:
%data = ( K1=>[V1, V2, V3], ... );
You can use Data::Dump for a fast view/debug of your data structure.

The choice really depends on the usage pattern. Specifically, it depends on whether you use procedural programming or object-oriented programming.
This is a philosophical difference, and it's unrelated to whether language-level classes and objects are used or not. Procedural programming is organised around work flow; procedures access and transform whatever data they need. OOP is organised around records of data; methods access and transform one particular record only.
The second approach is closely aligned with object-oriented programming. Object-oriented programming is by far the most common programming style in Perl, so the second approach is almost universally the preferred structure these days (even though it takes more memory).
But your edit implied you might be using a more procedural approach. As you discovered, the first approach is more convenient for procedural programming. It was very commonly used when procedural programming was in vogue.
Take whatever suits your code's organisation best.

format a number as I want

I have the number: a = 3.860575156847749e+003; and I would like to show it in a normal manner. So I write b = sprintf('%0.1f', a);. If I print b I get: 3860.6. This is perfect. The problem is that, while a is of type double, b has been converted to char.
What can I do to proper format that number and still have a number as final result?
Best regards

Well, you have to distinguish between both the numerical value (the number stored in your computer's memory) and its decimal representation (the string/char array you see on your screen). You can't really impose a format on a number: a number has a value which can be represented as a string in different ways (e.g. 1234 = 1.234e3 = 12.34e2 = 0.1234e4 = ...).
If you want to store a number with less precision, you can use round, floor, ceil to calculate a number which has less precision than the original number.
E.g. if you have a = 3.860575156847749e+003 and you want a number that only has 5 significant digits, you can do so by using round:
a = 3.860575156847749e+003;
p = 0.1; % absolute precision you want
b = p .* round(a./p)
This will yield a variable b = 3.8606e3 which can be represented in different ways, but should contain zeros (in practice: very small values are sometimes unavoidable) after the fifth digit. I think that is what you actually want, but remember that for a computer this number is equal to 3.86060000 as well (it is just another string representation of the same value), so I want to stress again that the decimal representation is not set by rounding the number but by (implicitly) calling a function that converts the double to a string, which happens either by sprintf, disp or possibly some other functions.

The result of sprintf is a text variable. Have you tried declaring a variable as an integer (for example) and using it to hold the return value of the sprintf instruction?
This can be useful to you: http://blogs.mathworks.com/loren/2006/12/27/displaying-numbers-in-matlab/
