Perl: int() decrementing an integer

Before I get flamed, I want to say I do understand floating point numbers and things of the sort, but that doesn't seem to be my issue.
To simplify things, I'm trying to determine if a number has more than 2 decimal places. I'm doing this by multiplying the number by 100 (stored under variable "test1") and then truncating it with int() ($test2) and comparing it with an if.
$test1 = $number * 100;
$test2 = int($test1);
unless ($test1 == $test2) {
die ("test1:$test1, test2:$test2");
}
The initial $number comes from a whole series of other functions and should realistically be only two decimals, hence I'm trying to catch those that aren't (as a few entries seem to have very many decimals).
However, I just got:
test1:15, test2:14
from my die().
Can someone explain how that would happen? How can int(15) be 14?

From perldoc:
machine representations of floating-point numbers can sometimes produce counterintuitive results. For example, int(-6.725/0.025) produces -268 rather than the correct -269; that's because it's really more like -268.99999999999994315658 instead
So, the value that prints as "15" is probably something like 14.999999999999998 internally (Perl shows only about 15 significant digits by default) and, therefore, int truncates it to 14.
Note that perldoc suggests using the POSIX functions floor or ceil instead.
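The effect is easy to reproduce with a value such as 0.29 (chosen purely for illustration, it is not from the question). The sketch below prints what int sees, and adds a hypothetical helper, has_more_than_two_decimals, that compares the scaled value against the nearest integer with a small tolerance instead of truncating. The default tolerance of 1e-9 is an assumption and should be adapted to the data.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: true (1) if $number has more than two decimal places,
# allowing a small tolerance for binary floating-point error.
sub has_more_than_two_decimals {
    my ($number, $tolerance) = @_;
    $tolerance //= 1e-9;                    # assumed default, tune as needed
    my $scaled  = $number * 100;
    my $nearest = floor($scaled + 0.5);     # nearest integer to the scaled value
    return abs($scaled - $nearest) > $tolerance ? 1 : 0;
}

# Show the difference between the printed value, the stored value and int().
my $n = 0.29;
printf "%s * 100 prints as %s, is stored as %.17g, and int() gives %d\n",
    $n, $n * 100, $n * 100, int($n * 100);

for my $candidate (0.29, 14.99, 0.291) {
    printf "%s -> %s\n", $candidate,
        has_more_than_two_decimals($candidate) ? 'more than two' : 'two or fewer';
}
```

Comparing against the nearest integer keeps the check independent of whether the stored value landed just above or just below the intended one.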

In a simple one-off case, adding 0.5 to your value before int-ing it will give you what you want (for non-negative values).
e.g.
int(14.99 + 0.5)
15
it becomes 15.49 and is int-ed "down" to 15, whereas:
int( 14.45 + 0.5 )
still gets int'ed "down" to 14.0. This is a handy trick but doesn't self document as nicely as using floor and ceil.
As a side note, the Goldberg paper on floating point arithmetic always reminds me how useful it sometimes is to have brains that are not as mindlessly precise as a computer :-)

If I wanted to check if a number had more than two decimal places, I wouldn't do math on it.
my $more_than_two = $number =~ /\d+\.\d{2}\d+\z/;
Before I do that, I might use Scalar::Util's looks_like_number. This method will still fail with floating point squishiness if you were expecting 14.99999 to be 15.0.
However, you should tell us what you are trying to do instead of how you are trying to do that. It's easier to give better answers.
For your questions about int, I think its documentation tells you what you need to know. The rest is answered in the first couple of questions in perlfaq4.

PowerShell [ComponentModel.Win32Exception] casting inconsistencies

I am trying to get my head around windows errors, especially the relationship between Win32 errors and HRESULT errors.
So, as an example I know 3010 is
"The requested operation is successful. Changes will not be effective
until the system is rebooted"
And I can get that by casting to ComponentModel.Win32Exception thus: [ComponentModel.Win32Exception]3010.
Also, I know that 3010 is expressed in hex as 0x00000BC2 or 0x0BC2, and I can cast both of those as well. But it can also be expressed as 0x80070bc2, and this will cast properly. And it can even be expressed as 0xFFFFFFFF80070BC2. Here, as a 64 bit hex value it won't cast. But, 0xFFFFFFFF80070BC2 is -2147021886 in decimal, and that will cast. And it's a return value that can be expected, as documented here.
Similarly 0 decimal can be expressed in hex as 0x0000 and 0x00000000 and those cast fine and return
"The operation completed successfully"
But 0xFFFFFFFF00000000 and the decimal equivalent -4294967296 won't cast, they both return the decimal value. But I have gotten that decimal value returned from an installer, and the website referenced above also includes that decimal value.
So, at times when running installers from various vendors I have seen 0, -4294967296, 3010 & -2147021886 returned, and in three of the four situations I can cast to get a meaningful message for the user, and in one I can't.
So, to sum up, why are these BAD values bad, and why are they inconsistent, and what is the best way to deal with values like -4294967296 or -2147945410, the latter of which may never show up, but the former of which I have seen.
[ComponentModel.Win32Exception]3010 # Good
[ComponentModel.Win32Exception]0x0BC2 # Good
[ComponentModel.Win32Exception]-2147021886 # Good
[ComponentModel.Win32Exception]0x00000BC2 # Good
[ComponentModel.Win32Exception]0x0000000000000BC2 # Good
[ComponentModel.Win32Exception]-2147945410 # BAD
[ComponentModel.Win32Exception]0x80070BC2 # Good
[ComponentModel.Win32Exception]0x0000000080070BC2 # Good
[ComponentModel.Win32Exception]0 # Good
[ComponentModel.Win32Exception]0x0000 # Good
[ComponentModel.Win32Exception]0x00000000 # Good
[ComponentModel.Win32Exception]0x0000000000000000 # Good
[ComponentModel.Win32Exception]0xFFFFFFFF00000000 # BAD
[ComponentModel.Win32Exception]-4294967296 # BAD
EDIT: So, I have dug around a bit, and I THINK this might work.
foreach ($errCode in @(0, 3010, -2147021886, -4294967296)) {
    [int]$intCode = $errCode -band 0xFFFF
    [ComponentModel.Win32Exception]$intCode
}
For the four values in question it does, but I'll need to test with a bunch of others too. I still just don't understand why Microsoft, Autodesk and the rest will return values that can't be used easily. Why not just use nothing but the damn Win32 codes and be done with it? And why is 0x80070BC2 treated the same as 0x00000BC2 and both work, but their decimal equivalents are treated differently and only one works?
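For reference, the masking idea can be spelled out against the HRESULT bit layout: bit 31 is the severity (failure) bit, bits 16 to 26 are the facility, and the low 16 bits are the code; facility 7 is FACILITY_WIN32, which wraps an ordinary Win32 error. The following is only a sketch in Perl (used for the other examples on this page); decode_hresult is a hypothetical helper, and it assumes a perl built with 64-bit integers so that negative values wrap to their two's-complement bit pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: split an installer exit code into HRESULT fields.
# Assumes 64-bit integers; the bitwise operators then see a negative value
# as its two's-complement bit pattern.
sub decode_hresult {
    my ($exit_code) = @_;
    my $low32 = $exit_code & 0xFFFFFFFF;      # discard any sign extension
    return {
        failure  => ($low32 >> 31) & 0x1,     # severity bit
        facility => ($low32 >> 16) & 0x7FF,   # 7 means FACILITY_WIN32
        code     => $low32 & 0xFFFF,          # Win32 error code when facility is 7
    };
}

for my $exit_code (0, 3010, -2147021886, -4294967296) {
    my $fields = decode_hresult($exit_code);
    printf "%12d -> failure=%d facility=%d code=%d\n",
        $exit_code, @{$fields}{qw(failure facility code)};
}
```

Seen this way, -2147021886 is 0x80070BC2, that is, Win32 error 3010 wrapped in an HRESULT, while -4294967296 has nothing but zeros in its low 32 bits.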

Exponential values manipulation in Perl

I have a select statement which returns capacity as an exponential value in my Perl code, e.g.
Capacity=5.4835615662E+003
I am using a DB2 database, and if I run the query directly in the database it returns
5483.5615662
But when I use the capacity value in the condition of the next select query, it doesn't match.
The pseudocode is as below:
my $capacity = 'SELECT capacity FROM table';
# it returns $capacity = 5.4835615662E+003
my $result = "SELECT MEASUREMENT FROM TABLE WHERE CAPACITY = $capacity";
Here $capacity is 5.4835615662E+003, so it does not match any row in the table. It should be 5483.5615662.
How can I convert an exponential value to a float without rounding it off?
You are interpolating the value of $capacity into a string. Instead, you should use placeholders as in:
my $sth = $dbh->prepare(q{SELECT MEASUREMENT FROM TABLE WHERE CAPACITY=?});
$sth->execute($capacity);
It is hard to say if there are any other problems because the code snippets you provide don't really do anything.
It is likely that the number stored in the database is not exactly 5483.5615662 and that is just the displayed string when you query it.
If possible, I would recommend taking @Сухой27's advice and letting the database do the work for you:
SELECT MEASUREMENT FROM TABLE
WHERE CAPACITY = (SELECT CAPACITY FROM TABLE where ..?)
Alternatively, decide ahead of time how many digits past the decimal point really matter and use ROUND or similar functionality:
my $sth = $dbh->prepare(q{
SELECT MEASUREMENT FROM TABLE
WHERE ROUND(CAPACITY, 6)=ROUND(?, 6)
});
$sth->execute($capacity);
Please take a look at Why doesn't this sql query return any results comparing floating point numbers?
I'm concerned about the 5.4835615662+003 that you show in your question. That isn't a valid representation of a number, and it means just 5.4835615662 + 3. You need an E or an e before the exponent to use it as it is.
There is also an issue with comparing floating-point values, whereby two numbers that are essentially equal may have a slightly different binary representation, and so will not compare as equal. If your value has been converted to a string (and that seems highly likely, as Perl will not use an exponent to display 5483.5615662 unless told to do so) and back again to floating point, then it is extremely unlikely to result in exactly the same value. Your comparisons will always fail.
In Perl, and most other languages, a numeric value has no specific format. For example, if I run this
perl -E 'say 5.4835615662E+003'
I get the output
5483.5615662
showing that the two string representations are equivalent
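A small sketch of the same point starting from a string, as it might arrive from the database: numeric context parses the exponent, and sprintf produces fixed notation. The seven decimal places in the format are an assumption chosen to match the digits shown in the question.

```perl
use strict;
use warnings;

my $from_db = '5.4835615662E+003';        # string as shown in the question

my $number = $from_db + 0;                # numeric context parses the exponent
my $fixed  = sprintf '%.7f', $from_db;    # fixed notation, 7 decimals (assumed)

print "as a number:    $number\n";
print "fixed notation: $fixed\n";
```

Neither form recovers digits that were already lost when the value was first turned into an eleven-digit string.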
It would help to see exactly how you got the value of $capacity from the database, because if it were a simple number then it wouldn't use the scientific representation. You would have to use sprintf to get what you have shown
SQL is the same and doesn't care about the format of the number as long as it's valid, so if you wrote
SELECT measurement FROM table WHERE capacity = 5.4835615662E+003
then you would get a result where capacity is exactly equal to that value. But since it has been trimmed to eleven significant digits, you are hugely unlikely to find the record that the value came from, unless it contains 5483.56156620000000000
Update
If I run
perl -MMath::Trig=pi -E 'for (0 .. 20) { $x = pi * 10**$_; say qq{$x}; }'
I get this result
3.14159265358979
31.4159265358979
314.159265358979
3141.59265358979
31415.9265358979
314159.265358979
3141592.65358979
31415926.5358979
314159265.358979
3141592653.58979
31415926535.8979
314159265358.979
3141592653589.79
31415926535897.9
314159265358979
3.14159265358979e+015
3.14159265358979e+016
3.14159265358979e+017
3.14159265358979e+018
3.14159265358979e+019
3.14159265358979e+020
So by default Perl won't resort to using scientific notation until the value reaches 10^15 (1e+15). It clearly doesn't apply to 5483.5615662. Something has coerced the floating-point value in the question to a much less precise string in scientific notation. Comparing that for equality doesn't stand a chance of succeeding.

Can Perl detect if a floating point number has been implicitly rounded?

When I use the code:
(sub {
use strict;
use warnings;
print 0.49999999999999994;
})->();
Perl outputs "0.5".
And when I remove one "9" from the number:
(sub {
use strict;
use warnings;
print 0.4999999999999994;
})->();
It prints 0.499999999999999.
Only when I remove another 9, it actually stores the number precisely.
I know that floating point numbers are a can of worms nobody wants to deal with, but I am curious if there is a way in Perl to "trap" this implicit conversion and die, so that I can use eval to catch this die and let the user know that the number they are trying to pass is not supported by Perl in its native form (so the user can maybe pass a string or an object instead).
The reason why I need this is to avoid situations like passing 0.49999999999999994 to be rounded by my function, but the number gets converted to 0.5, and in turn gets rounded to 1 instead of 0. I am not sure how to "intercept" this conversion so that my function "knows" that it did not actually get 0.5 as input, but that the user's input was intercepted.
Without knowing how to intercept this kind of conversion, I cannot trust "round", because I do not know whether it received my input as I sent it, or if that input has been modified (at compile time or runtime, not sure) before the function was called. In turn, the function has no idea if the input it is operating on is the input the user intended, and has no means to warn the user.
This is not a Perl unique problem, it happens in JavaScript:
(() => {
'use strict';
/* oops: 1 */
console.log(Math.round(0.49999999999999999))
})();
It happens in Ruby:
(Proc.new {
# oops: 1
print (0.49999999999999999.round)
}).call()
It happens in PHP:
<?php
(call_user_func(function() {
/* oops: 1 */
echo round(0.49999999999999999);
}));
?>
It even happens in C. That is okay in itself, but my gcc does not warn me that the number has not been stored precisely. When I specify a floating point literal, it had better be stored exactly, or the compiler should warn me that it decided to turn it into another form (e.g. "Your number x cannot be represented in 64 bit/32 bit floating point form, so I converted it to y.") so I can see if that's okay or not. In this case it is NOT:
#include <math.h>
#include <stdio.h>
int main(int argc, char **argv)
{
/* oops: 1 */
printf("%f.\n", round(0.49999999999999999));
return 0;
}
Summary:
Is it possible to make Perl show an error or warning on implicit conversions of floating point numbers, or is this something that Perl 5 (along with other languages) is incapable of doing at this moment (e.g. the compiler does not go out of its way to support such warnings or offer a flag to enable them)?
e.g.
warning: the number 0.49999999999999994 is not representable, it has been converted to 0.5. using bigint might solve this. Consider reducing precision of the number.
Perhaps use bignum:
$ perl -Mbignum -le 'print 0.49999999999999994'
0.49999999999999994
$ perl -Mbignum -le 'print 0.49999999999999994+0.1'
0.59999999999999994
$ perl -Mbignum -le 'print 0.49999999999999994-0.1'
0.39999999999999994
$ perl -Mbignum -le 'print 0.49999999999999994+10.1'
10.59999999999999994
It transparently extends precision of Perl floating point and ints to extended precision.
Be aware that bignum is 150 times slower than internal and other math solutions, and will typically NOT solve your problem (as soon as you need to store your numbers in JSON or databases or whatever, you're back at the same problem again).
Typically, sprintf takes care of prettying your output for you, so you do not have to see the ugly imprecision, however, it's still there.
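The gap between what is printed and what is stored can be made visible with a sketch like the one below. It builds the double nearest to 0.49999999999999994 exactly, as 0.5 - 2**-54, instead of relying on how a literal is parsed. The default print shows about 15 significant digits, while "%.17g" shows enough to tell the value apart from 0.5. round_half_up is a hypothetical helper that avoids computing $n + 0.5, since that sum is itself rounded.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# The largest double below 0.5, built exactly rather than parsed from a literal.
my $x = 0.5 - 2**-54;

print  "default print: $x\n";          # default stringification
printf "17 digits:     %.17g\n", $x;   # enough digits to identify the double
print  "not equal to 0.5\n" if $x != 0.5;

# Naive rounding: the addition $x + 0.5 rounds before floor ever sees it.
my $naive = floor($x + 0.5);

# Hypothetical helper: round half up without forming $n + 0.5.
sub round_half_up {
    my ($n) = @_;
    my $whole = floor($n);
    return ($n - $whole >= 0.5) ? $whole + 1 : $whole;
}

printf "naive: %d, round_half_up: %d\n", $naive, round_half_up($x);
```

So the stored value is not 0.5; it is the default output format and the intermediate addition that make it look that way.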
Here is an example which works on my x64 platform which understands how to deal with that imprecision.
This correctly tells you if the 2 numbers you're interested in are the same:
sub safe_eq {
    my($var1,$var2)=@_;
    return 1 if($var1==$var2);
    my $dust;
    if($var2==0) { $dust=abs($var1); }
    else { $dust= abs(($var1/$var2)-1); }
    return 0 if($dust>5.32907051820076e-15 ); # dust <= 5.32907051820075e-15
    return 1;
}
You can build on top of this to solve all your problems.
It works by understanding the magnitude of the imprecision in your native numbers, and accommodating it.
As you said in the question, dealing with floating-point numbers in code is quite the can of worms, precisely because the standard floating-point representation, regardless of the precision employed, is incapable of accurately representing many decimal numbers. The only 100% reliable way around this is to not use floating-point numbers.
The easiest way to apply that is to instead use fixed-point numbers, although that limits precision to a fixed number of decimal places. For example, instead of storing 10.0050, define a convention that all numbers are stored to 4 decimal places and store 100050 instead.
But that doesn't seem likely to satisfy you, based on the minimal explanation you've given for what you're actually trying to accomplish (building a general-purpose math library). The next option, then, would be to store the number of decimal places as a scaling factor with each value. So 10.0050 would become an object containing the data { value => 100050, scale => 4 }.
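A minimal sketch of that scaled representation, assuming the input arrives as a plain decimal string (no exponent, no thousands separators) and that the digits fit in a native integer. parse_fixed is a hypothetical name; the hash keys follow the { value, scale } shape described above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a decimal string into { value, scale } without
# ever storing the fractional number as a float.
sub parse_fixed {
    my ($text) = @_;
    my ($sign, $int, $frac) = $text =~ /\A(-?)(\d+)(?:\.(\d+))?\z/
        or die "not a plain decimal: $text";
    $frac //= '';                               # no fractional part given
    return {
        value => "$sign$int$frac" + 0,          # digits with the point removed
        scale => length $frac,                  # number of decimal places
    };
}

my $n = parse_fixed('10.0050');
print "value=$n->{value} scale=$n->{scale}\n";  # value=100050 scale=4
```

Because the scale is counted from the text itself, a check such as "more than two decimal places" becomes a simple integer comparison on scale.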
This can then be extended into a more general "rational number" data type by effectively storing each number as a numerator and denominator, thus allowing you to precisely store numbers such as 1/3, which neither base 2 nor base 10 can represent exactly. This is, incidentally, the approach that I am told Perl 6 has taken. So, if switching to Perl 6 is an option, then you may find that it all Just Works for you once you do so.

How does this Perl one-liner actually work?

So, I happened to notice that last.fm is hiring in my area, and since I've known a few people who worked there, I thought of applying.
But I thought I'd better take a look at the current staff first.
Everyone on that page has a cute/clever/dumb strapline, like "Is life not a thousand times too short for us to bore ourselves?". In fact, it was quite amusing, until I got to this:
perl -e'print+pack+q,c*,,map$.+=$_,74,43,-2,1,-84, 65,13,1,5,-12,-3, 13,-82,44,21, 18,1,-70,56, 7,-77,72,-7,2, 8,-6,13,-70,-34'
Which I couldn't resist pasting into my terminal (kind of a stupid thing to do, maybe), but it printed:
Just another Last.fm hacker,
I thought it would be relatively easy to figure out how that Perl one-liner works. But I couldn't really make sense of the documentation, and I don't know Perl, so I wasn't even sure I was reading the relevant documentation.
So I tried modifying the numbers, which got me nowhere. So I decided it was genuinely interesting and worth figuring out.
So, 'how does it work' being a bit vague, my question is mainly,
What are those numbers? Why are there negative numbers and positive numbers, and does the negativity or positivity matter?
What does the combination of operators +=$_ do?
What's pack+q,c*,, doing?
This is a variant on “Just another Perl hacker”, a Perl meme. As JAPHs go, this one is relatively tame.
The first thing you need to do is figure out how to parse the perl program. It lacks parentheses around function calls and uses the + and quote-like operators in interesting ways. The original program is this:
print+pack+q,c*,,map$.+=$_,74,43,-2,1,-84, 65,13,1,5,-12,-3, 13,-82,44,21, 18,1,-70,56, 7,-77,72,-7,2, 8,-6,13,-70,-34
pack is a function, whereas print and map are list operators. Either way, a function or non-nullary operator name immediately followed by a plus sign can't be using + as a binary operator, so both + signs at the beginning are unary operators. This oddity is described in the manual.
If we add parentheses, use the block syntax for map, and add a bit of whitespace, we get:
print(+pack(+q,c*,,
map{$.+=$_} (74,43,-2,1,-84, 65,13,1,5,-12,-3, 13,-82,44,21,
18,1,-70,56, 7,-77,72,-7,2, 8,-6,13,-70,-34)))
The next tricky bit is that q here is the q quote-like operator. It's more commonly written with single quotes:
print(+pack(+'c*',
map{$.+=$_} (74,43,-2,1,-84, 65,13,1,5,-12,-3, 13,-82,44,21,
18,1,-70,56, 7,-77,72,-7,2, 8,-6,13,-70,-34)))
Remember that the unary plus is a no-op that has no effect on its operand, so things should now be looking more familiar. This is a call to the pack function, with a format of c*, meaning “any number of characters, specified by their number in the current character set”. An alternate way to write this is
print(join("", map {chr($.+=$_)} (74, …, -34)))
The map function applies the supplied block to the elements of the argument list in order. For each element, $_ is set to the element value, and the result of the map call is the list of values returned by executing the block on the successive elements. A longer way to write this program would be
@list_accumulator = ();
for $n (74, …, -34) {
    $. += $n;
    push @list_accumulator, chr($.);
}
print(join("", @list_accumulator))
The $. variable contains a running total of the numbers. The numbers are chosen so that the running total is the ASCII codes of the characters the author wants to print: 74=J, 74+43=117=u, 74+43-2=115=s, etc. They are negative or positive depending on whether each character is before or after the previous one in ASCII order.
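To see where such a list of numbers comes from, the encoding can be run in both directions. This is only a sketch: encode_deltas and decode_deltas are hypothetical names, and an ordinary lexical variable stands in for $. as the running total.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a string into the differences between
# successive character codes (the first difference is taken from 0).
sub encode_deltas {
    my ($text) = @_;
    my $previous = 0;
    return map { my $delta = $_ - $previous; $previous = $_; $delta }
           map { ord($_) } split //, $text;
}

# Hypothetical helper: rebuild the string from the differences by
# keeping a running total, as the one-liner does with $. .
sub decode_deltas {
    my @deltas = @_;
    my $total = 0;
    return join '', map { chr($total += $_) } @deltas;
}

print join(',', encode_deltas('Just')), "\n";   # 74,43,-2,1
print decode_deltas(74, 43, -2, 1), "\n";       # Just
```

A negative number simply means the next character has a lower code than the previous one.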
For your next task, explain this JAPH (produced by EyesDrop).
''=~('(?{'.('-)@.)@_*([]@!@/)(@)@-@),@(@@+@)'
^'][)@]`}`]()`@.@]@%[`}%[@`@!@@%[').',"})')
Don't use any of this in production code.
The basic idea behind this is quite simple. You have an array containing the ASCII values of the characters. To make things a little bit more complicated you don't use absolute values, but relative ones except for the first one. So the idea is to add the specific value to the previous one, for example:
74 -> J
74 + 43 -> u
74 + 43 + (-2) -> s
Even though $. is a special variable in Perl it does not mean anything special in this case. It is just used to save the previous value and add the current element:
map($.+=$_, ARRAY)
Basically it means add the current list element ($_) to the variable $.. This will return a new array with the correct ASCII values for the new sentence.
The q function in Perl is used for single quoted, literal strings. E.g. you can use something like
q/Literal $1 String/
q!Another literal String!
q,Third literal string,
This means that pack+q,c*,, is basically pack 'c*', ARRAY. The c* template in pack treats each value in the list as a character code and emits the corresponding character.
It basically boils down to this:
#!/usr/bin/perl
use strict;
use warnings;
my $prev_value = 0;
my @relative = (74,43,-2,1,-84, 65,13,1,5,-12,-3, 13,-82,44,21, 18,1,-70,56, 7,-77,72,-7,2, 8,-6,13,-70,-34);
my @absolute = map($prev_value += $_, @relative);
print pack("c*", @absolute);

Float comparison issues in Perl [duplicate]

Possible Duplicate:
How do I fix this Perl code so that 1.1 + 2.2 == 3.3?
I'm working on a Perl script that compares strings representing gene models and prints out a summary of the comparison. If the gene models match perfectly, I print out a very terse summary, but if they are different, the summary is quite verbose.
The script looks at the value of a variable to determine whether it should do the terse or verbose summary--if the variable is equal to 1, it should print the terse summary; otherwise, it should print the verbose summary.
Since the value is numeric (a float), I've been using the == operator to do the comparison.
if($stats->{overall_simple_matching_coefficient} == 1)
{
print "Gene structures match perfectly!\n";
}
This worked correctly for all of my tests and even for most of the new cases I am running now, but I found a weird case where the value was equal to 1 but the above comparison failed. I have not been able to figure out why the comparison failed, and stranger yet, when I changed the == operator to the eq operator, it seemed to work fine.
I thought the == was for numerical comparison and eq was for string comparison. Am I missing something here?
Update: If I print out the value right before the comparison...
printf("Test: '%f', '%d', '%s'\n", $stats->{overall_simple_matching_coefficient}, $stats->{overall_simple_matching_coefficient}, $stats->{overall_simple_matching_coefficient});
...I get this.
Test: '1.000000', '0', '1'
The first thing any computer language teacher should teach you about any computer language is that YOU CANNOT COMPARE FLOATS FOR EQUALITY. This is true of any language. Floating point arithmetic is not exact, and two floats that look like they're the same will be different in the insignificant digits somewhere where you can't see it. Instead, you can only compare that they are close to each other - like
if (abs($stats->{overall_simple_matching_coefficient} - 1) < 0.0001)
What do you get if you print the value of $stats->{overall_simple_matching_coefficient} just before the comparison? If it's 1, try printf with a format of "%20.10f", or "%.17g" to see every significant digit. I strongly suspect you have some rounding error accumulated in the variable and it's not comparing equal numerically. However, when converted to a string, Perl formats the number with about 15 significant digits, so an error smaller than that is rounded away and the strings compare equal.
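The symptoms in the question's update ('%d' giving 0 while '%s' gives 1) can be reproduced with a stand-in value just below 1. In the sketch below, 1 - 2**-53 is the largest double below 1 and is not the asker's actual value; nearly_equal is a hypothetical helper and its default relative tolerance of 1e-9 is an assumption.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Stand-in for the computed coefficient: the largest double below 1.
my $coefficient = 1 - 2**-53;

# Hypothetical helper: equality within a relative tolerance.
sub nearly_equal {
    my ($x, $y, $epsilon) = @_;
    $epsilon //= 1e-9;                          # assumed default
    return abs($x - $y) <= $epsilon * max(1, abs($x), abs($y));
}

printf "==: %s, eq: %s, nearly_equal: %s\n",
    ($coefficient == 1             ? 'yes' : 'no'),
    ($coefficient eq '1'           ? 'yes' : 'no'),
    (nearly_equal($coefficient, 1) ? 'yes' : 'no');
```

The eq comparison only succeeds because both operands are turned into strings first, which is a side effect of formatting and not a reliable numeric test.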