sprintf/printf right pad float with zeros in fixed width field - perl

I am using Perl (for legacy reasons) and I would like to format fixed-width columns in a CSV file. How do I format the following values:
1.0001
10.0001
100.0001
1000.0001
1000000.1
100000001
into a fixed width of 8 by right-padding floats with zeros or truncating, BUT if a large integer is encountered the field width must grow to accommodate it:
1.000100
10.00010
100.0001
1000.000
1000000.
100000001
I am not performing any operations, so they could possibly be treated as strings or other. I've tried about every combination in the sprintf documentation.
Thanks.

[The question was changed after this was posted. This no longer answers the question.]
substr(sprintf("%.6f", $x), 0, 8)
or
substr($x.("0"x5), 0, 8)

There's probably a neater way, but this example should work:
my @array = qw(1.0001 10.0001 100.0001 1000.0001);
for my $nums (@array) {
    $nums .= '0' while length $nums < 8;
    print "$nums\n";
}
1.000100
10.00010
100.0001
1000.0001
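For the updated requirement (width 8, but wider when the integer part alone needs more room), one possible sketch combines the sprintf/substr idea above with a check on the length of the integer part. The helper name fixed8 is made up for illustration, and note that %.6f rounds at the sixth decimal place before the string is cut, so this is not a pure truncation.

```perl
use strict;
use warnings;

# Hypothetical helper: format a number into a field of at least 8 characters.
sub fixed8 {
    my ($x) = @_;
    my $s = sprintf('%.6f', $x);     # e.g. "1000.000100"; always contains a '.'
    my $int_len = index($s, '.');    # number of characters before the decimal point
    # Integer part fills or exceeds the field: drop the fraction and let the field grow.
    return substr($s, 0, $int_len) if $int_len >= 8;
    return substr($s, 0, 8);         # otherwise pad/truncate to exactly 8 characters
}

print fixed8($_), "\n" for qw(1.0001 10.0001 100.0001 1000.0001 1000000.1 100000001);
```

If the values must never be rounded, the same length check could be applied to the original strings instead of the sprintf result.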

Related

Adding an array of floats produces weird sums adding forward vs. backward

I'm adding (summing) an array of floats in Perl, and I was trying to speed it up. When I tried, I started getting weird results.
#!/usr/bin/perl
my $total = 0;
my $sum = 0;
# Compute $sum (adds from index 0 forward)
my @y = @{$$self{"closing"}}[-$periods..-1];
my @x = map {${$_}{$what}} @y;
# map { $sum += $_ } @x;
$sum += $_ for @x;
# Compute $total (adds from index -1 backward)
for (my $i = -1; $i >= -$periods; $i--) {
    $total += ${${$$self{"closing"}}[$i]}{$what};
}
if($total != $sum) {
    printf("SMA($what, $periods) total ($total) != sum ($sum) %g\n",
        ($total - $sum));
}
# Example output:
# SMA(close, 20) total (941.03) != sum (941.03) -2.27374e-13
I seem to get different answers when I compute $sum and $total.
The only thing I can think of is that one method adds forward through the array, and the other backward.
Would this cause them to overflow differently? I would expect so, but it never occurred to me that I would get different answers. Notice that the difference is small (-2.27374e-13).
Is this what's going on, or is my code busted?
This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi
As Eric mentioned in the comments, floating point arithmetic is not associative; so the order you do the operations will impact the answer.
While "add smaller values first" is good advice, it is important to emphasize that you can have differences even with just regular "small" values. Here's one example:
x = 1.004028
y = 3.0039678
z = 4.000855
If these are taken to be IEEE-754 single-precision floats (i.e., 32-bit binary format), then we get:
x + (y+z) = 8.008851
(x+y) + z = 8.00885
Infinitely precise result is 8.0088508. So neither are very good! And the error isn't insignificant for scientific computations and it accumulates.
This is a rich field with many numerical algorithms for preserving precision. Which one you pick depends entirely on your problem domain, your particular needs, and the resources you have available, but one of the best-known is Kahan's summation algorithm; see https://en.wikipedia.org/wiki/Kahan_summation_algorithm. You can easily adapt it to your problem for (hopefully) better results.
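As a rough illustration of what adapting it might look like in Perl, here is a minimal sketch of compensated (Kahan) summation. The sub name kahan_sum and the sample data are assumptions made for this example; it reduces the accumulated rounding error but does not guarantee that forward and backward sums are bit-for-bit identical.

```perl
use strict;
use warnings;

# Compensated (Kahan) summation: $c carries the low-order bits lost by each addition.
sub kahan_sum {
    my ($sum, $c) = (0.0, 0.0);
    for my $x (@_) {
        my $y = $x - $c;          # apply the correction from the previous step
        my $t = $sum + $y;        # low-order bits of $y may be lost here
        $c = ($t - $sum) - $y;    # recover what was lost (with opposite sign)
        $sum = $t;
    }
    return $sum;
}

my @x = (1.004028, 3.0039678, 4.000855);   # sample values from the answer above
printf "forward:  %.17g\n", kahan_sum(@x);
printf "backward: %.17g\n", kahan_sum(reverse @x);
```

Independently of how the sum is computed, comparing two floating point totals with a tolerance, for example abs($total - $sum) < 1e-9, is usually more robust than testing with !=.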

How do I display a large number in scientific notation?

Using AutoIt, when I multiply 1 by 10^21, I get 1e+021. But in separate steps, such as multiplying 1 by 10^3 seven times, I get the overflow value of 3875820019684212736.
It appears AutoIt cannot handle numbers with more than eighteen digits. Is there a way around this? For example, can I multiply 10,000,000,000,000,000 by 1000 and have the result displayed as 1e+019?
Try this UDF : BigNum UDF
Example :
$X = "9999999999999999999999999999999"
$Y = "9999999999999999999999999999999"
$product = _BigNum_Mul($X, $Y)

How to read elevation from USGS NED DEM GridFloat file in Perl

I have downloaded a large set of GridFloat (.flt, .hdr) DEM files from USGS NED (1") in order to implement my own elevation service on my website. I would like to be able to look up an elevation from this fileset, given latitude and longitude as inputs. I use Perl for my website development. The files have a conventional naming scheme, and I am able to get the appropriate tile filename using the lat/lng. However, accessing the internals of the file is where I'm having an issue.
I know the file is in a fairly straightforward format (.flt, apparently called "Gridfloat"), but I could use some help figuring out the magic numbers for calculating where in the file I need to seek to for a given lat/lng, and how to handle byte order and so on so that I end up with an elevation. From what I understand, apparently row ordering can be an issue, as well as byte ordering. I am looking for a recipe that does not involve use of any third party libraries such as GDAL, which I think are overly complicated and slow for what I want to do. I think it should be possible to just open the file, seek to a position based on some calculation, read some bytes and then unpack them into the correct byte order. Here is an example .hdr file that accompanies floatn48w097_1.flt, I think it has the necessary info. There are a bunch of other files that come with the .zip, including .prj, but I believe those are for a commercial program like ArcInfo. I think everything I need should be in the following .hdr file.
ncols 3612
nrows 3612
xllcorner -97.00166666667
yllcorner 46.99833333333
cellsize 0.000277777777778
NODATA_value -9999
byteorder LSBFIRST
What I'm really hoping for is a formula for calculating the row and column from the lat/lng, then another formula for translating the row/column into a position for seek, how many bytes to read, and how to convert those raw bytes into an integer (or whatever it is these files contain). I feel that this could be a very fast operation, without all the overhead involved with the larger libraries which seem to be focused on doing a lot of stuff that I don't need.
I don't need Perl code, just pseudocode showing the calculations for row/col offsets etc. would be more than enough. I believe the files are binary format, a straightforward grid of 4-byte numbers. The example file that goes with the .hdr file above has a size of 52186176 bytes, and when you multiply ncols by nrows (from the .hdr), you get 13046544, which divides evenly into the file size to give exactly 4 bytes per cell. So I assume it's just a matter of getting the right formula for row/col based on lat/lng, and then getting the bytes swizzled into the right order. I've just not done this much.
I found some reference to the Gridfloat format here: coolutils.com/formats/flt so apparently the file consists of a grid of 32-bit floating point values, which matches the 4 bytes per cell computed above.
Thanks!
OK, I think I have an answer. The following is a Perl routine which seems to give back reasonable-looking elevation values when tested with the USGS NED1 .flt files. The script takes latitude and longitude as command line arguments, looks up the file and indexes into the grid.
#!/usr/bin/perl
use strict;
use POSIX;
use Math::Round;
sub get_elevation
{
    my ($lat, $lng) = @_;
    # Tile names use the upper latitude and the western longitude, e.g. n48w097.
    my $lat_degree = ceil ($lat);
    my $lng_degree = floor ($lng);
    my $lat_letter = ($lat >= 0) ? 'n' : 's';
    my $lng_letter = ($lng >= 0) ? 'e' : 'w';
    my $lng_tilenum = abs($lng_degree);
    my $lat_tilenum = abs($lat_degree);
    my $tilename = $lat_letter . sprintf('%02d', $lat_tilenum) . $lng_letter . sprintf('%03d',$lng_tilenum);
    my $path = "/data/elevation/ned1/$tilename/float${tilename}_1.flt";
    print "path = $path\n";
    die "No such file" if (!-e($path));
    # Row counts down from the top edge, column counts across from the west edge.
    my ($lat_fraction, $lat_integral) = modf (abs($lat));
    my $row = floor ((1 - $lat_fraction) * 3600);
    my ($lng_fraction, $lng_integral) = modf (abs($lng));
    my $col = floor ((1 - $lng_fraction) * 3600);
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);    # raw bytes, no newline translation
    # 3612 columns of 4-byte cells; skip the 6 extra rows and 6 extra columns of border.
    my $pos = (3612 * 4 * 6) + (3612 * 4 * $row) + (4 * 6) + ($col * 4);
    seek ($fh, $pos, SEEK_SET);
    my $buffer;
    read ($fh, $buffer, 4);
    close ($fh);
    # 'f<' forces little-endian (byteorder LSBFIRST in the .hdr); needs Perl 5.10+.
    my ($elevation) = unpack('f<', $buffer);
    if ($elevation == -9999)
    {
        return 'undefined';
    }
    return $elevation;
}
my $lat = $ARGV[0];
my $lng = $ARGV[1];
my $elevation = get_elevation ($lat, $lng);
print "Elevation for ($lat, $lng) = $elevation meters (", $elevation * 3.28084, " feet)\n";
Hope this might be useful to anyone else trying to do the same kind of thing... I've tested this method now and it seems to produce good looking elevation profiles which are smoother than those from the 3" SRTM data.
Neil put me on the right track but I think there are a few problems with his original answer. I've added some fixes and improvements including on-the-fly download of the needed tile from the 1/3 arc second (10 meter) dataset, proper parsing of the header file, and what I believe is corrected indexing.
This is still mostly illustrative and should be improved before production use, particularly, hanging on to the header information and the file handle for repeated queries.
https://gist.github.com/biomiker/32fe34e1fa1bb49ae1135ab6652f596d
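For readers who only want the arithmetic, here is a minimal sketch of header-driven indexing. It is not the code from the gist. It assumes that rows are stored north to south and that xllcorner/yllcorner mark the outer corner of the lower-left cell; both assumptions should be checked against real data. The sub names parse_hdr and cell_offset and the 2x2 test grid are made up for the example.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Parse "key value" lines from a .hdr file into a hash (keys lower-cased).
sub parse_hdr {
    my ($text) = @_;
    my %hdr;
    for my $line (split /\n/, $text) {
        my ($key, $value) = split ' ', $line;
        $hdr{lc $key} = $value if defined $value;
    }
    return \%hdr;
}

# Byte offset of the 4-byte cell containing ($lat, $lng), or undef if outside the tile.
# Assumption: row 0 is the northernmost row; the corner is the outer cell corner.
sub cell_offset {
    my ($hdr, $lat, $lng) = @_;
    my $col = floor(($lng - $hdr->{xllcorner}) / $hdr->{cellsize});
    my $row = $hdr->{nrows} - 1 - floor(($lat - $hdr->{yllcorner}) / $hdr->{cellsize});
    return if $col < 0 || $col >= $hdr->{ncols} || $row < 0 || $row >= $hdr->{nrows};
    return 4 * ($row * $hdr->{ncols} + $col);
}

# Tiny made-up 2x2 grid held in memory instead of a real .flt file.
my $hdr  = parse_hdr("ncols 2\nnrows 2\nxllcorner 0\nyllcorner 0\ncellsize 1\nbyteorder LSBFIRST\n");
my $data = pack('f<*', 1, 2,     # northern row
                       3, 4);    # southern row
my $off  = cell_offset($hdr, 0.5, 0.5);               # south-west cell
my ($elev) = unpack('f<', substr($data, $off, 4));    # little-endian 32-bit float
print "elevation = $elev\n";
```

With a real tile, the substr call would be replaced by a seek to the offset followed by a 4-byte read on a filehandle in binmode; for a header that says MSBFIRST the unpack template would be 'f>' instead.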

How do I round up a float whose mantissa is ending in 5?

I want to round up a floating-point number. For example, sprintf '%.4f', 0.12345 returns 0.1235 and sprintf '%.4f', 0.12325 returns 0.1232. For the second example, I want it to print out 0.1233, not 0.1232.
sprintf in Perl is not good enough. Math::BigFloat can do it, but it’s a little over-kill.
Does anyone know if there is another effective way to round up, or whether there is another Perl module that does this?
The exact way sprintf will round a number is dependent on how the system libraries round numbers. If you need to control exactly how a number is rounded you will need to use a library that implements your desired rounding explicitly, or write a function to round as desired.
For example, to round a positive number to an arbitrary precision, where 5 rounds up, this would work:
sub round {
    my $numer = shift;
    my $precision = shift;
    return int($numer * 10 ** $precision + 0.5) * 10 ** -$precision;
}
This, however, doesn't round correctly for negative numbers: -0.12324 incorrectly rounds to -0.1231. A solution where 5 should round up (that is, towards positive infinity) would be to use floor instead of int.
use POSIX qw(floor);
sub round {
    my $numer = shift;
    my $precision = shift;
    return floor($numer * 10 ** $precision + 0.5) * 10 ** -$precision;
}
If instead 5 should round to the largest absolute value (that is, round away from 0), then you can add a simple check for negative numbers to round them in the correct direction.
sub round {
    my $numer = shift;
    my $precision = shift;
    my $direction = $numer >= 0 ? 0.5 : -0.5;
    return int($numer * 10 ** $precision + $direction) * 10 ** -$precision;
}
More complex rounding rules are used in some circumstances, where the bias from always rounding 5 in one direction is unacceptable. Any (decimal) rounding of floating point numbers is also subject to possible errors due to imprecision in their binary format.
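To see how the three variants differ, they can be run side by side. The names round_int, round_floor and round_away are made up here so that all three can live in one script; the bodies are the same expressions as above.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# The three variants from above under hypothetical distinct names.
sub round_int   { my ($n, $p) = @_; return int($n * 10 ** $p + 0.5) * 10 ** -$p }
sub round_floor { my ($n, $p) = @_; return floor($n * 10 ** $p + 0.5) * 10 ** -$p }
sub round_away  {
    my ($n, $p) = @_;
    my $direction = $n >= 0 ? 0.5 : -0.5;
    return int($n * 10 ** $p + $direction) * 10 ** -$p;
}

# Sample inputs chosen away from exact ties, so binary representation does not matter.
for my $n (0.12346, -0.12324, -0.12326) {
    printf "%9.5f  int: %.4f  floor: %.4f  away: %.4f\n",
        $n, round_int($n, 4), round_floor($n, 4), round_away($n, 4);
}
```

Inputs that sit exactly on a tie in decimal, such as 0.12325, may still go either way, because the scaled value is computed in binary floating point.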
Perl's sprintf can only be as good as the data it's fed.
There is no floating point number exactly equal to either 0.12345 or 0.12325:
The floating point number closest to 0.12345 is exactly 2223877495995551/2**54 ≈ 0.123450000000000004, a value slightly greater than the input.
The floating point number closest to 0.12325 is exactly 4440549232587309/2**55 ≈ 0.123249999999999998, a value slightly less than the input.
Note that the 5th decimal digit of the latter one is 4 not 5.
When rounded to 4 decimal places under the usual rules, they must come out as 0.1235 and 0.1232 respectively.
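The stored values can be inspected by asking for more digits than the literal had. This is a small sketch, and the helper name stored_value is an assumption for illustration; the exact trailing digits depend on the platform's C library.

```perl
use strict;
use warnings;

# Show the value actually stored for a decimal literal, to 20 decimal places.
sub stored_value { return sprintf('%.20f', $_[0]) }

for my $x (0.12345, 0.12325) {
    printf "%s is stored as %s and prints as %.4f\n", $x, stored_value($x), $x;
}
```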
You say that using Math::BigFloat is a little overkill, but it seems that what you really want is decimal arithmetic so the overkill is inescapable.
This did work for me:
print sprintf '%.*f', 4, 0.12325
Your code does work for me too. Did you test that code on its own?
Could you tell us your perl version too (perl -v)?

How can I write in scientific notation using Perl formats?

I've always used printf, and I've never used write/format. Is there any way to reproduce printf("%12.5e", $num) using a format? I'm having trouble digesting the perlform documentation, but I don't see a straightforward way of doing this.
EDIT: based on the answers I got, I'm just gonna keep on using printf.
Short answer, don't use formats.
Unresearched answer, sure, just use sprintf:
#!/usr/bin/perl
use strict;
use warnings;
our $num = .005;
write;
format STDOUT =
@>>>>>>>>>>>>>>>>>
sprintf("%12.5e", $num)
.
Seriously, if you need something like Perl 5 formats, take a look at Perl6::Form (note, this is a Perl 5 module, it just implements the proposed Perl 6 version of formats).
I totally agree with Chas. Owens on formats in general. Format was really slick 15 years ago, but format has not kept up with the advancements of the rest of Perl.
Here is a technique for line-oriented output that I use from time to time. You can use formline, which is one of the public internal functions used by format. Format is page oriented. It is very hard to do things like span columns or change the format by line depending on the data. You can format a single line using the same text formatting logic used by format and then output that result yourself.
A (messy) example:
use strict; use warnings;
sub print_line {
    my $pic=shift;
    my @args=@_;
    formline($pic,@args);
    print "$^A\n";
    $^A='';
}
my ($wlabel, $wlow, $whigh, $wavg)=(0,0,0,0);
my ($plabel,$plow,$phigh, $pavg);
my ($s_low,$s_high,$s_avg)=qw(%.2f %.2e %.2f);
my @results=( ["Label 1", 3.445, 0.00006678, .025],
              ["Label 2", 12.5555556, 55.112, 1.11],
              ["Wide Label 3", 1231.11, 1555.0, 66.66] );
foreach (@results) {
    my $tmp;
    $tmp=length($_->[0]);
    $wlabel=$tmp if $tmp>$wlabel;
    $tmp=length(sprintf($s_low,$_->[3]));
    $wlow=$tmp if $tmp>$wlow;
    $tmp=length(sprintf($s_high,$_->[2]));
    $whigh=$tmp if $tmp>$whigh;
    $tmp=length(sprintf($s_avg,$_->[1]));
    $wavg=$tmp if $tmp>$wavg;
}
print "\n\n";
my @a1=("Label", "Rate - Operations / sec");
my @a2=("Text", "Average", "High", "Low");
my @a3=("----------", "-------", "----", "---");
my $l1fmt="@".'|' x $wlabel." @".'|'x($whigh+$wavg+$wlow+6);
my $l2fmt="@".'|' x $wlabel." @".'|' x $wavg." @".'|' x $whigh .
    " @".'|' x $wlow;
print_line($l1fmt,@a1);
print_line($l2fmt,@a2);
print_line($l2fmt,@a3);
$plabel="@".'>' x $wlabel;
$phigh="@".'>' x $whigh;
$pavg="@".'>' x $wavg;
$plow="@".'<' x $wlow;
foreach (@results) {
    my $pic="$plabel $pavg $phigh $plow";
    my $mark=$_->[0];
    my $avg=sprintf($s_avg,$_->[1]);
    my $high=sprintf($s_high,$_->[2]);
    my $low=sprintf($s_low,$_->[3]);
    print_line($pic,$mark,$avg,$high,$low);
}
print "\n\n";
Outputs this:
    Label       Rate - Operations / sec
    Text      Average    High     Low
 ----------   -------    ----     ---
      Label 1     3.44  6.68e-05 0.03
      Label 2    12.56  5.51e+01 1.11
 Wide Label 3  1231.11  1.56e+03 66.66
Notice that the width of the columns is set based on the width of the data as formatted by the sprintf format string. You can then left, center, right justify that result. The "Low" data column is left justified, the rest of the data are right justified. You can change this by the symbol used in the scalar $plow and it is the same as format syntax. The labels at the top are centered and the "Rate - Operations / sec" label spans 3 columns.
This is obviously not "production ready" code, but you get the drift I think. You would need to further check the total width of the columns against desired width, etc. You have to manually do some of the work that format does for you, but you have far more flexibility with this approach. It is very easy to use this method for several sections of a line with sprintf for example.
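Stripped down to the original question (a number in scientific notation inside a format-style field), the same idea might look like the sketch below. The helper name format_line and the picture string are made up for illustration.

```perl
use strict;
use warnings;

# Format one line with a format-style picture and return it as a string.
sub format_line {
    my ($picture, @args) = @_;
    $^A = '';                     # clear the format accumulator
    formline($picture, @args);    # appends the formatted text to $^A
    my $line = $^A;
    $^A = '';
    return $line;
}

my $num  = 0.005;
# Left-justified 8-character label, then a right-justified 12-character field.
my $line = format_line('@<<<<<<< @>>>>>>>>>>>', 'value', sprintf('%.5e', $num));
print "$line\n";
```

The number is turned into a string by sprintf first, so the picture only has to justify text and never needs to know about exponents.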
Cheers.