Parse scientific integer representation in perl - perl

What is the most elegant way to parse an integer given in scientific representation, i.e. I have an input file with lines like
value=1.04738e+06
Sure, I can match all the components (leading digit, decimal positions, exponent) and calculate the result, but it seems to me there should be a more straightforward way.

% perl -e 'print "1.04738e+06" + 0'
1047380
You just need to coerce it to a number and Perl will DWIM.

FYI: looks_like_number() from Scalar::Util might come in handy.
#!/usr/bin/env perl
use strict;
use warnings;
use Scalar::Util qw( looks_like_number );
my $line = "value=1.04738e+06";
my ( $tag, $value ) = split /\s*=\s*/, $line, 2;
if( looks_like_number( $value ) ){
$value = 0 + $value;
}
print "$tag=$value\n";
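If the value has to end up as an integer string rather than just "a number", the same check can be wrapped in a small helper. This is only a sketch: parse_int_value is a made-up name, and rounding with sprintf '%.0f' is my assumption about what you want for inputs that are not whole numbers.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Scalar::Util qw( looks_like_number );

# Hypothetical helper: take a "tag=value" line and return the value as an
# integer string, or undef when the value is missing or not numeric.
sub parse_int_value {
    my ($line) = @_;
    my ( undef, $value ) = split /\s*=\s*/, $line, 2;
    return unless defined $value;
    $value =~ s/\s+\z//;                 # drop a trailing newline or spaces
    return unless looks_like_number($value);
    return sprintf '%.0f', $value;       # assumption: round to nearest integer
}

print parse_int_value("value=1.04738e+06"), "\n";   # prints 1047380
```

Lines that do not carry a number come back as undef, so the caller can skip them.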

Related

Multiplying floats and ints in perl through an if statement

My goal is to use Perl to multiply a float and an int. I have got this far and am still researching; many thanks for any help.
#!/usr/bin/perl
$float1 = 0.90
print "give me an integer";
$that_integer = <>;
if ($that_integer<=5000) {
print "$that_integer * $float1";
}
Welcome to Perl. A few tips:
Always include use strict; and use warnings; at the top of EVERY Perl script.
chomp your input from <STDIN> to remove the newline at the end.
You can't interpolate expressions inside a double-quoted string. However, you can easily include their values in a string using printf.
As demonstrated:
#!/usr/bin/perl
use strict;
use warnings;
my $float1 = 0.90;
print "give me an integer: ";
chomp( my $that_integer = <> );
if ( $that_integer <= 5000 ) {
printf "%f\n", $that_integer * $float1;
}
Arbitrary expressions can't be interpolated into double-quoted strings. Try:
print $that_integer * $float1, "\n";
The perlop documentation page includes all the gory details of parsing quoted constructs.
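For comparison, here are three common ways to get the value of an expression into the output. The input value 4000 is just an assumed example; the third form relies on the fact that an array dereference does interpolate, so the expression is evaluated inside an anonymous array.

```perl
use strict;
use warnings;

my $that_integer = 4000;    # assumed example input
my $float1       = 0.90;

# 1. Pass the expression to print as a separate list element.
print "product: ", $that_integer * $float1, "\n";

# 2. Let printf format the computed value.
printf "product: %.2f\n", $that_integer * $float1;

# 3. Interpolate an anonymous array that holds the result.
my $s = "product: @{[ $that_integer * $float1 ]}";
print "$s\n";
```

The first two are usually easier to read; the third is handy inside here-docs.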

Converting to unicode characters in Perl?

I want to convert the text ( Hindi ) to Unicode in Perl. I have searched in CPAN. But, I could not find the exact module/way which I am looking for. Basically, I am looking for something like this.
My Input is:
इस परीक्षण के लिए है
My expected output is:
\u0907\u0938\u0020\u092a\u0930\u0940\u0915\u094d\u0937\u0923\u0020\u0915\u0947\u0020\u0932\u093f\u090f\u0020\u0939\u0948
How to achieve this in Perl?
Give me some suggestions.
Try this
use utf8;
my $str = 'इस परीक्षण के लिए है';
for my $c (split //, $str) {
printf("\\u%04x", ord($c));
}
print "\n";
You don't really need any module to do that. ord to extract the character code and sprintf to format it as four-digit zero-padded hex are more than enough:
use utf8;
my $str = 'इस परीक्षण के लिए है';
(my $u_encoded = $str) =~ s/(.)/sprintf "\\u%04x", ord($1)/sge;
# \u0907\u0938\u0020\u092a\u0930\u0940\u0915\u094d\u0937\u0923\u0020\u0915\u0947\u0020\u0932\u093f\u090f\u0020\u0939\u0948
Because I left a few comments on how the other answers might fall short of the expectations of various tools, I'd like to share a solution that encodes characters outside of the Basic Multilingual Plane as pairs of two escapes: "😃" would become \ud83d\ude03.
This is done by:
Encoding the string as UTF-16, without a byte order mark. We explicitly choose an endianness. Here, we arbitrarily use the big-endian form. This produces a string of octets (“bytes”), where two octets form one UTF-16 code unit, and two or four octets represent a Unicode code point.
This is done for convenience and performance; we could just as well determine the numeric values of the UTF-16 code units ourselves.
Unpacking the resulting binary string into 16-bit integers which represent each UTF-16 code unit. We have to respect the correct endianness, so we use the n* pattern for unpack (i.e. 16-bit big-endian unsigned integer).
Formatting each code unit as an \uxxxx escape.
As a Perl subroutine, this would look like
use strict;
use warnings;
use Encode ();
sub unicode_escape {
    my ($str) = @_;
    my $UTF_16BE_octets = Encode::encode("UTF-16BE", $str);
    my @code_units = unpack "n*", $UTF_16BE_octets;
    return join '', map { sprintf "\\u%04x", $_ } @code_units;
}
Test cases:
use Test::More tests => 3;
use utf8;
is unicode_escape(''), '',
'empty string is empty string';
is unicode_escape("\N{SMILING FACE WITH OPEN MOUTH}"), '\ud83d\ude03',
'non-BMP code points are escaped as surrogate halves';
my $input = 'इस परीक्षण के लिए है';
my $output = '\u0907\u0938\u0020\u092a\u0930\u0940\u0915\u094d\u0937\u0923\u0020\u0915\u0947\u0020\u0932\u093f\u090f\u0020\u0939\u0948';
is unicode_escape($input), $output,
'ordinary BMP code points each have a single escape';
If you want only a simple converter, you can use the following filter
perl -CSDA -nle 'printf "\\u%*v04x\n", "\\u",$_'
#or
perl -CSDA -nlE 'printf "\\u%04x",$_ for unpack "U*"'
like:
echo "इस परीक्षण के लिए है" | perl -CSDA -ne 'printf "\\u%*v04x\n", "\\u",$_'
#or
perl -CSDA -ne 'printf "\\u%*v04x\n", "\\u",$_' <<< "इस परीक्षण के लिए है"
prints:
\u0907\u0938\u0020\u092a\u0930\u0940\u0915\u094d\u0937\u0923\u0020\u0915\u0947\u0020\u0932\u093f\u090f\u0020\u0939\u0948\u000a
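The one-liners lean on sprintf's vector flag: v formats the ordinal of every character in the string, and a preceding * takes the join string from the argument list. A small sketch of the same idea as a script follows; note that it writes the zero-fill flag before the vector flag (%0*v4x), which is the order shown in the sprintf documentation, and that the sample string is my own.

```perl
use strict;
use warnings;

# Sample input: two Devanagari letters, a space and an ASCII letter.
my $str = "\x{907}\x{938} A";

# "\\u" before the conversion supplies the first escape prefix; the "\\u"
# argument is the join string placed between the formatted ordinals.
my $escaped = sprintf "\\u%0*v4x", "\\u", $str;

print "$escaped\n";    # prints \u0907\u0938\u0020\u0041
```

Like the one-liners, this emits one escape per code point, so characters above U+FFFF are not split into surrogate pairs.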
Unicode with surrogate pairs.
use strict;
use warnings;
use utf8;
use open qw(:std :utf8);
my $str = "if( \N{U+1F42A}+\N{U+1F410} == \N{U+1F41B} ){ \N{U+1F602} = \N{U+1F52B} } # ορισμός ";
print "$str\n";
for my $ch (unpack "U*", $str) {
if( $ch > 0xffff ) {
my $h = ($ch - 0x10000) / 0x400 + 0xD800;
my $l = ($ch - 0x10000) % 0x400 + 0xDC00;
printf "\\u%04x\\u%04x", $h, $l;
}
else {
printf "\\u%04x", $ch;
}
}
print "\n";
prints
if( 🐪+🐐 == 🐛 ){ 😂 = 🔫 } # ορισμός
\u0069\u0066\u0028\u0020\ud83d\udc2a\u002b\ud83d\udc10\u0020\u003d\u003d\u0020\ud83d\udc1b\u0020\u0029\u007b\u0020\ud83d\ude02\u0020\u003d\u0020\ud83d\udd2b\u0020\u007d\u0020\u0023\u0020\u03bf\u03c1\u03b9\u03c3\u03bc\u03cc\u03c2\u0020
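The surrogate arithmetic in that loop can also be written with bit operations, which avoids the fractional result of the division. A minimal sketch, with surrogate_pair as a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: split a code point above U+FFFF into its UTF-16
# high and low surrogate code units.
sub surrogate_pair {
    my ($cp) = @_;
    my $offset = $cp - 0x10000;                   # 20-bit value
    my $high   = 0xD800 + ( $offset >> 10 );      # top 10 bits
    my $low    = 0xDC00 + ( $offset & 0x3FF );    # bottom 10 bits
    return ( $high, $low );
}

printf "\\u%04x\\u%04x\n", surrogate_pair(0x1F603);    # prints \ud83d\ude03
```

U+1F603 is the smiling face used in the earlier test case, and it comes out as the same pair.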

How to print in decimal form rather than exponential form in perl

I have written a program in Perl. My requirement is to print only decimal numbers, not exponential numbers. Could you please let me know how to implement this?
My program calculates the expression (1/2)^n, where n is an integer from 1 to 200 only. And only 100 lines should be printed.
Example:
N=1, print 0.5
N=2, print 0.25
My program looks like:
#!/usr/bin/perl
use strict;
use warnings;
my $exp;
my $num;
my $count_lines = 0;
while($exp = <>)
{
next if($exp =~ m/^$/);
if($exp > 0 and $exp <=200 and $count_lines < 100)
{
$num = 1/(2 ** $exp);
print $num,"\n";
$count_lines++;
}
}
Input values:
If N = 100, then the output is printed in exponential form. But the requirement is that it be printed in decimal form.
A simple print will pick the "best" format to display the value, so it chooses scientific format for very large or very small numbers to avoid printing a long string of zeroes.
But you can use printf (the format specifiers are documented here) to format a number however you want.
0.5^200 is a very small number, so you need around 80 decimal places
use strict;
use warnings;
# The counter lives outside the loop so the 100-line limit is actually enforced
my $count_lines = 0;
while (my $exp = <>) {
    next unless $exp =~ /\S/;
    if ($exp > 0 and $exp <= 200 and $count_lines < 100) {
        my $num = 1 / (2 ** $exp);
        printf "%.80f\n", $num;
        $count_lines++;
    }
}
output for 100
0.00000000000000000000000000000078886090522101181000000000000000000000000000000000
and for 200
0.00000000000000000000000000000000000000000000000000000000000062230152778611417000
If you would like to remove insignificant trailing zeroes then you can use sprintf to put the formatted number into a variable and then use s/// to delete trailing zeroes, like this
my $number = sprintf "%.80f", $num;
$number =~ s/0+$//;
print $number, "\n";
which gives
0.00000000000000000000000000000078886090522101181
and
0.00000000000000000000000000000000000000000000000000000000000062230152778611417
Note that the true value of the calculation has many more digits than this, and the accuracy of the result is limited by the size of the floating point values that your computer uses.
0.5 ^ 200 is within the range of a double, but how many correct digits printf shows depends on the platform's C library. To get every digit reliably, use Math::BigFloat, which overloads the basic math operations and stringification (so print works as usual), for example:
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigFloat;
my $x = Math::BigFloat->new('0.5');
my $y = Math::BigFloat->new('200');
print $x ** $y, "\n";
Or use bignum:
#!/usr/bin/perl
use strict;
use warnings;
use bignum;
print 0.5 ** 200, "\n";
Output:
$ perl t.pl
0.00000000000000000000000000000000000000000000000000000000000062230152778611417071440640537801242405902521687211671331011166147896988340353834411839448231257136169569665895551224821247160434722900390625
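Since (1/2)^n equals 5^n / 10^n, the exact digits can also be produced with integer arithmetic alone: take the digits of 5^n and left-pad them with zeroes to n decimal places. This is a different technique from the Math::BigFloat code above, shown only as a sketch; half_power is a hypothetical name.

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: exact decimal expansion of (1/2)**$n for integer $n >= 1.
sub half_power {
    my ($n) = @_;
    my $digits = Math::BigInt->new(5)->bpow($n)->bstr;    # 5**n as a digit string
    # 5**n never has more than $n digits, so the pad count is never negative.
    return '0.' . ( '0' x ( $n - length $digits ) ) . $digits;
}

print half_power($_), "\n" for 1 .. 4;
# prints 0.5, 0.25, 0.125 and 0.0625, one per line
```

For n = 200 this gives a string with exactly 200 decimal places and no trailing zeroes to strip.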
You can use printf or sprintf to specify the format of what you want to print out.
#!/usr/bin/perl
use strict;
use warnings;
my $num = 0.000000123;
printf("%.50f\n", $num);
If you need something like Perl 5 formats, take a look at Perl6::Form (note, this is a Perl 5 module, it just implements the proposed Perl 6 version of formats).

Why do these two methods to determine the number of print columns behave differently?

With these Unicode ranges, Unicode::GCString's columns returns the number of print columns while mbswidth from Text::CharWidth doesn't.
Do they behave differently because they use different databases?
#!/usr/bin/env perl
use warnings;
use strict;
use open qw(:std :utf8);
use Text::CharWidth qw(mbswidth); # 0.04
use Unicode::GCString; # 2012.10
for my $hex ( 0x0378 .. 0xd7ff, 0xfa2e .. 0xfdcf, 0xfdfe .. 0xfff8 ) {
my $chr = chr $hex;
if ( mbswidth( $chr ) == -1 ) { # -1 invalid data
my $gcs = Unicode::GCString->new( $chr );
my $width = $gcs->columns;
printf "%04x - %d : %s\n", $hex, $width, $chr;
}
}
Text::CharWidth uses the C library function wcwidth which depends on the OS and current locale. Unicode::GCString uses the sombok library. The latter seems to be regularly updated to the latest Unicode versions, so I'd consider it to be accurate.

Calculate 100 factorial with all the digits

I came across a problem of calculating 100 factorial.
Here is what I tried first in Perl to calculate 100! :
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigInt;
my $n=<>;
chomp($n);
print fac($n);
sub fac
{
my ($m) = @_;
return 1 if($m <=1 );
return $m*fac($m-1);
}
But this is giving me 9.33262154439441e+157.
I need the answer with all of the digits.
What do I do?
Doubles (which most Perls use) only have ~16 digits of precision. You need to use another system to get the 158 digits of precision you need.
use bigint;
This will cause Perl to automatically treat all numbers in your script as Math::BigInt objects.
If you need finer control (to treat some numbers as BigInt and some numbers as floating point) then see Krishnachandra Sharma's solution and explicitly use the Math::BigInt constructor.
Math::BigInt has a builtin factorial function, by the way:
$ perl -MMath::BigInt -e 'print Math::BigInt->bfac(100)'
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
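If you would rather not switch the whole script to bigint, it is enough to make the accumulator a Math::BigInt object. A minimal sketch of that finer-grained approach, using an iterative fac instead of the recursive one from the question:

```perl
use strict;
use warnings;
use Math::BigInt;

sub fac {
    my ($n) = @_;
    my $result = Math::BigInt->new(1);    # only the accumulator is a big integer
    $result *= $_ for 2 .. $n;            # overloaded *= keeps every digit
    return $result;
}

print fac(20), "\n";    # prints 2432902008176640000
```

The loop counter stays an ordinary Perl integer; the overloaded multiplication promotes each product to Math::BigInt.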
Doubles (which most Perls use) only have ~16 digits of precision. You need to use another system to get the 158 digits of precision you need. Try using Math::BigInt.
Here is the code.
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigInt;
my $n = Math::BigInt->new(100);   # keep the object; a bare Math::BigInt->new($n) call is discarded
print fac($n);
sub fac
{
my ($m) = @_;
return 1 if($m <=1 );
return Math::BigInt->new($m*fac($m-1));
}
Produces 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
bigint works by overloading the handling of integer and floating point literals, converting them to Math::BigInt objects. So with the help of a simple for loop we can compute the factorial of very big integers.
use bigint;
my $fact = 1;
for my $n (1..100) {
$fact *= $n;
}
print "Factorial: \n", $fact , "\n";
This produces the below output:
Factorial: 933262154439441526816992388562667004907159682643816214685929638952175
99993229915608941463976156518286253697920827223758251185210916864000000000000000
000000000
whereas an ordinary program using native integer arithmetic, like this one, overflows and produces no meaningful output
use integer;
my $fact = 1;
for my $n (1..100) {
$fact *= $n;
}
print "Factorial: \n", $fact , "\n";
Output:
Factorial:
0
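The 0 is not random. Assuming native integers are 64 bits wide and wrap around on overflow (typical, but platform dependent), only the low 64 bits of the product survive, and 100! is divisible by a much higher power of two than 2^64. Counting the factors of two with Legendre's formula shows this; prime_power_in_factorial is a hypothetical helper name.

```perl
use strict;
use warnings;

# Count how many times the prime $p divides $n! (Legendre's formula):
# floor(n/p) + floor(n/p**2) + floor(n/p**3) + ...
sub prime_power_in_factorial {
    my ( $n, $p ) = @_;
    my $count = 0;
    for ( my $q = $p ; $q <= $n ; $q *= $p ) {
        $count += int( $n / $q );
    }
    return $count;
}

print prime_power_in_factorial( 100, 2 ), "\n";    # prints 97
```

With 97 factors of two, the lowest 64 bits of 100! are all zero, which is why the wrapped result prints as 0.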