Using big numbers in Perl - perl

I have a scenario where I take 2 very big binary strings (having 100 characters) and I need to add them.
The issue is that I am getting the answer in the form 2.000xxxxxxxxxxe+2, whereas I need the precise answer, as another 100 character long string.
chomp($str1=<STDIN>);
chomp($str2=<STDIN>);
print "Str 1 is $str1\n";
print "Str 2 is $str2\n";
$t = $str1 + $str2;
print "Sum is $t\n";
Sample Input
1001101111101011011100101100100110111011111011000100111100111110111101011011011100111001100011111010
1001101111101011011100101100100110111011111011000100111100111110111101011011011100111001100011111010
Sample Output
Str1 is
1001101111101011011100101100100110111011111011000100111100111110111101011011011100111001100011111010
Str2 is
1001101111101011011100101100100110111011111011000100111100111110111101011011011100111001100011111010
Sum is
2.0022022220202e+099

As already suggested, you can use Math::BigInt core module,
use Math::BigInt;
# chomp($str1=<STDIN>);
# chomp($str2=<STDIN>);
# print "Str 1 is $str1\n";
# print "Str 2 is $str2\n";
my $t = Math::BigInt->new("0b$str1") + Math::BigInt->new("0b$str2");
print $t->as_bin;

In order to perform arithmetic on your strings, Perl converts them to floating-point numbers, which are inherently imprecise. If you want to avoid that, use Math::BigInt as already suggested ... or roll your own.
######## WARNING/GUARANTEE: This is practically certain to be
# slower, buggier, less portable, and less secure than Math::BigInt.
# In fact, I planted a security hole just to prove a point. Enjoy.
use strict;
use warnings;
sub addition {
my ($int1, $int2) = #_;
my #int1 = reverse split //, $int1;
my #int2 = reverse split //, $int2;
my $len = scalar(#int1>#int2 ? #int1 : #int2);
my #result;
my $carry = 0;
for (my $i=0; $i < $len; ++$i)
{
$int1[$i] |= 0;
$int2[$i] |= 0;
my $sum = $carry + $int1[$i] + $int2[$i];
if ($sum >= 10)
{
$carry = int($sum / 10);
$sum %= 10;
}
push #result, $sum;
}
push #result, $carry if $carry;
return join ('', reverse #result);
}

Related

Count Characters in Perl

I need to count the letter "e" in the string
$x="012ei ke ek ek ";
So far, I've tried with a for-loop:
$l=length($x);
$a=0;
for($i=0;$i<$l;$i++)
{$s=substr($x,$i,1);
if($s=="e")
{$a++;}
print $a;
Your code has some problems. You forgot to close the for loop brace,
and in Perl == is supposed to compare numbers. Use eq for strings.
It is also recommended that you use warnings and enable strict mode,
which would have helped you debugging this. In your case, since e
would be treated as 0, so the other one char substrings, 1 and 2
would be the only characters not equal to e when compared with ==. A
cleaned up version of your code could be written as:
use warnings;
use strict;
my $x = "012ei ke ek ek ";
my $l = length $x;
my $count = 0;
for(my $i = 0; $i < $l; $i++) {
my $s = substr($x, $i, 1);
$count++ if ($s eq "e");
}
print $count;
There are multiple ways to achieve this. You could use a match with a
group, which if global returns all the occurrences in list context.
Since you want the number, take this result in scalar context. You can
achieve this for example with:
my $count = () = $string =~ /(e)/g;
Or:
my $count = #{[ $string =~ /(e)/g ]}
Another way is to split the string into characters and grep those that
are e:
my $count = grep $_ eq 'e', split //, $string;
And probably the most compact is to use tr which returns the count of
characters in scalar context, although this does restrict this usage to
counting characters only:
my $count = $string =~ tr/e//;
You compare characters with the numeric operator (==) when you should use the string comparison eq. If you had used the warnings pragma you would have seen that.
You code should have looked like:
#!/usr/bin/env perl
use strict;
use warnings;
my $x = "012ei ke ek ek ";
my $l = length($x);
my $a = 0;
for ( my $i = 0; $i < $l; $i++ ) {
my $s = substr( $x, $i, 1 );
if ( $s eq "e" ) {
$a++;
}
}
print "$a\n";
Proper indentation and the use of the strict and warnings pragmas will avoid and/or catch unintentional, dumb errors.
A much more Perl-ish (and shorter) way to achieve your answer is:
perl -le '$x="012ei ke ek ek";#count=$x=~m/e/g;print scalar #count'
4
This matches globally and collects all the matches in list context. The scalar value of the list gives the number of occurrences you seek.
Another way is to use tr
perl -le '$x="012ei ke ek ek";print scalar $x=~tr/e//'
4
#sidyll Already mentioned what is the problem in your script and all of the possible ways, but TIMTOWTDI.
$x="012ei ke ek ek ";
my $count;
$count++ while($x=~/e/g);
print $count;

Scoping in Perl

As a biology student, I'm trying to extend my programming knowledge and I ran into a problem with Perl.
I'm trying to create a program that generates random DNA strings and performs analysis work on the generated data.
In the first part of the program, I am able to print out the strings stored in the array, but the second part I cannot retrieve all but one of the elements of the array.
Could this be part of the scoping rules of Perl?
#!usr/bin/perl
# generate a random DNA strings and print it to file specified by the user.
$largearray[0] = 0;
print "How many nucleotides for the string?\n";
$n = <>;
$mylong = $n;
print "how many strings?\n";
$numstrings = <>;
# #largearray =();
$j = 0;
while ( $j < $numstrings ) {
$numstring = ''; # start with the empty string;
$dnastring = '';
$i = 0;
while ( $i < $n ) {
$numstring = int( rand( 4 ) ) . $numstring; # generate a new random integer
# between 0 and 3, and concatenate
# it with the existing $numstring,
# assigning the result to $numstring.
$i++; # increase the value of $i by one.
}
$dnastring = $numstring;
$dnastring =~ tr/0123/actg/; # translate the numbers to DNA characters.
#print $dnastring;
#print "\n";
$largearray[j] = $dnastring; #append generated string to end of array
#print $largearray[j];
#print $j;
#IN HERE THERE ARE GOOD ARRAY VALUES
#print "\n";
$j++;
}
# ii will be used to continuously take the next couple of strings from largearray
# for LCS matching.
$mytotal = 0;
$ii = 0;
while ( $ii < $numstrings ) {
$line = $largearray[ii];
print $largearray[ii]; #CANNOT RETRIEVE ARRAY VALUES
print "\n";
$ii++;
#string1 = split( //, $line );
$line = $largearray[ii];
#print $largearray[ii];
#print "\n";
$ii++;
chomp $line;
#string2 = split( //, $line );
$n = #string1; #assigning a list to a scalar just assigns the
#number of elements in the list to the scalar.
$m = #string2;
$v = 1;
$Cm = 0;
$Im = 0;
$V[0][0] = 0; # Assign the 0,0 entry of the V matrix
for ( $i = 1; $i <= $n; $i++ ) { # Assign the column 0 values and print
# String 1 See section 5.2 of Johnson
# for loops
$V[$i][0] = -$Im * $i;
}
for ( $j = 1; $j <= $m; $j++ ) { # Assign the row 0 values and print String 2
$V[0][$j] = -$Im * $j;
}
for ( $i = 1; $i <= $n; $i++ ) { # follow the recurrences to fill in the V matrix.
for ( $j = 1; $j <= $m; $j++ ) {
# print OUT "$string1[$i-1], $string2[$j-1]\n"; # This is here for debugging purposes.
if ( $string1[ $i - 1 ] eq $string2[ $j - 1 ] ) {
$t = 1 * $v;
}
else {
$t = -1 * $Cm;
}
$max = $V[ $i - 1 ][ $j - 1 ] + $t;
# print OUT "For $i, $j, t is $t \n"; # Another debugging line.
if ( $max < $V[$i][ $j - 1 ] - 1 * $Im ) {
$max = $V[$i][ $j - 1 ] - 1 * $Im;
}
if ( $V[ $i - 1 ][$j] - 1 * $Im > $max ) {
$max = $V[ $i - 1 ][$j] - 1 * $Im;
}
$V[$i][$j] = $max;
}
} #outer for loop
print $V[$n][$m];
$mytotal += $V[$n][$m]; # append current result to the grand total
print "\n";
} # end while loop
print "the average LCS value for length ", $mylong, " strings is: ";
print $mytotal/ $numstrings;
This isn't a scoping issue. You have declared none of your variables, which has the effect of implicitly making them all global and accessible everywhere in your code
I reformatted your Perl program so that I could read it, and then added this to the top of your program
use strict;
use warnings 'all';
which are essential in every Perl program you write
Then I added
no strict 'vars';
which is a very bad idea, and lets you get away without declaring any variables
The result is this
Argument "ii" isn't numeric in array element at E:\Perl\source\dna.pl line 60.
Argument "ii" isn't numeric in array element at E:\Perl\source\dna.pl line 61.
Argument "ii" isn't numeric in array element at E:\Perl\source\dna.pl line 67.
Argument "j" isn't numeric in array element at E:\Perl\source\dna.pl line 42.
Bareword "ii" not allowed while "strict subs" in use at E:\Perl\source\dna.pl line 60.
Bareword "ii" not allowed while "strict subs" in use at E:\Perl\source\dna.pl line 61.
Bareword "ii" not allowed while "strict subs" in use at E:\Perl\source\dna.pl line 67.
Bareword "j" not allowed while "strict subs" in use at E:\Perl\source\dna.pl line 42.
Execution of E:\Perl\source\dna.pl aborted due to compilation errors.
Line 42 (of my reformatted version) is
$largearray[j] = $dnastring
and lines 60, 61 and 67 are
$line = $largearray[ii];
print $largearray[ii]; #CANNOT RETRIEVE ARRAY VALUES
and
$line = $largearray[ii];
You are using j and ii as array indexes. Those are Perl subroutine calls, not variables. Adding use strict would have stopped this from compiling unless you had also declared sub ii and sub j
You might get away with it if you just change j and ii to $j and $ii, but you are certain to get into further problems
Please make the same changes to your own code, and declare every variable that you need using my as close as possible to the first place they are used
You should also improve your variable naming. Things like #largearray are pointless: the # says that it's an array, and whether it's large or not is relative, and of little use in understanding your code. If you have no better description of its purpose then #table or #data are probably a little better
Likewise, please avoid capital letters and most single-letter names. #V, $Cm and $Im are meaningless, and you would need fewer comments if those names were better
You certainly wouldn't need comments like # end while loop and # outer for loop if you had indented your blocks properly and kept them short enough so that both the beginning and the end can be seen on the screen at the same time, and the fewer comments you can get away with the better, because they badly clutter the code structure
Finally, it's worth noting that the C-style for loop is rarely the best choice in Perl. Your
for ( $i = 1; $i <= $n; $i++ ) { ... }
is much clearer as
for my $i ( 1 .. $n ) { ... }
and declaring the control variable at that point makes it unnecessary to invent new names like $ii for each new loop
I think you have a typo in your code:
ii => must be $ii
don't forget to put this at the beginning of your code:
use strict;
use warnings;
in order to avoid this (and others) kind of errors

Looking for a perl scripts to make addition and pick last number

I am trying to write a simple perl script to learn Perl. This is the first script I have written using user input. The script needs to get a last numbers after make addition function Any help would be appreciated. Below is what I have so far.
Example user input 9423 and then the scripts were make addition function like below
09+04=13
04+02=06
02+03=05
03+09=12
print "Enter 4 Digits Number";#9423
chomp($number = <STDIN>);
EDIT
How to pick a last 2 digits numbers so the results are 3652
#!/usr/bin/perl
my #nums = ("9423" =~ /(\d{1})/g);
my $a = $nums[0];
my $b = $nums[1];
my $c = $nums[2];
my $d = $nums[3];
my $ab= $a+$b;
my $bc= $b+$c;
my $cd= $c+$d;
my $da= $d+$a;
printf "%02d\n%02d\n%02d\n%02d\n", $ab, $bc, $cd, $da;
I think, here code will output your expected result
#!/usr/bin/perl
use warnings;
use strict;
print "Enter 4 Digits Number:";#9423
chomp(my $number = <STDIN>);
my #digits = split("", $number);
my #lasts;
# add first digit to the last position
#digits = 94239
$digits[$#digits + 1] = $digits[0];
for(my $i = 0; $i < $#digits; $i++){
$lasts[$i] = ($digits[$i] + $digits[$i + 1]) % 10;
}
print join "",#lasts,"\n";
output after enter number 9423:
3652
probably easiest way to do it is use a for loop and substring
EDIT (now tested):
for my $i (0 .. length($number)-2) {
print substr($number, $i, 1) + substr($number, $i+1, 1);
}
Perhaps the following will be helpful:
use strict;
use warnings;
print "Enter a 4 Digit Number: ";
chomp( my $num = <STDIN> );
my #nums = split //, $num;
$nums[ $#nums + 1 ] = $nums[0];
( $nums[$_] + $nums[ $_ + 1 ] ) =~ /(.)$/ and print $1 for 0 .. $#nums - 1;
Output after entering 9423:
3652

Perl Recursion and Functions

Having heard about Perl for year I decided to give it a few hours of my time to see how much I could pick up. I got through the basics fine and then got to loops. As a test I wanted to see if I could build a script to recurse through all alphanumerical values of up to 4 characters. I had written a PHP code that did the same thing some time ago so I took the same concept and used it. However when I run the script it puts "a" as the first 3 values and then only loops through the last digit. Anyone see what I am doing wrong?
#!/usr/local/bin/perl
$chars = "abcdefghijklmnopqrstuvwxyz";
$chars .= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
$chars .= "0123456789";
#charset = split(//, $chars);
$charset_length = scalar(#charset);
sub recurse
{
($width, $position, $base_string) = #_;
for ($i = 0; $i < $charset_length; ++$i) {
$base = $base_string . $charset[$i];
if ($position < $width - 1) {
$pos = $position + 1;
recurse($width, $pos, $base);
}
print $base;
print "\n";
}
}
recurse(4, 0, '');
This is what I get when I run it:
aaaa
aaab
aaac
aaad
aaae
aaaf
aaag
aaah
aaai
aaaj
aaak
aaal
aaam
aaan
aaao
aaap
aaaq
aaar
aaas
aaat
aaau
aaav
aaaw
aaax
aaay
aaaz
aaaA
aaaB
aaaC
aaaD
aaaE
aaaF
aaaG
aaaH
aaaI
aaaJ
aaaK
aaaL
aaaM
aaaN
aaaO
aaaP
aaaQ
aaaR
aaaS
aaaT
aaaU
aaaV
aaaW
aaaX
aaaY
aaaZ
aaa0
aaa1
aaa2
aaa3
aaa4
aaa5
aaa6
aaa7
aaa8
aaa9
aaa9
aaa9
aaa9
You've been bitten by non strict scoping, this code does what it should (note the use strict at the top and the subsequent use of my to guarantee variable scoping).
#!/usr/bin/env perl
use strict;
use warnings;
my $chars = "abcdefghijklmnopqrstuvwxyz";
$chars .= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
$chars .= "0123456789";
my #charset = split(//, $chars);
my $charset_length = scalar(#charset);
sub recurse {
my ($width, $position, $base_string) = #_;
for (my $i = 0; $i < $charset_length; ++$i) {
my $base = $base_string . $charset[$i];
if ($position < $width - 1) {
my $pos = $position + 1;
recurse($width, $pos, $base);
}
print $base;
print "\n";
}
}
recurse(4, 0, '');
Already well answered, but a more idiomatic approach would be:
use strict;
use warnings;
sub recurse {
my ($width, $base_string, $charset) = #_;
if (length $base_string) {
print "$base_string\n";
}
if (length($base_string) < $width) {
$recurser->($base_string . $_) for #$charset;
}
}
my #charset = ('a'..'z', 'A'..'Z', '0'..'9');
recurse(4, '', \#charset);
There's no need to pass position; it's implicit in the width of the base string passed in. The charset, on the other hand, should be passed in rather than having the subroutine use an external variable.
Alternatively, since the width and character set stay constant, generate a closure that references them:
use strict;
use warnings;
sub make_recurser {
my ($width, $charset) = #_;
my $recurser;
$recurser = sub {
my ($base_string) = #_;
if (length $base_string) {
print "$base_string\n";
}
if (length($base_string) < $width) {
$recurser->($base_string . $_) for #$charset;
}
}
}
my #charset = ('a'..'z', 'A'..'Z', '0'..'9');
my $recurser = make_recurser(4, \#charset);
$recurser->('');
Alternatively, just:
print "$_\n" for glob(('{' . join(',', 'a'..'z', 'A'..'Z', '0'..'9') . '}') x 4);
It has to do with the scope of the variables, you're still changing the same vars when you're calling the recursion. The keyword 'my' declares the variables local to the subroutine.
(http://perl.plover.com/FAQs/Namespaces.html)
I always use perl with 'use strict;' declared, forcing me to decide on the scope of the variables.
sub recurse {
my ($width, $position, $base_string) = #_;
for (my $i = 0; $i < $charset_length; ++$i) {
my $base = $base_string . $charset[$i];
if ($position < $width - 1) {
my $pos = $position + 1;
recurse($width, $pos, $base);
}
print $base;
print " ";
}
}
You seem to be running into some scoping issues. Perl is very flexible, so it is taking a guess at what you want because you haven't told it what you want. One of the first things you'll learn is to add use strict; as for your first statement after the shebang. It will point out the variables that are not being explicitly defined, as well as any variables that are accessed before being created (helps with misspelled variables, etc).
If you make your code look like this, you'll see why you are getting your errors:
sub recurse {
($width, $position, $base_string) = #_;
for ($i = 0; $i < $charset_length; ++$i) {
$base = $base_string . $charset[$i];
if ($position < $width - 1) {
$pos = $position + 1;
recurse($width, $pos, $base);
}
# print "$base\n";
}
print "$position\n";
}
This should output:
3
3
3
3
Because you are not scoping $position correctly with my, you aren't getting a new variable each recurse, you are re-using the same one. Toss a use strict; in there, and fix the errors you get, and the code should be good.
I realize that you're just tinkering with recursion. But as long as you're having fun comparing implementations between two languages you may as well also see how the CPAN can extend your tool set.
If you don't care about the order, you can generate all 13,388,280 permutations of ( 'a'..'z', 'A..'Z', '0'..'9' ) taken four at a time with the CPAN module, Algorithm::Permute
Here is an example of how that code may look.
use strict;
use warnings;
use Algorithm::Permute;
my $p = Algorithm::Permute->new(
[ 'a' .. 'z', 'A' .. 'Z', '0' .. '9' ], # Set of...
4 # <---- at a time.
);
while ( my #res = $p->next ) {
print #res, "\n";
}
The new() method accepts an array ref that enumerates the character set or list of what to permute. Its second argument is how many at a time to include in the permutation. So you're essentially taking 62 items 4 at a time. Then use the next() method to iterate through the permutations. The rest is just window dressing.
The same thing could be reduced to the following Perl one-liner:
perl -MAlgorithm::Permute -e '$p=Algorithm::Permute->new(["a".."z","A".."Z",0..9],4);print #r, "\n" while #r=$p->next;'
There is also a section on permutation, along with additional examples in perlfaq4. It includes several examples and lists some additional modules that handle the details for you. One of Perl's strengths is the size and completeness of the Comprehensive Perl Archive Network (the CPAN).

How can I improve this commify routine for speed?

I need an efficient commify filter or routine for use with Template::Toolkit. It is to be used many times on the page. It should support decimals.
This one is found in The Perl Cookbook:
sub commify {
my $text = reverse $_[0];
$text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;
return scalar reverse $text;
}
Are there more efficient ways?
Before you try to optimize anything, be sure its actually a problem. Use a profiler to find the problem areas in your code and focus on those areas.
That commify solution is about as good as you can get, but there are other things you might do if you need to bypass it:
Use something like Memoize to cache results if you are commifying the same numbers repeatedly
Pre-compute all the numbers if they are unlikely to change.
Cache processed templates when you can
Use a reverse proxy setup with your webserver to hand off heavy processing to backend servers.
I think that if you are worried about speed of this - you are seriously misplacing your worries.
The commify function works on my desktop 130000 times per second on number like: "31243245356.4432".
this means that if you have on your page 10000 numbers, commification of it will take 76ms. And template toolkit processing of the page will probably take 2-3 times as long.
Another option is:
sub commify {
my $text = shift;
1 while $text =~ s/ ( \d ) ( \d{3} ) (\D|\z) /$1,$2$3/xms;
return $text;
}
When it comes to deciding which is faster, the Benchmark module is very useful.
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;
sub your_commify {
my $text = reverse 100000000;
$text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;
return scalar reverse $text;
}
sub my_commify {
my $text = 100000000;
1 while $text =~ s/ ( \d ) ( \d{3} ) (\D|\z) /$1,$2$3/xms;
return $text;
}
timethese(
-10,
{
'yours' => \&your_commify,
'mine' => \&my_commify,
}
);
Runing this gives:
~$ ./benchmark.pl
Benchmark: running mine, yours for at least 10 CPU seconds...
mine: 10 wallclock secs (10.01 usr + 0.01 sys = 10.02 CPU) # 111456.89/s (n=1116798)
yours: 11 wallclock secs (10.04 usr + 0.00 sys = 10.04 CPU) # 250092.33/s (n=2510927)
Looks like yours is ~2.25 times faster! (When using the "at least 10 CPU seconds" mode you have to check the values of "n" used.)
So it looks like you have to keep searching... but remember to use Benchmark!
sub commify {
my $n = $_[0];
my $s = abs($n) != $n;
my $x = index($n, '.');
$x = length($n) if $x == -1;
substr($n, $x, 0, ',') while ($x -= 3) > $s;
return $n;
}
I agree with brian and depesz: from a practical point of view, this function is probably not the place to start if you're trying to improve the performance of your app. That said, it is possible to write a much faster commify function. One way is to avoid regular expressions.
use strict;
use warnings;
use Benchmark qw(cmpthese);
sub commify_regex {
my $text = reverse $_[0];
$text =~ s{(\d\d\d)(?=\d)(?!\d*\.)}{$1,}g;
return scalar reverse $text;
}
sub commify_substr {
my $v = $_[0];
my $len = length $v;
my $dec = index($v, '.');
my $i = 3 + ($dec < 0 ? 0 : $len - $dec);
$len -- unless $v == abs($v);
while ($i < $len ++){
substr($v, -$i, 0, ',');
$i += 4;
}
return $v;
}
my #tests = qw(
1 12 123 1234 12345 123456 1234567
12345678 123456789 1234567890 12345678901
123456789012 1234567890123
);
push #tests, map "$_.$_", #tests;
push #tests, map - $_, #tests;
for my $t (#tests){
print "Incorrect for: ", $t, "\n"
unless commify_substr($t) eq commify_regex($t);
}
cmpthese( -2, {
regex => sub { commify_regex($_) for #tests },
substr => sub { commify_substr($_) for #tests },
});
# Output on my Windows machine.
Rate regex substr
regex 3277/s -- -54%
substr 7109/s 117% --
This can be done using single regexp:
$number =~ s/(?<=\d)(?=(?:\d\d\d)+(?!\d))/,/g;