How to find range - range

I have two sets of ranges. Each range is a pair of integers (start and end) representing some sub-range of a single larger range.
I need to determine which ranges from set A overlap with which ranges from set B.

I think your input got truncated, because I can't see any way to get the last rows of your expected output.
But, for the portion that's there, I think you want a script like this:
my %cover;
foreach my $line ( <STDIN> )
{
chomp $line;
my ( $tag, $lo, $hi ) = split /\s+/, $line;
map { $cover{$_}++ } ($lo .. $hi);
}
my $beg = 0;
my $end = 0;
my $cnt = 0;
foreach my $val ( sort { $a <=> $b } keys %cover )
{
if ($cover{$val} != $cnt || $val > $end + 1)
{
if ($cnt > 0)
{
print "chr1\t$beg\t$end\t$cnt\n";
}
$cnt = $cover{$val};
$beg = $val;
$end = $val;
} else
{
$end = $val;
}
}
if ($cnt > 0)
{
print "chr1\t$beg\t$end\t$cnt\n";
}
You didn't tell us, though, what to do with chr1 or how to related it between the input and output (are there other values that could appear there, for example?) so I just hardcoded that in the output. You'll have to change that part appropriately.
Also, my script outputs slightly different ranges than your "expected output," specifically where two ranges abut. My script, for example, outputs
chr1 556 579 1
chr1 580 592 2
but your expected output gives 580 instead of 579 for the first line. I'm not sure your expected output is correct. If you really did want 580 there (which doesn't make a lot of sense), then you could modify the script above to output $end+1 when $val == $end+1. That just seems weird though.
Here's that modified version of the code that gives the weird behavior when ranges abut:
my %cover;
foreach my $line ( <STDIN> )
{
chomp $line;
my ( $tag, $lo, $hi ) = split /\s+/, $line;
map { $cover{$_}++ } ($lo .. $hi);
}
my $beg = 0;
my $end = 0;
my $cnt = 0;
foreach my $val ( sort { $a <=> $b } keys %cover )
{
if ($cover{$val} != $cnt || $val > $end + 1)
{
## unusual value for '$end' when ranges abut.
$end = $val if ($val == $end + 1);
if ($cnt > 0)
{
print "chr1\t$beg\t$end\t$cnt\n";
}
$cnt = $cover{$val};
$beg = $val;
$end = $val;
} else
{
$end = $val;
}
}
if ($cnt > 0)
{
print "chr1\t$beg\t$end\t$cnt\n";
}

Related

Caesar Cipher -- "Isn't numeric in addition (+)"

I'm creating a Caesar cipher using Perl, but I cant seem to find the error in the code.
I keep getting the error message:
Argument "hello" isn't numeric in addition (+) at ./Lab03.pl line 66, <> line 1.
which is the line $translated += $symbol.
use warnings;
$x = 26;
sub getMode {
$e = "encrypt decrypt";
while ( 'True' ) {
print "Do you wish to encrypt or decrypt a message? \n";
$mode = <STDIN>;
chomp( $mode );
if ( $mode = split( //, $e ) ) {
return $mode;
}
else {
print "Enter either 'encrypt' or 'decrypt'.\n";
}
}
}
sub getMessage {
print "Enter your message:";
$input = <STDIN>;
chomp( $input );
return $input;
}
sub getKey {
$key = 0;
while ( 'True' ) {
print "Enter the key number (1-26): ";
$key = int( <> );
chomp( $key );
if ( $key >= 1 and $key <= $x ) {
return $key;
}
}
}
sub getTranslatedMessage {
( $mode, $message, $key ) = #_;
if ( $mode =~ /^d/ ) {
$key = -$key;
$translated = '';
}
foreach $symbol ( $message ) {
if ( $symbol =~ /[A-Za-z]/ ) {
$num = ord( $symbol );
$num += $key;
}
if ( $symbol =~ /^[A-Z]/ ) {
if ( $num > ord( 'Z' ) ) {
$num -= 26;
}
elsif ( $num < ord( 'A' ) ) {
$num += 26;
}
elsif ( $symbol = /^[a-z]/ ) {
if ( $num > ord( 'z' ) ) {
$num -= 26;
}
elsif ( $num < ord( 'a' ) ) {
$num += 26;
}
$translated += chr( $num );
}
}
else {
$translated += $symbol;
}
}
return $translated;
}
$mode = getMode();
$message = getMessage();
$key = getKey();
print "Your translated text is: '\n' ";
print( getTranslatedMessage( $mode, $message, $key ) );
In Perl, + is numeric addition only. String concatenation is . / .=.
Also:
if ($mode = split(//,$e)){
is incorrect. I believe you want something like:
my %valid_mode = ( 'encrypt' => 1, 'decrypt' => 1);
...
if ( $valid_mode{$mode} ) {
return $mode
The code you have is setting $mode into the number of characters in $e (in an inefficient way).
Here:
foreach $symbol ($message){
in Perl, strings are first class entities; they aren't automatically interpreted as arrays of characters. So to loop over the characters, you need to so something else. The simplest way is:
foreach $symbol ( split //, $message ) {
Here:
elsif ($symbol= /^[a-z]/){
= should be =~.
There is also a problem with which code is in which blocks that prevents upper case characters from being added to the output. It looks to me like the closing brace for your fir st if ($symbol =~ should be just before the later else, and other braces possibly fixed up to match.
Putting all your }'s on a line of their own, indented the same as the line with the corresponding { is a much better idea. It will help you see mismatched braces much more easily.
Here is corrected code, with use strict added and all variables declared:
use warnings;
use strict;
my $x = 26;
sub getMode{
my %valid_mode = ( 'encrypt' => 1, 'decrypt' => 1 );
while ('True'){
print"Do you wish to encrypt or decrypt a message? \n";
my $mode = <STDIN>;
chomp ( $mode);
if ($valid_mode{$mode}) {
return $mode;
}
else {
print "Enter either 'encrypt' or 'decrypt'.\n";
}
}
}
sub getMessage{
print"Enter your message:";
my $input = <STDIN>;
chomp ($input);
return $input;
}
sub getKey{
my $key = 0;
while ('True'){
print"Enter the key number (1-26): ";
$key = int(<>);
chomp ($key);
if ($key >= 1 and $key <= $x){
return $key;
}
}
}
sub getTranslatedMessage{
my ($mode, $message, $key) = #_;
if ($mode =~ /^d/){
$key = -$key;
}
my $translated = '';
foreach my $symbol (split //, $message){
if ($symbol =~ /[A-Za-z]/){
my $num = ord($symbol);
$num += $key;
if ($symbol =~ /^[A-Z]/){
if ($num > ord('Z')){
$num -= 26;
}
elsif ($num < ord('A')){
$num += 26;
}
}
elsif ($symbol=~ /^[a-z]/){
if ($num > ord('z')){
$num -= 26;
}
elsif ($num < ord('a')){
$num += 26;
}
}
$translated .= chr($num);
}
else{
$translated .= $symbol;
}
}
return $translated;
}
my $mode = getMode();
my $message = getMessage();
my $key = getKey();
print"Your translated text is:\n";
print(getTranslatedMessage($mode, $message, $key));
print "\n";
Over all, I suggest you write smaller chunks of code and test them to make sure they worked before assembling them all together.

PERL Script for discerning between cavity and void space in PDB(Protein Database) file

The problem with the following code is only in one function of the code. The problem function is with a comment head and close. This is my first post to StackOverflow so bear with me. The following script has some modules and other functions that I know work by testing them with the problem function commented out but I just cannot seem to get that one function to work. When ran, the script runs until the enviroment kills the execution.
Basically what this program does is takes a PDB file, copies everything out of the PDB file and creates a new one and pastes all of the original input file content into the new file and appends the cavities(coordinates of center of the cavity and the specified probe radius) that the program is supposed to find.
The problem function within the code is supposed to distinguish between a void space within a bound box of the structure and a cavity. Cavities are considered to be a closed space somewhere within the structure. A void space is any space or coordinate within the bounding box of max and min coorindates where there isn't an atom.The cavity must be large enough to fit into a specified probe radius. There is also a specified resolution when searching through the 3D hashtable of coordinates.
Can anyone tell me why my code isn't working. Anything immediate. I have tested and tested and cannot seem to find the error.
Thank you.
#!/usr/bin/perl
# Example 11-6 Extract atomic coordinates from PDB file
use strict;
use warnings;
use BeginPerlBioinfo; # see Chapter 6 about this module
#open file for printing
open(FH,">results.pdb");
open(PDB,"oneAtom.pdb");
while(<PDB>) { print FH $_; }
close(PDB);
# Read in PDB file
my #file = get_file_data('oneAtom.pdb');
# Parse the record types of the PDB file
my %recordtypes = parsePDBrecordtypes(#file);
# Extract the atoms of all chains in the protein
my %atoms = parseATOM ( $recordtypes{'ATOM'} );
#define some variables and get the atom indices stored in atom_numbers array
my #atom_numbers = sort {$a <=> $b} keys %atoms;
my $resolution = 4.;
my $lo = 1000;
my $hi = -1000;
my $p_rad = 1;
my %pass;
#set the grid boundaries
foreach my $l ( #atom_numbers ) {
for my $i (0..2) {
if ( $atoms{$l}[$i] < $lo ) { $lo = $atoms{$l}[$i]; }
if ( $atoms{$l}[$i] > $hi ) { $hi = $atoms{$l}[$i]; }
}
}
$lo = $lo - 2* $resolution;
$hi = $hi + 2* $resolution;
#compute min distance to the pdb structure from each grid point
for ( my $i = $lo ; $i <= $hi ; $i = $i + $resolution ) {
for ( my $j = $lo ; $j <= $hi ; $j = $j + $resolution ) {
for ( my $k = $lo ; $k <= $hi ; $k = $k + $resolution ) {
my $min_dist = 1000000;
foreach my $l ( #atom_numbers ) {
my $distance = sqrt((($atoms{$l}[0]-($i))*($atoms{$l}[0]-($i))) + (($atoms{$l}[1]-($j))*($atoms{$l}[1]-($j))) + (($atoms{$l}[2]-($k))*($atoms{$l}[2]-($k))));
$distance = $distance - ( $p_rad + $atoms{$l}[3] );
if ( $distance < $min_dist ) {
$min_dist = $distance;
}
}
$pass{$i}{$j}{$k} = $min_dist;
if ( $pass{$i}{$j}{$k} > 0 ) {
$pass{$i}{$j}{$k} = 1;
} else { $pass{$i}{$j}{$k} = 0;
}
}
}
}
#define a starting point on the outside of the grid and place first on list of points
#my #point = ();
my $num_cavities = 0;
#define some offsets used to compute neighbors
my %offset = (
1 => [-1*$resolution,0,0],
2 => [1*$resolution,0,0],
3 => [0,-1*$resolution,0],
4 => [0,1*$resolution,0],
5 => [0,0,-1*$resolution],
6 => [0,0,1*$resolution],
);
##########################################################
#function below with problem
##########################################################
my #point = ();
push #point,[$hi,$hi,$hi];
=pod
#do the following while there are points on the list
while ( #point ) {
foreach my $vector ( keys %offset ) { #for each offset vector
my #neighbor = (($point[0][0]+$offset{$vector}[0]),($point[0][1]+$offset{$vector}[1]),($point[0][2]+$offset{$vector}[2])); #compute neighbor point
if ( exists $pass{$neighbor[0]}{$neighbor[1]}{$neighbor[2]} ) { #see if neighbor is in the grid
if ( $pass{$neighbor[0]}{$neighbor[1]}{$neighbor[2]} == 1 ) { #if it is see if its further than the probe radius
push #point,[($point[0][0]+$offset{$vector}[0]),($point[0][1]+$offset{$vector}[1]),($point[0][2]+$offset{$vector}[2])]; #if it is push it onto the list of points
}
}
}
$pass{$point[0][0]}{$point[0][1]}{$point[0][2]} = 0; #eliminate the point just tested from the pass array
shift #point; #move to the next point in the list
}
=cut
##############################################################
# end of problem function
##############################################################
my $grid_ind = $atom_numbers[$#atom_numbers];
for ( my $i = $lo ; $i <= $hi ; $i = $i + $resolution ) {
for ( my $j = $lo ; $j <= $hi ; $j = $j + $resolution ) {
for ( my $k = $lo ; $k <= $hi ; $k = $k + $resolution ) {
if ( $pass{$i}{$j}{$k} == 1 ) {
$grid_ind = $grid_ind + 1;
my $n = sprintf("%5d",$grid_ind);
my $x = sprintf("%7.3f",$i);
my $y = sprintf("%7.3f",$j);
my $z = sprintf("%7.3f",$k);
my $w = sprintf("%6.3f",1);
my $p = sprintf("%6.3f",$p_rad);
print FH "ATOM $n MC CAV $n $x $y $z $w $p \n";
}
}
}
}
close(FH);
exit;
#do the following while there are points on the list
for ( my $i = $lo ; $i <= $hi ; $i = $i + $resolution ) {
for ( my $j = $lo ; $j <= $hi ; $j = $j + $resolution ) {
for ( my $k = $lo ; $k <= $hi ; $k = $k + $resolution ) {
if ( $pass{$i}{$j}{$k} == 1 ) {
push #point,[$i,$j,$k];
$num_cavities++;
while ( #point ) {
foreach my $vector ( keys %offset ) { #for each offset vector
my #neighbor = (($point[0][0]+$offset{$vector}[0]),($point[0][1]+$offset{$vector}[1]),($point[0][2]+$offset{$vector}[2])); #compute neighbor point
if ( exists $pass{$neighbor[0]}{$neighbor[1]}{$neighbor[2]} ) { #see if neighbor is in the grid
if ( $pass{$neighbor[0]}{$neighbor[1]}{$neighbor[2]} == 1 ) { #if it is see if its further than the probe radius
push #point,[($point[0][0]+$offset{$vector}[0]),($point[0][1]+$offset{$vector}[1]),($point[0][2]+$offset{$vector}[2])]; #if it is push it onto the list of points
}
}
}
$pass{$point[0][0]}{$point[0][1]}{$point[0][2]} = 0; #eliminate the point just tested from the pass array
shift #point; #move to the next point in the list
}
}
}
}
}
#print the results
print "\nthe structure has " . $num_cavities . " cavities.\n\n";
#print the point that are left over (these correspond to the cavities)
#for ( my $i = -10 ; $i <= 10 ; $i = $i + $resolution ) {
# for ( my $j = -10 ; $j <= 10 ; $j = $j + $resolution ) {
# for ( my $k = -10 ; $k <= 10 ; $k = $k + $resolution ) {
# print $i . "\t" . $j . "\t" . $k . "\t" . $pass{$i}{$j}{$k} . "\n";
# }
# }
#}
###################################################
# function
###################################################
sub parseATOM {
my($atomrecord) = #_;
use strict;
use warnings;
my %results = ( );
# Turn the scalar into an array of ATOM lines
my(#atomrecord) = split(/\n/, $atomrecord);
foreach my $record (#atomrecord) {
my $number = substr($record, 6, 5); # columns 7-11
my $x = substr($record, 30, 8); # columns 31-38
my $y = substr($record, 38, 8); # columns 39-46
my $z = substr($record, 46, 8); # columns 47-54
my $r = substr($record, 60, 6); # columns 47-54
#my $element = substr($record, 76, 2); # columns 77-78
# $number and $element may have leading spaces: strip them
$number =~ s/\s*//g;
#$element =~ s/\s*//g;
$x =~ s/\s*//g;
$y =~ s/\s*//g;
$z =~ s/\s*//g;
$r =~ s/\s*//g;
# Store information in hash
#$results{$number} = [$x,$y,$z,$element];
$results{$number} = [$x,$y,$z,$r];
}
# Return the hash
return %results;
}
Here's one thing that is almost certainly slowing things down:
$x =~ s/\s*//g;
$y =~ s/\s*//g;
$z =~ s/\s*//g;
$r =~ s/\s*//g;
It is possible for \s* to match an empty string, so you are replacing empty strings with empty strings, for each empty string in the target string.
Change to:
$x =~ s/\s+//g;
$y =~ s/\s+//g;
$z =~ s/\s+//g;
$r =~ s/\s+//g;
You have the following definitions:
my $lo = 1000;
my $hi = -1000;
So when you get to your first for loop, you will set $i to 1000, and then fail the check to see if it is less than -1000.

How do I speed up pattern recognition in perl

This is the program as it stands right now, it takes in a .fasta file (a file containing genetic code), creates a hash table with the data and prints it, however, it is quite slow. It splits a string an compares it against all other letters in the file.
use strict;
use warnings;
use Data::Dumper;
my $total = $#ARGV + 1;
my $row;
my $compare;
my %hash;
my $unique = 0;
open( my $f1, '<:encoding(UTF-8)', $ARGV[0] ) or die "Could not open file '$ARGV[0]' $!\n";
my $discard = <$f1>;
while ( $row = <$f1> ) {
chomp $row;
$compare .= $row;
}
my $size = length($compare);
close $f1;
for ( my $i = 0; $i < $size - 6; $i++ ) {
my $vs = ( substr( $compare, $i, 5 ) );
for ( my $j = 0; $j < $size - 6; $j++ ) {
foreach my $value ( substr( $compare, $j, 5 ) ) {
if ( $value eq $vs ) {
if ( exists $hash{$value} ) {
$hash{$value} += 1;
} else {
$hash{$value} = 1;
}
}
}
}
}
foreach my $val ( values %hash ) {
if ( $val == 1 ) {
$unique++;
}
}
my $OUTFILE;
open $OUTFILE, ">output.txt" or die "Error opening output.txt: $!\n";
print {$OUTFILE} "Number of unique keys: " . $unique . "\n";
print {$OUTFILE} Dumper( \%hash );
close $OUTFILE;
Thanks in advance for any help!
It is not clear from the description what is wanted from this script, but if you're looking for matching sets of 5 characters, you don't actually need to do any string matching: you can just run through the whole sequence and keep a tally of how many times each 5-letter sequence occurs.
use strict;
use warnings;
use Data::Dumper;
my $str; # store the sequence here
my %hash;
# slurp in the whole file
open(IN, '<:encoding(UTF-8)', $ARGV[0]) or die "Could not open file '$ARGV[0]' $!\n";
while (<IN>) {
chomp;
$str .= $_;
}
close(IN);
# not sure if you were deliberately omitting the last two letters of sequence
# this looks at all the sequence
my $l_size = length($str) - 4;
for (my $i = 0; $i < $l_size; $i++) {
$hash{ substr($str, $i, 5) }++;
}
# grep in a scalar context will count the values.
my $unique = grep { $_ == 1 } values %hash;
open OUT, ">output.txt" or die "Error opening output.txt: $!\n";
print OUT "Number of unique keys: ". $unique."\n";
print OUT Dumper(\%hash);
close OUT;
It might help to remove searching for information that you already have.
I don't see that $j depends upon $i so you're actually matching values to themselves.
So you're getting bad counts as well. It works for 1, because 1 is the square of 1.
But if for each five-character string you're counting strings that match, you're going
to get the square of the actual number.
You would actually get better results if you did it this way:
# compute it once.
my $lim = length( $compare ) - 6;
for ( my $i = 0; $i < $lim; $i++ ){
my $vs = substr( $compare, $i, 5 );
# count each unique identity *once*
# if it's in the table, we've already counted it.
next if $hash{ $vs };
$hash{ $vs }++; # we've found it, record it.
for ( my $j = $i + 1; $j < $lim; $j++ ) {
my $value = substr( $compare, $j, 5 );
$hash{ $value }++ if $value eq $vs;
}
}
However, it could be an improvement on this to do an index for your second loop
and let the c-level of perl do your matching for you.
my $pos = $i;
while ( $pos > -1 ) {
$pos = index( $compare, $vs, ++$pos );
$hash{ $vs }++ if $pos > -1;
}
Also, if you used index, and wanted to omit the last two characters--as you do, it might make sense to remove those from the characters you have to search:
substr( $compare, -2 ) = ''
But you could do all of this in one pass, as you loop through file. I believe the code
below is almost an equivalent.
my $last_4 = '';
my $last_row = '';
my $discard = <$f1>;
# each row in the file after the first...
while ( $row = <$f1> ) {
chomp $row;
$last_row = $row;
$row = $last_4 . $row;
my $lim = length( $row ) - 5;
for ( my $i = 0; $i < $lim; $i++ ) {
$hash{ substr( $row, $i, 5 ) }++;
}
# four is the maximum we can copy over to the new row and not
# double count a strand of characters at the end.
$last_4 = substr( $row, -4 );
}
# I'm not sure what you're getting by omitting the last two characters of
# the last row, but this would replicate it
foreach my $bad_key ( map { substr( $last_row, $_ ) } ( -5, -6 )) {
--$hash{ $bad_key };
delete $hash{ $bad_key } if $hash{ $bad_key } < 1;
}
# grep in a scalar context will count the values.
$unique = grep { $_ == 1 } values %hash;
You may be interested in this more concise version of your code that uses a global regex match to find all the subsequences of five characters. It also reads the entire input file in one go, and removes the newlines afterwards.
The path to the input file is expected as a parameter on the command line, and the output is sent to STDIN, and can be redirected to a file on the command line, like this
perl subseq5.pl input.txt > output.txt
I've also used Data::Dump instead of Data::Dumper because I believe it to be vastly superior. However it is not a core module, and so you will probably need to install it.
use strict;
use warnings;
use open qw/ :std :encoding(utf-8) /;
use Data::Dump;
my $str = do { local $/; <>; };
$str =~ tr|$/||d;
my %dups;
++$dups{$1} while $str =~ /(?=(.{5}))/g;
my $unique = grep $_ == 1, values %dups;
print "Number of unique keys: $unique\n";
dd \%dups;

How to batch delete foreign language in subtitle

I have some sample like this:
2
00:01:32,288 --> 00:01:33,208
¬O¥L­Ì¶Ü¡H
How are you?
3
00:01:36,768 --> 00:01:39,648
€Ñ°Ú¡A¥L­Ì¥ŽºâŽN³o»ò°µ¶Ü¡H
âŽN³o»ò°µ¶Ü¡H
I am fine
And you ?
--------------------Here is my solution but it's incomplete
#!/usr/bin/perl -w
$lineIndex = 0;
while($line=<>){
$lineIndex++; #line index start from 1
$content{$lineIndex}=$line; #copy to content
for($i = 0; $i < length ($line); $i++){
$char = substr $line,$i,1;
if($char =~ /\W/){
#print $char;
$count{$lineIndex}++; #how many special char this line
}
}
}
# if line contains more than 14 special char,then skip
print "\n";
for $i (keys %count){
if($count{$i} > 14){ #<----------------see here
delete $content{$i};#delete from content
}
}
for $j (sort keys %content){ #output
print $content{$j};
}
my solution has this problem:
���O�J�b�յۺ��X is miss match, because its length <= 14
if change threshold to small number eg.6 string like 00:01:33,208 will be matched, thus delete from content
Is there a good way to check char in utf-8 ?
Here's a much simpler solution:
while($line = <>) {
print $line unless $line =~ /[^\x00-\x7e]/;
}
The character set [\x00-\x7e] covers all basic ASCII characters (including control characters).
#!/usr/bin/perl -w
$lineIndex = 0;
$flag = 1;
while($line=<>){
$line = join('',$line);
$lineIndex++;
if($line =~ /\d\d:\d\d:\d\d,\d\d\d/){
print $line;
next;
}
for($i = 0; $i < length ($line); $i++){
$char = substr $line,$i,1;
if($char !~ /[\w ,.;!?'\-\r\n]/){
$flag = 0;
last;
}else{
$flag = 1;
}
}
if($flag == 1){
print $line;
}
}

Binary representation of float to decimal conversions in Perl

I read Stack Overflow question How do I convert a binary string to a number in Perl? on how to convert binary integers to decimal or vice versa in Perl. But how do I do this for float as well?
For example, conversion from 5.375 to 101.011 and vice versa.
sub number_to_binary_string {
my $in = shift;
my $sign = $in < 0 and $in = abs $in;
my $out = sprintf "%b.", int $in;
substr $out, 0, 0, '-' if $sign;
$in -= int $in;
do {
if ($in >= .5) {
$out .= '1';
$in -= .5;
}
else {
$out .= '0';
}
$in *= 2;
} while $in > 0;
return $out;
}
sub binary_string_to_number {
my $in = shift;
my ($int,$frac) = split /\./, $in;
my $sign = $int =~ s/^-//;
my $out = oct "0b$int";
my $mult = 1;
for my $digit (split //, $frac) {
$mult *= .5;
$out += $mult * $digit;
}
$out = -$out if $sign;
return $out;
}
Below is a machine- and build-specific implementation (NV = little-endian double).
It returns the number stored exactly, and it supports NaN, Infinity, -Infinity and -0 and subnormals. It trims leading zeros and trailing decimal zeroes.
sub double_to_bin {
my ($n) = #_;
my ($s, $e, $m) = unpack 'a a11 a52', unpack 'B64', "".reverse pack 'F', $n;
$s = $s ? '-' : '';
$e = oct("0b$e");
if ($e == 0x7ff) {
return ($m =~ /1/) ? 'NaN' : $s . 'Infinity'
} elsif ($e == 0x000) {
$m = "0$m"; $e -= 52;
} else {
$m = "1$m"; $e -= 1075;
}
if ($e >= 0) {
$m .= ('0' x $e);
} elsif ($e >= -52) {
substr($m, $e+53, 0, '.');
} else {
$m = '0.' . ('0' x (-$e-53)) . $m;
}
$m =~ s/^0+(?!\.)//;
$m =~ s/(?:\..*1\K|\.)0+\z//;
return $s . $m;
}
Here's a sketch of an interesting "portable" implementation. It doesn't handle any of the interesting edge-cases like integers, NaNs, infinities, or even negative numbers because I'm lazy, but extending it wouldn't be so hard.
(my $bin = sprintf "%b.%032b", int($num), 2**32 * ($num - int($num)))
=~ s/\.?0+$//;
The 2**32 seems like an architecture-specific magic number but in fact it's basically just how many bits of precision you want after the dot. Too small and you get harmless truncation; too large and there's potential for overflow (since %b probably casts to UV sometime before doing its formatting).
$TO_BIN = '-b';
$TO_DEC = '-d';
($op, $n ) = #ARGV;
die("USAGE: $0 -b <dec_to_convert> | -d <bin_to_convert>\n") unless ( $op =~ /^($TO_BIN|$TO_DEC)$/ && $n );
for (split(//,$n)) {
if ($_ eq ".") {
$f=".";
} else {
if (defined $f) { $f.=$_ } else { $i.=$_ }
}
}
$ci = sprintf("%b", $i) if $op eq $TO_BIN;
$ci = sprintf("%d", eval "0b$i") if $op eq $TO_DEC;
#f=split(//,$f) if $f;
if ($op eq $TO_BIN) {
while( $f && length($cf) < 16 ) {
($f *= 2) =~ s/(\d)(\.?.*)/$2/;
$cf .= $1 ? '1' : '0';
}
} else {
for ($i=1;$i<#f;$i++) {
$cf = ($cf + $f[#f-$i])/2;
}
}
$cf=~s/^.*\.|^/./ if $cf;
print("$ci$cf\n");