I have a simple script written in Perl, and I keep getting the error below when I try to run it. The script generates numbers for checking integer-to-floating-point conversion. This is the error I get:
Can't use an undefined value as an ARRAY reference at /tools/oss/packages/i86pc-5.10/perl/5.8.8-32/lib/5.8.8/Math/BigInt/Calc.pm line 1180
From the error message I am not able to figure out where my code is going wrong. By the way, I need to use 64-bit numbers. How do I debug this issue?
Here is the code sample
use bignum;
use warnings;
use strict;
open(VCVT, ">CvtIntToFp") or die "couldn't open file to write:$!";
my $number;
my $sgn;
# left with 31 bits excluding the sign
# 23 bits of significand needed, all the result
# will be exact except where leading bit ignoring sign is >23
# take its 2's complement to get the negative number and put it
# into the register
# 32 bit number: 1 bit sign, 31 left; any number with leading 1 position >23 (counting from 0) will be inexact when in floating point
# bit positions 30-24 can have a leading one at any position and the result is inexact
my $twoPwr32 = 0x100000000; #2**32
my @num=();
for(my $i=0; $i<100; $i++)
{
$sgn = (rand()%2);
my $tempLead = (rand()%7); # there are 7 bits from 24 to 30
$number=$tempLead << 24;
if($sgn)
{$number = ($twoPwr32- $number +1) & 0xffffffff;
}
$number = sprintf("%x", $number);
push(@num, $number);
}
my $item=0;
foreach $item (@num)
{
print "$item\n";
print VCVT "$item\n";
}
Try adding use diagnostics to get a better error message, and read perldoc bignum. The explanation is given there: bignum converts numbers into big-number objects internally, so what you get back is a reference. I only have the Perl 5.14 documentation locally, so the link below is to the Perl 5.20 documentation; I think the bug still exists there. Refer to http://perldoc.perl.org/bignum.html
Update :
Hexadecimal number > 0xffffffff non-portable at throw_stack.pl line 19 (#1)
(W portable) The hexadecimal number you specified is larger than 2**32-1
(4294967295) and therefore non-portable between systems. See
perlport for more on portability concerns.
Also refer to this question for the usage of 64 bit arithmetic in Perl.
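As a rough sketch of how the generator could avoid bignum altogether, the patterns in question fit in 32 bits, so native integers are enough. This is only one reading of the intent described in the code comments (a leading one at bit position 24 to 30, optionally negated via two's complement); the helper name make_pattern is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build a 32-bit pattern whose leading one sits at
# $lead_bit, and take its two's complement when $negative is true.
sub make_pattern {
    my ($lead_bit, $negative) = @_;
    my $number = 1 << $lead_bit;
    # Negate, then keep only the low 32 bits of the result.
    $number = (-$number) & 0xffffffff if $negative;
    return sprintf '%08x', $number;
}

for (1 .. 5) {
    my $lead_bit = 24 + int rand 7;   # integer in 24..30
    my $negative = int rand 2;        # 0 or 1
    print make_pattern($lead_bit, $negative), "\n";
}
```

Note the use of int rand N to get a random integer; rand() on its own returns a fraction in the range [0, 1).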
Related
How can we bit select the variables in perl code?
I am new to Perl, and I have a scenario where I need to extract a particular field from a file and pass it as input to another module for analysis.
So far I have extracted the required pattern, which is a 16-bit hexadecimal value.
From this 16-bit value I want only the 10 least significant bits.
Please refer to the example below (this is sample code that handles only one line of my input).
use strict;
my $string = "HDR 0c0d PlD 1000 GAP 412";
$string =~ s/.*HDR\s(\S+).*/$1/g;
print "$string\n";
my $hex = hex($string);
print "$hex";
The output in $hex is 3085 (0x0c0d), which as a 16-bit value is 16'b0000110000001101. Now I need to extract only the 10 least significant bits (0000001101). Please let me know an easy way to do this.
Use a bit mask to select the bits you need. To select the rightmost 10 bits you can use:
my $x = 0xfff0;
print $x & 0x3ff;
The output is
1008
which is the decimal value of the low ten bits of 0xfff0.
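Applied to the string from the question, the same masking idea might look like the sketch below. The helper name low_bits is an assumption made for illustration; the mask is built as (1 << bits) - 1, which gives 0x3ff for 10 bits.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the low $bits bits of $value.
sub low_bits {
    my ($value, $bits) = @_;
    return $value & ((1 << $bits) - 1);
}

my $string = "HDR 0c0d PlD 1000 GAP 412";
# Capture the hex field that follows "HDR".
my ($field) = $string =~ /HDR\s+(\S+)/;
my $low10 = low_bits(hex($field), 10);
printf "%d %010b\n", $low10, $low10;   # prints "13 0000001101"
```

The %b conversion of printf is a convenient way to check the result bit by bit.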
I need to convert a long decimal number, 14245138929982432, into its hexadecimal value, which is 329BE0DDB29BE0.
But when I use the piece of code below, I get FFFFFFFF as the result.
$l_KeyExpected_UL=14245138929982432;
$hexadeci = sprintf '%X', $l_KeyExpected_UL;
print $hexadeci."\n";
Can anyone please help with this?
You can use the Math::BigInt module, which does all the heavy lifting behind the bigint pragma.
You can't use sprintf under bigint to convert large decimals to hex because its conversions aren't overloaded, but Math::BigInt objects have an as_hex method, which returns the number expressed as a hex string prefixed with 0x.
This program wraps the conversion in a subroutine, bigint_to_hex, which removes the 0x prefix and changes lower case to upper case.
It is unusual now to encounter an installation of Perl that won't handle 64-bit values anyway. But this method will work with any decimal string, as I have demonstrated by converting a value of about 1.2E40 as well as the value in your question.
It's vital that you pass big integers as strings, because numeric literals will be converted to floating point by the compiler if they exceed the width of an ordinary integer. My program also prints the hex equivalent of the same 41-digit value without quotation marks, and of the literal 1.2E40, so that you can see the difference.
use strict;
use warnings 'all';
use feature 'say';
use Math::BigInt;
my $l_KeyExpected_UL = '14245138929982432';
say bigint_to_hex($l_KeyExpected_UL);
say bigint_to_hex('12345678901234567890123456789012345678901');
say bigint_to_hex(12345678901234567890123456789012345678901);
say bigint_to_hex(1.2E40);
sub bigint_to_hex {
my $hex = Math::BigInt->new(shift)->as_hex;
$hex =~ s/^0x//;
uc $hex;
}
output
329BE0DDB29BE0
2447DB449988978536BF5BBBE40E766C35
2447DB449988B214BEA48F651CA8000000
2343CBEEEA6F2C193478C00E0000000000
If you have a build of Perl with 64-bit integers
A Perl build with 64-bit ints will display the following:
$ perl -V:ivsize
ivsize='8';
If this is what you have, you can simply use
my $i = 14245138929982432;
my $hex = sprintf('%X', $i);
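If the script has to run on builds of either kind, the check can also be made at run time through the core Config module. The following is only a sketch: the helper name dec_to_hex and the 18-digit cutoff (any 18-digit decimal fits in a signed 64-bit integer) are assumptions made for illustration, and the fallback uses Math::BigInt's as_hex.

```perl
use strict;
use warnings;
use Config;
use Math::BigInt;

# Hypothetical helper: format a decimal string as upper-case hex,
# using native integers when the build has 64-bit IVs and the value
# is short enough to be certain to fit.
sub dec_to_hex {
    my ($decimal) = @_;
    if ($Config{ivsize} >= 8 && length($decimal) <= 18) {
        return sprintf '%X', $decimal;
    }
    # Fallback: arbitrary precision, slower but always exact.
    my $hex = Math::BigInt->new($decimal)->as_hex;
    $hex =~ s/^0x//;
    return uc $hex;
}

print dec_to_hex('14245138929982432'), "\n";   # 329BE0DDB29BE0
```

Passing the number as a string keeps the value exact on both paths.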
If you have a build of Perl that may not have 64-bit integers
That number requires at least 54 bits to store.
Math::Int64 allows you to access native 64-bit ints.
use Math::Int64 qw( uint64 int64_to_hex );
# Note that the number is provided as a string!
my $i = uint64('14245138929982432');
my $hex = int64_to_hex($i);
You could also use an arbitrary-precision library such as Math::BigInt, but it will necessarily be slower. If you're going to do some number crunching with 64-bit ints, you definitely want Math::Int64 instead of Math::BigInt.
Perl's negative lookahead is not working on large strings (length > 40000, in ActivePerl and Cygwin Perl, version 5.14). I tried the same code with MinGW Perl 5.8.8, and there it stops working for strings with length > 5000.
The code I am using is:
my $str = q(A B);
my $pattern = '(A)(?:(?!(X)).)*(B)';
if ( $str =~ m/$pattern/ ) {
print "matched\n";
}
This works fine in all three versions of Perl. But when I increase the length of the string by adding spaces, the pattern stops matching.
For example: my $str = q(A ...some 50000 spaces... B);
Kindly help.
Perl imposes an internal limit (it happens to be a signed 16-bit integer on most systems) on the size of various regex operations to limit stack growth. This answer has a very good breakdown of the limit.
From empirical testing, the match fails once the space count reaches 32767, so it is certainly this limit.
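One way to sidestep the limit, sketched here under the assumption that the excluded text X is a single character, is to replace the quantified lookahead group with a negated character class. A starred character class is a simple repetition rather than a repeated group, so it is not subject to the same cap. The helper name is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: same intent as (A)(?:(?!(X)).)*(B), but with a
# negated character class instead of a repeated lookahead group.
# "\n" is excluded too, because "." does not match a newline by default.
sub matches_without_lookahead {
    my ($str) = @_;
    return $str =~ /A[^X\n]*B/ ? 1 : 0;
}

my $long = 'A' . (' ' x 50_000) . 'B';
print matches_without_lookahead($long) ? "matched\n" : "no match\n";
```

If X stands for a longer string, this rewrite does not apply as written and a different approach (for example, checking with index first) would be needed.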
I'm looking for suggestions on how to approach my Perl programming homework assignment, which is to write an RNA synthesis program. I've summarized and outlined the program below. Specifically, I'm looking for feedback on the blocks below (I'll number them for easy reference). I've read up to chapter 6 in Elements of Programming with Perl by Andrew Johnson (great book). I've also read the perlfunc and perlop pod pages, with nothing jumping out on where to start.
Program Description: The program should read an input file from the command line, transcribe it into RNA, and then translate the RNA into a sequence of uppercase one-letter amino acid names.
1. Accept a file named on the command line.
here I will use the <> operator
2. Check to make sure the file only contains acgt or die.
if ( <> ne [acgt] ) { die "usage: file must only contain nucleotides \n"; }
3. Transcribe the DNA to RNA (every A replaced by U, T replaced by A, C replaced by G, G replaced by C).
not sure how to do this
4. Take this transcription and break it into 3-character 'codons' starting at the first occurrence of "AUG".
not sure, but I'm thinking this is where I will start a %hash variable?
5. Take the 3-character "codons" and give them a single-letter symbol (an uppercase one-letter amino acid name).
Assign a key a value using (there are 70 possibilities here so I'm not sure where to store or how to access)
6. If a gap is encountered, a new line is started and the process is repeated.
not sure, but we can assume that gaps are multiples of three.
7. Am I approaching this the right way? Is there a Perl function that I'm overlooking that can simplify the main program?
Note
Must be self contained program (stored values for codon names & symbols).
Whenever the program reads a codon that has no symbol, this is a gap in the RNA; it should start a new line of output and begin at the next occurrence of "AUG". For simplicity we can assume that gaps are always multiples of three.
Before I spend any additional hours on research I am hoping to get confirmation that I'm taking the right approach. Thanks for taking time to read and for sharing your expertise!
1. here I will use the <> operator
OK, your plan is to read the file line by line. Don't forget to chomp each line as you go, or you'll end up with newline characters in your sequence.
2. Check to make sure the file only contains acgt or die
if ( <> ne [acgt] ) { die "usage: file must only contain nucleotides \n"; }
In a while loop, the <> operator puts the line read into the special variable $_, unless you assign it explicitly (my $line = <>).
In the code above, you're reading one line from the file and discarding it. You'll need to save that line.
Also, the ne operator compares two strings, not one string and one regular expression. You'll need the !~ operator here (or the =~ one, with a negated character class [^acgt]). If you need the test to be case-insensitive, look into the i flag for regular expression matching.
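A minimal sketch of that check follows; the helper name is_valid_dna is an assumption made for illustration. The line is saved in a variable, chomped, and then tested with !~ against a negated character class.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line holds only a, c, g or t
# (upper or lower case) once the trailing newline is removed.
sub is_valid_dna {
    my ($line) = @_;
    chomp $line;
    return $line !~ /[^acgt]/i ? 1 : 0;
}

for my $line ("acgtACGT\n", "acgxt\n") {
    print is_valid_dna($line) ? "ok\n" : "invalid\n";
}
```

In the real program the same test would sit inside the while (my $line = <>) loop, with die in place of the print.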
3. Transcribe the DNA to RNA (Every A replaced by U, T replaced by A, C replaced by G, G replaced by C).
As GWW said, check your biology. T->U is the only step in transcription. You'll find the tr (transliterate) operator helpful here.
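As a small illustration of tr for this step (the helper name dna_to_rna is made up here), the sequence is upper-cased first so that mixed-case input is handled the same way:

```perl
use strict;
use warnings;

# Hypothetical helper: transcribe a DNA string to RNA by replacing
# every T with U, working on an upper-cased copy of the input.
sub dna_to_rna {
    my ($dna) = @_;
    (my $rna = uc $dna) =~ tr/T/U/;
    return $rna;
}

print dna_to_rna('atgcatta'), "\n";   # AUGCAUUA
```

tr works character by character and does not use regular expressions, which makes it a good fit for single-letter substitutions like this one.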
4. Take this transcription & break it into 3 character 'codons' starting at the first occurrence of "AUG"
not sure but I'm thinking this is where I will start a %hash variables?
I would use a buffer here. Define a scalar outside the while(<>) loop. Use index to match "AUG". If you don't find it, put the last two bases in that scalar (you can use substr $line, -2, 2 for that). On the next iteration of the loop, append (with .=) the line to those two bases, and then test for "AUG" again. If you get a hit, you'll know where, so you can mark the spot and start translation.
5. Take the 3 character "codons" and give them a single letter Symbol (an uppercase one-letter amino acid name)
Assign a key a value using (there are 70 possibilities here so I'm not sure where to store or how to access)
Again, as GWW said, build a hash table:
%codons = ( AUG => 'M', ...).
Then you can use (e.g.) split to build an array from the current line you're examining, build codons three elements at a time, and grab the correct amino acid code from the hash table.
6. If a gap is encountered a new line is started and process is repeated
not sure but we can assume that gaps are multiples of threes.
See above. You can test for the existence of a gap with exists $codons{$current_codon}.
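Points 4 to 6 can be sketched together as below. The codon table here is a deliberately tiny subset chosen for illustration (a real program would list every codon), and the helper name translate_rna and the sample sequence are assumptions. The sketch uses index to find each "AUG", substr to take three bases at a time, and exists to detect a gap.

```perl
use strict;
use warnings;

# Tiny illustrative subset of the codon table, not the full table.
my %codons = (AUG => 'M', UUU => 'F', GGC => 'G', AAA => 'K');

# Hypothetical helper: translate an RNA string into one output line per
# run of known codons; a codon missing from %codons ends the line and
# translation resumes at the next "AUG".
sub translate_rna {
    my ($rna) = @_;
    my @lines;
    my $pos = index $rna, 'AUG';
    while ($pos >= 0) {
        my $protein = '';
        while ($pos + 3 <= length $rna) {
            my $codon = substr $rna, $pos, 3;
            last unless exists $codons{$codon};   # gap found
            $protein .= $codons{$codon};
            $pos += 3;
        }
        push @lines, $protein;
        $pos = index $rna, 'AUG', $pos;           # next start codon
    }
    return @lines;
}

print "$_\n" for translate_rna('CCAUGUUUGGCNNNAUGAAA');   # MFG, then MK
```

This version works on a whole sequence held in memory; reading line by line would need the buffering described under point 4.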
7. Am I approaching this the right way? Is there a Perl function that I'm overlooking that can simplify the main program?
You know, looking at the above, it seems way too complex. I built a few building blocks, the subroutines read_codon and translate; I think they help the logic of the program immensely.
I know this is a homework assignment, but I figure it might help you get a feel for other possible approaches:
use warnings; use strict;
use feature 'state';
# read_codon works by using the new 'state' feature in Perl 5.10.
# Both @buffer and $handle represent 'state' in this function:
# together they allow reading codons to be abstracted from processing
# the file line by line.
# Once read_codon is called for the first time, both are initialized.
# Since $handle is a state variable, the current file handle position
# is never reset. Similarly, @buffer always holds whatever was left
# from the previous call.
# The base case is that @buffer contains less than 3bp, in which case
# we need to read a new line, remove the "\n" character,
# split it and push the resulting list to the end of the @buffer.
# If we encounter EOF on the $handle, then we have exhausted the file,
# and the @buffer as well, so we 'return' undef.
# Otherwise we pick the first 3bp of the @buffer, join them into a string,
# transcribe it and return it.
sub read_codon {
my ($file) = @_;
state @buffer;
# Open the file only once: a state initializer runs on the first call only.
state $handle = do { open my $fh, '<', $file or die $!; $fh };
if (@buffer < 3) {
my $new_line = scalar <$handle> or return;
chomp $new_line;
push @buffer, split //, $new_line;
}
return transcribe(
join '',
shift @buffer,
shift @buffer,
shift @buffer
);
}
sub transcribe {
my ($codon) = @_;
$codon =~ tr/T/U/;
return $codon;
}
# translate works by using the new 'state' feature in Perl 5.10
# the $TRANSLATE state is initialized to 0
# as codons are passed to it,
# the sub updates the state according to start and stop codons.
# Since $TRANSLATE is a state variable, it is only initialized once,
# (the first time the sub is called)
# If the current state is 'translating',
# then the sub returns the appropriate amino-acid from the %codes table, if any.
# Thus this provides a logical way to the caller of this sub to determine whether
# it should print an amino-acid or not: if not, the sub will return undef.
# %codes could also be a state variable, but since it is not actually a 'state',
# it is initialized once, in a code block visible from the sub,
# but separate from the rest of the program, since it is 'private' to the sub
{
our %codes = (
AUG => 'M',
...
);
sub translate {
my ($codon) = @_ or return;
state $TRANSLATE = 0;
$TRANSLATE = 1 if $codon =~ m/AUG/i;
$TRANSLATE = 0 if $codon =~ m/U(AA|GA|AG)/i;
return $codes{$codon} if $TRANSLATE;
}
}
I can give you a few hints on a few of your points.
I think your first goal should be to parse the file character by character, ensuring each one is valid, group them into sets of three nucleotides and then work on your other goals.
I think your biology is a bit off as well, when you transcribe DNA to RNA you need to think about what strands are involved. You may not need to "complement" your bases during your transcription step.
2. You should check this as you parse the file character by character.
3. You could do this with a loop and some if statements, or with a hash.
4. This could probably be done with a counter as you read the file character by character, since you need to insert a space after every third character.
5. This would be a good place to use a hash that's based on the amino acid codon table.
6. You'll have to look for the gap character as you parse the file. This seems to contradict your #2 requirement since the program says your text can only contain ATGC.
There are a lot of perl functions that could make this easier. There are also perl modules such as bioperl. But I think using some of these could defeat the purpose of your assignment.
Look at BioPerl and browse the source-modules for indicators on how to go about it.
In Perl, is it appropriate to use a string as a byte array containing 8-bit data? All the documentation I can find on this subject focuses on 7-bit strings.
For instance, if I read some data from a binary file into $data
my $data;
open FILE, "<", $filepath;
binmode FILE;
read FILE, $data, 1024;
and I want to get the first byte out, is substr($data,0,1) appropriate? (again, assuming it is 8-bit data)
I come from a mostly C background, and I am used to passing a char pointer to a read() function. My problem might be that I don't understand what the underlying representation of a string is in Perl.
The bundled documentation for the read command, reproduced here, provides a lot of information that is relevant to your question.
read FILEHANDLE,SCALAR,LENGTH,OFFSET
read FILEHANDLE,SCALAR,LENGTH
Attempts to read LENGTH characters of data into variable SCALAR
from the specified FILEHANDLE. Returns the number of
characters actually read, 0 at end of file, or undef if there
was an error (in the latter case $! is also set). SCALAR will
be grown or shrunk so that the last character actually read is
the last character of the scalar after the read.
An OFFSET may be specified to place the read data at some place
in the string other than the beginning. A negative OFFSET
specifies placement at that many characters counting backwards
from the end of the string. A positive OFFSET greater than the
length of SCALAR results in the string being padded to the
required size with "\0" bytes before the result of the read is
appended.
The call is actually implemented in terms of either Perl's or
system's fread() call. To get a true read(2) system call, see
"sysread".
Note the characters: depending on the status of the filehandle,
either (8-bit) bytes or characters are read. By default all
filehandles operate on bytes, but for example if the filehandle
has been opened with the ":utf8" I/O layer (see "open", and the
"open" pragma, open), the I/O will operate on UTF-8 encoded
Unicode characters, not bytes. Similarly for the ":encoding"
pragma: in that case pretty much any characters can be read.
See perldoc -f pack and perldoc -f unpack for how to treat strings as byte arrays.
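As a short illustration of unpack on a byte string, the sketch below uses a hard-coded sample value rather than data read from a file; the variable names are made up. The C* template yields one unsigned 8-bit value per byte.

```perl
use strict;
use warnings;

# Sample binary data (an assumption for illustration): 0x89 'P' 'N' 'G' CR LF.
my $data = "\x89PNG\x0d\x0a";

# One unsigned 8-bit integer per byte of the string.
my @bytes = unpack 'C*', $data;
print join(' ', map { sprintf '%02x', $_ } @bytes), "\n";   # 89 50 4e 47 0d 0a

# The first byte on its own: substr offsets start at 0.
my $first = ord substr $data, 0, 1;
printf "%02x\n", $first;                                     # 89
```

pack 'C*', @bytes goes the other way and rebuilds the string from the list of byte values.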
You probably want to use sysopen and sysread if you want to read bytes from a binary file.
See also perlopentut.
Whether this is appropriate or necessary depends on what exactly you are trying to do.
#!/usr/bin/perl -l
use strict; use warnings;
use autodie;
use Fcntl;
sysopen my $bin, 'test.png', O_RDONLY;
sysread $bin, my $header, 4;
print map { sprintf '%02x', ord($_) } split //, $header;
Output:
C:\Temp> t
89504e47
Strings are strings of "characters", which are bigger than a byte.[1] You can store bytes in them and manipulate them as though they are characters, taking substrs of them and so on, and so long as you're just manipulating entities in memory, everything is pretty peachy. The data storage is weird, but that's mostly not your problem.[2]
When you try to read and write from files, the fact that your characters might not map to bytes becomes important and interesting. Not to mention annoying. This annoyance is actually made a bit worse by Perl trying to do what you want in the common case: If all the characters in the string fit into a byte and you happen to be on a non-Windows OS, you don't actually have to do anything special to read and write bytes. Perl will complain, however, if you have stored a non-byte-sized character and try to write it without giving it a clue about what to do with it.
This is getting a little far afield, largely because encoding is a large and confusing topic. Let me leave it off there with some references: Look at Encode(3perl), open(3perl), perldoc open, and perldoc binmode for lots of hilarious and gory details.
So the summary answer is "Yes, you can treat strings as though they contained bytes if they do in fact contain bytes, which you can assure by only reading and writing bytes.".
1: Or pedantically, "which can express a larger range of values than a byte, though they are stored as bytes when that is convenient". I think.
2: For the record, strings in Perl are internally represented by a data structure called a 'PV' which in addition to a character pointer knows things like the length of the string and the current value of pos.[3]
3: Well, it will start storing the current value of pos if it starts being interesting. See also
use Devel::Peek;
my $x = "bluh bluh bluh bluh";
Dump($x);
$x =~ /bluh/mg;
Dump($x);
$x =~ /bluh/mg;
Dump($x);
It might help more if you tell us what you are trying to do with the byte array. There are various ways to work with binary data, and each lends itself to a different set of tools.
Do you want to convert the data into a Perl array? If so, pack and unpack are a good start. split could also come in handy.
Do you want to access individual elements of the string without unpacking it? If so, substr is fast and will do the trick for 8-bit data. If you want other bit depths, take a look at the vec function, which treats a string as a bit vector.
Do you want to scan the string and convert certain bytes to other bytes? Then the s/// or tr/// constructs might be useful.
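A brief sketch of the substr and vec options, using a made-up four-byte sample string: vec takes the string, an element index, and the element width in bits, and for 16-bit elements it reads the bytes in big-endian order.

```perl
use strict;
use warnings;

# Sample data (an assumption for illustration), including null bytes.
my $data = "\x00\x00\x84\x00";

# Third byte via substr plus ord, and via vec with 8-bit elements.
print ord(substr $data, 2, 1), "\n";   # 132
print vec($data, 2, 8), "\n";          # 132

# Second 16-bit element (bytes 2 and 3), read big-endian: 0x8400.
print vec($data, 1, 16), "\n";         # 33792
```

vec can also be used as an lvalue, so vec($data, 2, 8) = 0xff would overwrite the third byte in place.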
Allow me just to post a small example about treating a string as a binary array, since I myself found it difficult to believe that something called "substr" would handle null bytes; but seemingly it does. Below is a snippet of a Perl debugger terminal session (with both string and array/list approaches):
$ perl -d
Loading DB routines from perl5db.pl version 1.32
Editor support available.
Enter h or `h h' for help, or `man perldebug' for more help.
^D
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.
DB<1> $str="\x00\x00\x84\x00"
DB<2> print $str
�
DB<3> print unpack("H*",$str) # show content of $str as hex via `unpack`
00008400
DB<4> $str2=substr($str,2,2)
DB<5> print unpack("H*",$str2)
8400
DB<6> $str2=substr($str,1,3)
DB<7> print unpack("H*",$str2)
008400
[...]
DB<30> @stra=split('',$str); print @stra # convert string to array (by splitting at empty string)
�
DB<31> print unpack("H*",$stra[3]) # print indiv. elems. of array as hex
00
DB<32> print unpack("H*",$stra[2])
84
DB<33> print unpack("H*",$stra[1])
00
DB<34> print unpack("H*",$stra[0])
00
DB<35> print unpack("H*",join('',@stra[1..3])) # print only portion of array/list via indexes (using range [two dots] operator)
008400