SOLVED: Hash content access is inconsistent with different perl version - perl

I came across an interesting problem with following piece of code in perl 5.22.1 and perl 5.30.0
use strict;
use warnings;
use feature 'say';
#use Data::Dumper;
my %hash;
my %seen;
my @header = split ',', <DATA>;
chomp @header;
while(<DATA>) {
next if /^\s*$/;
chomp;
my %data;
@data{@header} = split ',';
push @{$hash{person}}, \%data;
push @{$hash{Position}{$data{Position}}}, "$data{First} $data{Last}";
if( ! $seen{$data{Position}} ) {
$seen{$data{Position}} = 1;
push @{$hash{Role}}, $data{Position};
}
}
#say Dumper($hash{Position});
my $count = 0;
for my $person ( @{$hash{person}} ) {
say "Person: $count";
say "Role: $person->{Position}";
}
say "---- Groups ----\n";
while( my($p,$m) = each %{$hash{Position}} ) {
say "-> $p";
my $members = join(',',@{$m});
say "-> Members: $members\n";
}
say "---- Roles ----";
say '-> ' . join(', ',@{$hash{Role}});
__DATA__
First,Last,Position
John,Doe,Developer
Mary,Fox,Manager
Anna,Gulaby,Developer
If the code is run as is, everything works fine.
Now it is sufficient to add a $count++ increment, as below, and the code produces errors:
my $count = 0;
for my $person ( @{$hash{person}} ) {
$count++;
say "Person: $count";
say "Role: $person->{Position}";
}
Errors:
Error(s), warning(s):
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 22, <DATA> line 2.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 23, <DATA> line 2.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 24, <DATA> line 2.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 22, <DATA> line 3.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 23, <DATA> line 3.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 22, <DATA> line 4.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 23, <DATA> line 4.
Use of uninitialized value in concatenation (.) or string at source_file.pl line 35, <DATA> line 4.
Use of uninitialized value in concatenation (.) or string at source_file.pl line 35, <DATA> line 4.
Use of uninitialized value in concatenation (.) or string at source_file.pl line 35, <DATA> line 4.
Use of uninitialized value in join or string at source_file.pl line 48, <DATA> line 4.
This problem does not manifest itself in perl 5.30.0 (Windows 10, Strawberry Perl) or Perl v5.24.2.
Note: the problem manifests itself not only with $count++ but with any other access to content of the hash next to say "Person: $count"; -- post# 60653651
I would like to hear comments on this situation, what is the cause?
CAUSE: the input data has line endings in DOS form (\r\n). When the data is processed on Linux, chomp removes only \n, leaving \r as part of the field name (used as a hash key). Thanks go to Shawn for pointing out the source of the issue.
SOLUTION: a universal fix was implemented in the form of a snip_eol($arg) subroutine
use strict;
use warnings;
use feature 'say';
my $debug = 0;
say "
Perl: $^V
OS: $^O
-------------------
" if $debug;
my %hash;
my %seen;
my @header = split ',', <DATA>;
$header[2] = snip_eol($header[2]); # problem fix
while(<DATA>) {
next if /^\s*$/;
my $line = snip_eol($_); # problem fix
my %data;
@data{@header} = split ',',$line;
push @{$hash{person}}, \%data;
push @{$hash{Position}{$data{Position}}}, "$data{First} $data{Last}";
if( ! $seen{$data{Position}} ) {
$seen{$data{Position}} = 1;
push @{$hash{Role}}, $data{Position};
}
}
#say Dumper($hash{Position});
my $count = 0;
for my $person ( @{$hash{person}} ) {
$count++;
say "-> Name: $person->{First} $person->{Last}";
say "-> Role: $person->{Position}\n";
}
say "---- Groups ----\n";
while( my($p,$m) = each %{$hash{Position}} ) {
say "-> $p";
my $members = join(',',@{$m});
say "-> Members: $members\n";
}
say "---- Roles ----";
say '-> ' . join(', ',@{$hash{Role}});
sub snip_eol {
my $data = shift; # problem fix
#map{ say "$_ => " . ord } split '', $data if $debug;
$data =~ s/\r// if $^O eq 'linux';
chomp $data;
#map{ say "$_ => " . ord } split '', $data if $debug;
return $data;
}
__DATA__
First,Last,Position
John,Doe,Developer
Mary,Fox,Manager
Anna,Gulaby,Developer

I can replicate this behavior on Linux by first converting the source file to Windows-style \r\n line endings and then running it. I thus suspect that in your testing of various versions you're sometimes using Windows and sometimes Linux/Unix, and not converting the file's line endings appropriately.
chomp only removes a newline character (well, the current value of $/, to be pedantic), so when used on a string with a Windows-style line ending, it leaves the carriage return. The hash key is not "Position", it's "Position\r", which is not what the rest of your code uses.
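A minimal sketch of the effect described above, and of one way to strip either style of line ending regardless of platform. The helper name strip_eol is hypothetical (it is not the snip_eol from the question), and \R assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing line ending, whether it is
# "\n", "\r\n", or a lone "\r". \R requires Perl 5.10+.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\R\z//;
    return $line;
}

# A header line as it would look when read from a file with DOS line endings.
my $dos_line = "First,Last,Position\r\n";

# chomp removes only the value of $/ ("\n" by default), so "\r" survives.
chomp(my $chomped = $dos_line);
my @naive = split /,/, $chomped;
my @clean = split /,/, strip_eol($dos_line);

printf "after chomp:     last key has %d characters\n", length $naive[-1];
printf "after strip_eol: last key has %d characters\n", length $clean[-1];
```

With the chomp-only version the last key is "Position\r", so a later lookup of $data{Position} finds nothing.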

Related

Perl - Compare <STDIN> to Hash Key & Value

I am trying to compare if two inputs ($name, $place) match the respective key and value of a hash. So, if $name matches a key and $place matches that key's value, "Correct" is printed. My code unfortunately is incorrect. Any suggestions? Thanks!
use 5.010;
use strict;
use warnings;
my ($name, $place, %hash, %hash2);
%hash = (
Dominic => 'Melbourne',
Stella => 'Beijing',
Alex => 'Oakland',
);
%hash2 = reverse %hash;
print "Enter name: ";
$name = <STDIN>;
print "Enter place: ";
$place = <STDIN>;
chomp ($name, $place);
if ($name eq $hash{$name} && $place eq $hash2{$place}) {
print "Correct!\n";
} else {
print "NO!\n";
}
A lot could be done to improve this (unrelated to the question), but here is the minimal solution necessary:
use 5.010;
use strict;
use warnings;
my %hash = (
Dominic => 'Melbourne',
Stella => 'Beijing',
Alex => 'Oakland',
);
print "Enter name: ";
my $name = <STDIN>;
print "Enter place: ";
my $place = <STDIN>;
if ($name and $place) {
chomp ($name, $place);
if (exists($hash{$name}) and ($place eq $hash{$name})) {
print "Correct!\n";
} else {
print "NO!\n";
}
} else {
print "ERROR: Both name and place required to make this work!";
}
As you are reading from STDIN you need to sanity check the input otherwise you get these problems in your result (not to mention the "Correct!" at the end) with unexpected input:
Enter name:
Enter place:
Use of uninitialized value $name in chomp at original.pl line 19.
Use of uninitialized value $place in chomp at original.pl line 19.
Use of uninitialized value $name in hash element at original.pl line 22.
Use of uninitialized value $name in string eq at original.pl line 22.
Use of uninitialized value in string eq at original.pl line 22.
Use of uninitialized value $place in hash element at original.pl line 22.
Use of uninitialized value $place in string eq at original.pl line 22.
Use of uninitialized value in string eq at original.pl line 22.
Correct!
With error checking in place, the output should instead be:
Enter name:
Enter place:
ERROR: Both name and place required to make this work!
PS: Please bear with my variable declaration changes, it's just OCD from me, unrelated to the question at hand. Like I said a lot could be done.

Error use of uninitialized value although it is initialized

I am trying to build a table from the content of an input file, but it constantly gives me an error
Use of uninitialized value $ac[3] in concatenation (.) or string at table.pl
line 58 (#1)
and
Use of uninitialized value $or[2] in concatenation (.) or string at table.pl
line 61 (#1)
and although I have tried almost every possible change, it still gives me an error and does not print properly.
This is how my input file looks like:
HEADER OXIDOREDUCTASE 08-JUN-12 2LU5
EXPDTA SOLID-STATE NMR
REMARK 2 RESOLUTION. NOT APPLICABLE.
HETNAM CU COPPER (II) ION
HETNAM ZN ZINC
FORMUL 2 CU CU 2+
FORMUL 2 ZN ZN 2+
END
This is a script I am using:
#!/usr/bin/env perl
use strict;
use warnings;
use diagnostics;
#my $testfile=shift;
open(INPUT, "$ARGV[0]") or die 'Cannot make it';
my @file=<INPUT>;
close INPUT;
my @ac=();
my @dr=();
my @os=();
my @or=();
my @fo=();
for (my $line=0;$line<=$#file;$line++)
{
chomp($file[$line]);
if ($file[$line] =~ /^HEADER/)
{
print( (split '\s+', $file[$line])[-1]);
print "\t";
while ($file[$line] !~ /^END /)
{
$line++;
if ($file[$line]=~/^EXPDTA/)
{
$file[$line]=~s/^EXPDTA//;
@os=(@os,split '\s+', $file[$line]);
}
if ($file[$line] =~ /^REMARK 2 RESOLUTION./)
{
$file[$line]=~s/^REMARK 2 RESOLUTION.//;
@ac = (@ac,split'\s+',$file[$line]);
}
if ($file[$line] =~ /^HETNAM/)
{
$file[$line]=~s/^HETNAM//;
$file[$line] =~ s/\s+//;
push @dr, $file[$line];
}
if ($file[$line] =~ /^SOURCE 2 ORGANISM_SCIENTIFIC/)
{
$file[$line]=~s/^SOURCE 2 ORGANISM_SCIENTIFIC//;
@or = (@or,split'\s+',$file[$line]);
}
if ($file[$line] =~ /^FORMUL/)
{
$file[$line]=~s/^FORMUL//;
$file[$line] =~ s/\s+//;
push @fo, $file[$line];
}
}
print "$os[1] $os[2]\t";
print "\t";
@os=();
print "$ac[3] $ac[4]\t" or die "Cannot be printed"; #line 58
print "\t";
@ac=();
print "$or[2] $or[3]\t" or die "Cannot be printed"; #line 61
print "\t";
@or=();
foreach (@dr)
{
print "$_";
print "\t\t\t\t\t";
}
@dr=();
print "\n";
}
}
And this is the output it gives me, but it doesn't seem to print properly and I am really not sure why:
2LU5 SOLID-STATE NMR CU COPPER (II) ION
Desired output that I am expecting is :
HEADER EXPDTA REMARK HETNAM FORMUL
OXIDOREDUCTASE 2LU5 SOLID-STATE NMR RESOLUTION. NOT APPLICABLE. COPPER (II) ION (here better to say last column because certain diversity exists before "copper") CU 2+
ZN ZINC ZN 2+
The root of your error is that:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my @ac = ();
my $str = "REMARK 2 RESOLUTION. NOT APPLICABLE. ";
$str =~ s/^REMARK 2 RESOLUTION.//;
@ac = ( @ac, split '\s+', $str );
print Dumper \@ac;
The contents of @ac are:
$VAR1 = [
'',
'NOT',
'APPLICABLE.'
];
There is no $ac[3], you only have elements 0,1,2 in there.
With your @or error, you don't have any lines matching: /^SOURCE 2 ORGANISM_SCIENTIFIC/
So that array is empty, and that too, means you've got no $or[2] to print.
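As a small illustration of why the indices come out lower than expected, here is a sketch comparing split on /\s+/ with the special-case split on ' ', plus a guard before indexing. The sample string is the remainder of the REMARK line after the substitution; the '(missing)' placeholder is an assumption made for the example.

```perl
use strict;
use warnings;

# What is left of the REMARK line once the prefix has been removed.
my $str = " NOT APPLICABLE. ";

my @by_regex = split /\s+/, $str;   # keeps an empty leading field
my @by_space = split ' ',   $str;   # special case: leading whitespace is skipped

print scalar(@by_regex), " fields with /\\s+/\n";
print scalar(@by_space), " fields with ' '\n";

# Guard before indexing so a short line cannot trigger an
# "uninitialized value" warning (the // operator needs Perl 5.10+).
my $fourth = $by_regex[3] // '(missing)';
print "fourth field: $fourth\n";
```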
More generally - what you're doing here is actually really quite clunky, and there's a much cleaner solution.
How about:
#!/usr/bin/env perl
use strict;
use warnings;
#set the text "END" as our record separator
local $/ = 'END';
#define the fields to print out.
my @field_order = qw ( HEADER EXPDTA REMARK HETNAM FORMUL );
print join ( ",", @field_order), "\n"; #print header row
#iterate STDIN or file named on command line.
#just like you're doing with open (FILE, $ARGV[0])
while ( <> ) {
#select key value pairs into a hash - first word on the line is the 'key'
#and the value is 'anything else'.
my %this_entry = m/^(\w+)\s+(.*)$/gm;
next unless $this_entry{'HEADER'}; #check we have a header.
s/\s+/ /g for values %this_entry; #strip repeated spaces from fields;
s/\s+$//g for values %this_entry; #strip trailing whitespace.
#split 'header' row into separate subfields
#this is an example of how you could transform other fields.
($this_entry{'HEADER'}, $this_entry{'DATE'}, $this_entry{'STRUCT'} ) = split ' ', $this_entry{'HEADER'};
print join (",", @this_entry{@field_order} ), "\n";
}
This will - given your input - print:
HEADER,DATE,STRUCT,EXPDTA,REMARK,HETNAM,FORMUL
OXIDOREDUCTASE,08-JUN-12,2LU5,SOLID-STATE NMR,2 RESOLUTION. NOT APPLICABLE.,CU COPPER (II) ION,2 CU CU 2+
Which isn't quite what your output matches, but hopefully it's illustrated how much simpler this task could be?

Sorting 5th column in descending order error message

The text file I am trying to sort:
MYNETAPP01-NY
700000123456
Filesystem total used avail capacity Mounted on
/vol/vfiler_PROD1_SF_NFS15K01/ 1638GB 735GB 903GB 45% /vol/vfiler_PROD1_SF_NFS15K01/
/vol/vfiler_PROD1_SF_NFS15K01/.snapshot 409GB 105GB 303GB 26% /vol/vfiler_PROD1_SF_NFS15K01/.snapshot
/vol/vfiler_PROD1_SF_isci_15K01/ 2048GB 1653GB 394GB 81% /vol/vfiler_PROD1_SF_isci_15K01/
snap reserve 0TB 0TB 0TB ---% /vol/vfiler_PROD1_SF_isci_15K01/..
I am trying to sort this text file by its 5th column (the capacity field) in descending order.
When I first started this there was a percentage symbol mixed with the numbers. I solved this by substituting the value like so: s/%/ %/g for @data;. This made it easier to sort the numbers alone. Afterwards I will change it back to the way it was with s/ %/%/g.
After running the script, I received this error:
#ACI-CM-L-53:~$ ./netapp.pl
Can't use string ("/vol/vfiler_PROD1_SF_isci_15K01/"...) as an ARRAY ref while "strict refs" in use at ./netapp.pl line 20, <$DATA> line 24 (#1)
(F) You've told Perl to dereference a string, something which
use strict blocks to prevent it happening accidentally. See
"Symbolic references" in perlref. This can be triggered by an @ or $
in a double-quoted string immediately before interpolating a variable,
for example in "user @$twitter_id", which says to treat the contents
of $twitter_id as an array reference; use a \ to have a literal @
symbol followed by the contents of $twitter_id: "user \@$twitter_id".
Uncaught exception from user code:
Can't use string ("/vol/vfiler_PROD1_SF_isci_15K01/"...) as an ARRAY ref while "strict refs" in use at ./netapp.pl line 20, <$DATA> line 24.
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
open (my $DATA, "<raw_info.txt") or die "$!";
my $systemName = <$DATA>;
my $systemSN = <$DATA>;
my $header = <$DATA>;
my @data;
while ( <$DATA> ) {
@data = (<$DATA>);
}
s/%/ %/g for @data;
s/---/000/ for @data;
print @data;
my @sorted = sort { $b->[5] <=> $a->[5] } @data;
print @sorted;
close($DATA);
Here is an approach using Text::Table which will nicely align your output into neat columns.
#!/usr/bin/perl
use strict;
use warnings;
use Text::Table;
open my $DATA, '<', 'file1' or die $!;
<$DATA> for 1 .. 2; # throw away first two lines
chomp(my $hdr = <$DATA>); # header
my $tbl = Text::Table->new( split ' ', $hdr, 6 );
$tbl->load( map [split /\s{2,}/], sort by_percent <$DATA> );
print $tbl;
sub by_percent {
my $keya = $a =~ /(\d+)%/ ? $1 : '0';
my $keyb = $b =~ /(\d+)%/ ? $1 : '0';
$keyb <=> $keya
}
The output generated is:
Filesystem total used avail capacity Mounted on
/vol/vfiler_PROD1_SF_isci_15K01/ 2048GB 1653GB 394GB 81% /vol/vfiler_PROD1_SF_isci_15K01/
/vol/vfiler_PROD1_SF_NFS15K01/ 1638GB 735GB 903GB 45% /vol/vfiler_PROD1_SF_NFS15K01/
/vol/vfiler_PROD1_SF_NFS15K01/.snapshot 409GB 105GB 303GB 26% /vol/vfiler_PROD1_SF_NFS15K01/.snapshot
snap reserve 0TB 0TB 0TB ---% /vol/vfiler_PROD1_SF_isci_15K01/..
Update
To explain some of the advanced parts of the program.
my $tbl = Text::Table->new( split ' ', $hdr, 6 );
This creates the Text::Table object with the header split into 6 columns. Without the limit of 6 columns, it would have created 7 columns, because the last field, 'Mounted on', also contains a space and would have been incorrectly split into 2 columns.
$tbl->load( map [split /\s{2,}/], sort by_percent <$DATA> );
The statement above 'loads' the data into the table. The map applies a transformation to each line from <$DATA>. Each line is split into an anonymous array (created by [...]). The split is on 2 or more whitespace characters, \s{2,}. If that wasn't specified, the data 'snap reserve', with 1 space, would have been incorrectly split.
I hope this makes what's going on clearer.
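The two split calls explained above can be tried in isolation. This is only a sketch: the sample row assumes that columns in the real file are separated by at least two spaces, which is what the \s{2,} pattern relies on.

```perl
use strict;
use warnings;

my $hdr = "Filesystem total used avail capacity Mounted on";

my @unlimited = split ' ', $hdr;      # "Mounted" and "on" become separate fields
my @limited   = split ' ', $hdr, 6;   # the 6th field keeps the rest of the line

print scalar(@unlimited), " columns without a limit\n";
print scalar(@limited), " columns with a limit of 6, last one: '$limited[-1]'\n";

# Assumed sample row with columns separated by two or more spaces.
my $row  = "snap reserve    0TB    0TB    0TB    ---%    /vol/example/..";
my @cols = split /\s{2,}/, $row;      # the single space in "snap reserve" is kept

print scalar(@cols), " columns in the data row, first one: '$cols[0]'\n";
```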
And a simpler example that doesn't align the columns like Text::Table, but leaves them in the form they originally were read might be:
open my $DATA, '<', 'file1' or die $!;
<$DATA> for 1 .. 2; # throw away first two lines
my $hdr = <$DATA>; # header
print $hdr;
print sort by_percent <$DATA>;
sub by_percent {
my $keya = $a =~ /(\d+)%/ ? $1 : '0';
my $keyb = $b =~ /(\d+)%/ ? $1 : '0';
$keyb <=> $keya
}
In addition to skipping the fourth line of the file, this line is wrong
my @sorted = sort { $b->[5] <=> $a->[5] } @data
But presumably you knew that as the error message says
at ./netapp.pl line 20
$a and $b are lines of text from the array @data, but you're treating them as array references. It looks like you need to extract the fifth "field" from both variables before you compare them, but no one can tell you how to do that
Your code is quite far from what you want. Trying to change it as little as possible, this works:
#!/usr/bin/perl
use strict;
use warnings;
open (my $fh, "<", "raw_info.txt") or die "$!";
my $systemName = <$fh>;
my $systemSN = <$fh>;
my $header = <$fh>;
my @data;
while( my $d = <$fh> ) {
chomp $d;
my @fields = split '\s{2,}', $d;
if( scalar @fields > 4 ) {
$fields[4] = $fields[4] =~ /(\d+)/ ? $1 : 0;
push @data, [ @fields ];
}
}
foreach my $i ( @data ) {
print join("\t", @$i), "\n";
}
my @sorted = sort { $b->[4] <=> $a->[4] } @data;
foreach my $i ( @sorted ) {
$i->[4] .= '%';
print join("\t", @$i), "\n";
}
close($fh);
Let's make a few things clear:
If using the $ notation, it is customary to name file variables in lower case. It is also typical to name the file handle "fh", hence $fh.
You define but do not use the first three variables. If you don't apply chomp to them, the trailing newline stays attached to them. I have not done it as they are not used.
You are defining a list with a line in each element. But then you need a list ref inside to separate the fields.
The separation is done using split.
Empty lines are skipped by counting the number of fields.
I use something more compact to get rid of the % and transform the --- into a 0.
Lines are added to list @data using push and turning the list to add into a list ref with [ @list ].
A list of list refs needs two loops to get printed. One traverses the list (foreach), another (implicit in join) the columns.
Now you can sort the list and print it out in the same way. By the way, Perl lists (or arrays) start at index 0, so the 5th column is 4.
This is not the way I would have coded it, but I hope it is clear to you as it is close to your original code.

Use of uninitialized value but variables declared

use strict;
use warnings;
my $last_variable2= 'abc';
print "last var2 $last_variable2\n";
my @grouped;
while (<DATA>) {
my ($variable1,
$variable2,
$other_data) = split ',',$_,3;
if($variable2 ne 'abc'){
if( $variable2 ne $last_variable2){
print "\n\n";
print "'$variable2' doesn't equal '$last_variable2'\n";
my %HoA;
&process_data(@grouped_series);
@grouped = ();
}
}else{
print "Skipped this because it's the first\n";
}
push @grouped_series, $_;
$last_variable2 = $variable2;
}
When I run this code, I keep getting
Use of uninitialized value $last_variable2 in string ne at 1_1_correspondencer.pl line 32, <DATA> line 3.
Use of uninitialized value $variable2 in string ne at 1_1_correspondencer.pl line 33, <DATA> line 3.
Use of uninitialized value $last_variable2 in concatenation (.) or string at 1_1_correspondencer.pl line 36, <DATA> line 6.
But, I initialized both variables. Sorry, this is a naive question--I only just started using strict and warnings
When parsing your DATA, you don't verify that each of these variables is defined:
my ($variable1,
$variable2,
$other_data) = split ',',$_,3;
If there are no commas on a row, then $variable2 would be undefined which is later assigned to $last_variable2. Maybe add some data verification to take into account that case?
if (! defined $variable2) {
warn "missing variable2 definition: $_\n";
}
Without seeing your actual data, we can't really advise you more.
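One possible shape for that verification, as a sketch only: the sub name parse_row and the sample lines are hypothetical, and the rule "at least two fields" is an assumption about what counts as a valid row.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line into at most three fields and return
# an empty list when the second field would be missing.
sub parse_row {
    my ($line) = @_;
    chomp $line;
    my @fields = split /,/, $line, 3;
    return unless @fields >= 2;   # no comma on the line: reject it
    return @fields;
}

for my $line ("a1,abc,rest of the data\n", "no commas here\n") {
    my ($variable1, $variable2, $other_data) = parse_row($line);
    if (defined $variable2) {
        print "variable2 is '$variable2'\n";
    } else {
        print "skipping malformed line: $line";
    }
}
```

Rejecting the row before it reaches the comparison keeps undef from being copied into $last_variable2.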

perl: Use of uninitialized value and output is truncated

I am trying to use the following script to shuffle the order of sequences (lines) within a file. I'm not sure how to "initialize" values -- please help!
print "Please enter filename (without extension): ";
my $input = <>;
chomp $input;
use strict;
use warnings;
print "Please enter total no. of sequence in fasta file: ";
my $orig_size = <>*2-1;
chomp $orig_size;
open INFILE, "$input.fasta"
or die "Error opening input file for shuffling!";
open SHUFFLED, ">"."$input"."_shuffled.fasta"
or die "Error creating shuffled output file!";
my @array = (0); # Need to initialise 1st element in array1&2 for the shift function
my @array2 = (0);
my $i = 1;
my $index = 0;
my $index2 = 0;
while (my @line = <INFILE>){
while ($i <= $orig_size) {
$array[$i] = $line[$index];
$array[$i] =~ s/(.)\s/$1/seg;
$index++;
$array2[$i] = $line[$index];
$array2[$i] =~ s/(.)\s/$1/seg;
$i++;
$index++;
}
}
my $array = shift (@array);
my $array2 = shift (@array2);
for ($i = my $header_size; $i >= 0; $i--) {
my $j = int rand ($i+1);
next if $i == $j;
@array[$i,$j] = @array[$j,$i];
@array2[$i,$j] = @array2[$j,$i];
}
while ($index2 <= my $header_size) {
print SHUFFLED "$array[$index2]\n";
print SHUFFLED "$array2[$index2]\n";
$index2++;
}
close INFILE;
close SHUFFLED;
I'm getting these warnings:
Use of uninitialized value in substitution (s///) at fasta_corrector6.pl line 27, <INFILE> line 578914.
Use of uninitialized value in substitution (s///) at fasta_corrector6.pl line 31, <INFILE> line 578914.
Use of uninitialized value in numeric ge (>=) at fasta_corrector6.pl line 40, <INFILE> line 578914.
Use of uninitialized value in addition (+) at fasta_corrector6.pl line 41, <INFILE> line 578914.
Use of uninitialized value in numeric eq (==) at fasta_corrector6.pl line 42, <INFILE> line 578914.
Use of uninitialized value in numeric le (<=) at fasta_corrector6.pl line 47, <INFILE> line 578914.
Use of uninitialized value in numeric le (<=) at fasta_corrector6.pl line 50, <INFILE> line 578914.
First, you read the whole input file in:
use IO::File;
my @lines = IO::File->new($file_name)->getlines;
then you shuffle it:
use List::Util 'shuffle';
my @shuffled_lines = shuffle(@lines);
then you write them out:
IO::File->new($new_file_name, "w")->print(@shuffled_lines);
There's an entry in the Perl FAQ about how to shuffle an array. Another entry tells of the many ways to read a file in one go. Perl FAQs contain a lot of samples and trivia on how to do many common things -- it's a good place to continue learning more about Perl.
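If each sequence occupies two consecutive lines (a header line followed by a sequence line, which is what the question's paired arrays suggest), shuffling individual lines would separate them. A hedged sketch that shuffles whole pairs instead; the sub name shuffle_pairs and the sample records are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: treat every two consecutive lines as one record
# and shuffle the records, keeping each header next to its sequence.
sub shuffle_pairs {
    my @lines = @_;
    die "Expected an even number of lines\n" if @lines % 2;

    my @records;
    while (my @pair = splice @lines, 0, 2) {
        push @records, [@pair];
    }
    return map { @$_ } shuffle @records;
}

# Made-up sample data in the assumed two-line-per-record layout.
my @fasta = (">seq1\n", "ACGT\n", ">seq2\n", "GGCC\n", ">seq3\n", "TTAA\n");
my @shuffled = shuffle_pairs(@fasta);
print @shuffled;
```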
On your previous question I gave this answer, and noted that your code failed because you had not initialized a variable named $header_size used in a loop condition. Not only have you repeated that mistake, you have elaborated on it by starting to declare the variable with my each time you try to access it.
for ($i = my $header_size; $i >= 0; $i--) {
# ^^--- wrong!
while ($index2 <= my $header_size) {
# ^^--- wrong!
A variable that is declared with my is empty (undef) by default. $header_size can never contain anything but undef here, and your loop will run only once, because 0 <= undef evaluates to true (albeit with an uninitialized warning).
Please take my advice and set a value for $header_size. And only use my when declaring a variable, not every time you use it.
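A minimal sketch of the difference, using made-up names (@items, $last_index) rather than the variables from the question: declare the variable once with a value, then refer to it without my.

```perl
use strict;
use warnings;

my @items = qw(a b c d);

# A freshly declared "my" variable holds undef until something is assigned.
my $fresh;
print "fresh is ", (defined $fresh ? "defined" : "undef"), "\n";

# Declare and assign once...
my $last_index = $#items;

# ...then reuse the variable without "my" wherever it is needed.
my @visited;
for (my $i = $last_index; $i >= 0; $i--) {
    push @visited, $items[$i];
}
print "@visited\n";
```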
A better solution
Seeing your errors above, it seems that your input files are rather large. If you have over 500,000 lines in your files, it means your script will consume large amounts of memory to run. It may be worthwhile for you to use a module such as Tie::File and work only with array indexes. For example:
use strict;
use warnings;
use Tie::File;
use List::Util qw(shuffle);
tie my @file, 'Tie::File', $filename or die $!;
for my $lineno (shuffle 0 .. $#file) {
print "$file[$lineno]\n"; # Tie::File strips the line ending from each record
}
untie @file; # all done
I cannot pinpoint what exactly went wrong, but there are a few oddities with your code:
The Diamond Operator
Perl's diamond operator <FILEHANDLE> reads a line from the filehandle. If no filehandle is provided, each command-line argument (@ARGV) is treated as a file and read. If there are no arguments, STDIN is used. It is better to specify this yourself. You should also chomp before you do arithmetic with the line, not afterwards. Note that strings that do not start with a number are treated as numeric 0. You should check that the input is numeric (with a regex?) and include error handling.
The diamond/readline operator is context sensitive. In scalar context (e.g. a conditional or a scalar assignment) it returns one line. In list context, e.g. as a function parameter or an array assignment, it returns all lines as a list. So
while (my @line = <INFILE>) { ...
will not give you one line but all lines and is thus equivalent to
my @line;
if (@line = <INFILE>) { ...
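The context sensitivity can be observed with an in-memory filehandle. This is only a sketch; the three-line sample text is made up.

```perl
use strict;
use warnings;

my $text = "line 1\nline 2\nline 3\n";

# Open a read handle on the string instead of a real file.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $first = <$fh>;   # scalar context: exactly one line
my @rest  = <$fh>;   # list context: every remaining line at once

close $fh;

print "scalar read: $first";
print "list read returned ", scalar(@rest), " lines\n";
```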
Array gymnastics
After you read in the lines, you try to do some manual chomping. Here I remove all trailing whitespace in @line, in a single line:
s/\s+$// foreach @line;
And here, I remove all non-leading whitespace (which is in fact what your regex is doing):
s/(?<!^)\s//g foreach @line;
To stuff an element alternatingly into two arrays, this might work as well:
for my $i (0 .. $#line) {
if ($i % 2) {
push @array1, shift @line;
} else {
push @array2, shift @line;
}
}
or
my $i = 0;
while (@line) {
push @{ $i++ % 2 ? \@array1 : \@array2 }, shift @line;
}
Manual bookkeeping of array indices is messy and error-prone.
Your for-loop could be written more idiomatically as
for my $i (reverse 0 .. $header_size)
Do note that declaring $header_size inside the loop initialisation is possible if it was not declared before, but it will yield the undef value. You therefore assigned undef to $i, which leads to some of the warnings, as undef should not be used in arithmetic operations. An assignment always assigns the right side to the left side.