Perl Function Not Running - perl

My Perl code:
use strict;
use warnings;
my $filename = 'data.txt';
open(my $fh, '<:encoding(UTF-8)', $filename)
or die "Could not open file '$filename' $!";
while (my $row = <$fh>) {
chomp $row;
@fields = split(/:/, $row);
if($fields[7] eq "?" || $fields[7] eq NULL)
print "$row\n";
}
while (my $row = <$fh>) {
chomp $row;
@fields = split(/:/, $row);
print '$fields: ' , replace($fields[1],"",$fields[1]) , "\n";}
}
while (my $row = <$fh>) {
chomp $row;
@fields = split(/:/, $row);
if (index($fields[2], "-") != -1) {
print "$row\n";
}
while (my $row = <$fh>) {
chomp $row;
@fields = split(/:/, $row);
if($fields[7] eq "Voyager2")
print "$row\n";
}
Error Message:
"my" variable $row masks earlier declaration in same statement at main.pl line 24.
"my" variable $row masks earlier declaration in same statement at main.pl line 25.
Global symbol "@fields" requires explicit package name (did you forget to declare "my @fields"?) at main.pl line 10.
Global symbol "@fields" requires explicit package name (did you forget to declare "my @fields"?) at main.pl line 11.
Global symbol "@fields" requires explicit package name (did you forget to declare "my @fields"?) at main.pl line 11.
syntax error at main.pl line 12, near ")
print"
Global symbol "@fields" requires explicit package name (did you forget to declare "my @fields"?) at main.pl line 18.
Global symbol "@fields" requires explicit package name (did you forget to declare "my @fields"?) at main.pl line 19.
Global symbol "@fields" requires explicit package name (did you forget to declare "my @fields"?) at main.pl line 19.
Unmatched right curly bracket at main.pl line 20, at end of line
syntax error at main.pl line 20, near "}"
Can't redeclare "my" in "my" at main.pl line 30, near "(my"
main.pl has too many errors.
Not sure where I went wrong and how to fix errors

Your inconsistent indentation is masking the problems. Let's fix the indentation and highlight each issue, skipping the error-free parts.
while (my $row = <$fh>) {
    chomp $row;
    @fields = split(/:/, $row);    # @fields undeclared
    if($fields[7] eq "?" || $fields[7] eq NULL)    # missing curly braces { } and bareword NULL needs quotes
        print "$row\n";
}
while (my $row = <$fh>) {
    chomp $row;
    @fields = split(/:/, $row);
    print '$fields: ' , replace($fields[1],"",$fields[1]) , "\n";
}    # EXTRA curly brace here
}
while (my $row = <$fh>) {
    chomp $row;
    @fields = split(/:/, $row);
    if (index($fields[2], "-") != -1) {
        print "$row\n";
    }
# MISSING curly brace here
    while (my $row = <$fh>) {    # this loop is now part of the previous one, and the
        chomp $row;              # duplicate variable errors start
        @fields = split(/:/, $row);
        if($fields[7] eq "Voyager2") print "$row\n";    # MISSING curly braces
    }
So what happens if we fix all these things? Well, we get this, which produces no errors.
my $filename = 'data.txt';
open(my $fh, '<:encoding(UTF-8)', $filename)
    or die "Could not open file '$filename' $!";
while (my $row = <$fh>) {
    chomp $row;
    my @fields = split(/:/, $row);
    if($fields[7] eq "?" || $fields[7] eq "NULL") {
        print "$row\n";
    }
}
while (my $row = <$fh>) {
    chomp $row;
    my @fields = split(/:/, $row);
    print '$fields: ' , replace($fields[1],"",$fields[1]) , "\n";
}
while (my $row = <$fh>) {
    chomp $row;
    my @fields = split(/:/, $row);
    if (index($fields[2], "-") != -1) {
        print "$row\n";
    }
}
while (my $row = <$fh>) {
    chomp $row;
    my @fields = split(/:/, $row);
    if($fields[7] eq "Voyager2") {
        print "$row\n";
    }
}
But what do we have now? This code will not work as expected. You have four consecutive while loops reading from the same filehandle, and the first loop exhausts it, leaving the other loops with no lines to process, so their bodies never execute. Unfortunately, since it is hard to say what you were trying to do here, it's hard to say how to fix it.
My guess is that you thought you could read lines in order with a while loop. Perhaps last is what you are looking for, to skip out of the while loop when you find a condition. Perhaps while is wrong for you, and you meant to take line 1, then line 2, etc. Then you need to remove all the while loops.
Without more information, the TL;DR is: Your code does not work.
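If the intent was to run every check against every line, one option is a single loop that applies all the tests to each row. The following is only a sketch under that assumption: the sample rows, the matching_checks helper and its labels are made up for illustration, and an in-memory filehandle stands in for data.txt.

```perl
use strict;
use warnings;

# Hypothetical sample rows standing in for data.txt (colon-separated).
my $data = join "\n",
    'a:b:c-d:e:f:g:h:?',
    'a:b:cd:e:f:g:h:Voyager2',
    'a:b:cd:e:f:g:h:other',
    '';

# Return the labels of all checks that a row satisfies.
sub matching_checks {
    my ($row) = @_;
    my @fields = split /:/, $row, -1;
    my @hits;
    push @hits, 'unknown'
        if defined $fields[7] && ($fields[7] eq '?' || $fields[7] eq 'NULL');
    push @hits, 'dash'
        if defined $fields[2] && index($fields[2], '-') != -1;
    push @hits, 'voyager2'
        if defined $fields[7] && $fields[7] eq 'Voyager2';
    return @hits;
}

# One pass over the handle: every row sees every check.
open(my $fh, '<', \$data) or die "Could not open in-memory file: $!";
while (my $row = <$fh>) {
    chomp $row;
    my @hits = matching_checks($row);
    print join(' ', $row, '=>', @hits), "\n" if @hits;
}
close $fh;
```

If the four passes really have to stay separate, calling seek($fh, 0, 0) before each later loop rewinds the handle so it can be read again.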

Related

Perl error Use of uninitialized value $_ in substitution (s///)

Greetings, I am trying to read a file into a hash of hashes, following this tutorial.
http://docstore.mik.ua/orelly/perl/prog3/ch09_04.htm
My text input file is
event_a1_x1: email1=xxx@gmail.com email2=yyy@gmail.com email1_cnt=3
event_a1_x2: email1=xxx@gmail.com email2=yyy@gmail.com email1_cnt=3
event_b2_y1: email1=xxx@gmail.com email2=yyy@gmail.com email1_cnt=3
event_b2_y2: email1=xxx@gmail.com email2=yyy@gmail.com email1_cnt=3
event_c3_z1: email1=xxx@gmail.com email2=yyy@gmail.com email1_cnt=3
event_c3_z2: email1=xxx@gmail.com email2=yyy@gmail.com email1_cnt=3
My code is
#!/usr/bin/perl
use strict;
use warnings;
my $file = $ARGV[0] or die "Need to get config file on the command line\n";
open(my $data, '<', $file) or die "Could not open '$file' $!\n";
my %HoH;
#open FILE, "filename.txt" or die $!;
my $key;
my $value;
my $who;
my $rec;
my $field;
while ( my $line = <$data>) {
print $line;
next unless (s/^(.*?):\s*//);
$who = $1;
#print $who;
$rec = {};
$HoH{$who} = $rec;
for $field ( split ) {
($key, $value) = split /=/, $field;
$rec->{$key} = $value;
}
}
I keep getting this error...
Use of uninitialized value $_ in substitution (s///) at ./read_config.pl line 18, <$data> line 1.
This is about when $_, "the default input and pattern-searching space", is set and used.
In while (<$fh>), what is read from the filehandle is assigned to $_. Then your regex s/// and print and split can use it. See General Variables in perlvar.
However, once we specifically assign to a variable, while (my $line = <$fh>), this deal is off and $_ is not set. So when you later use the regex substitution in a way that relies on $_ the variable is found uninitialized.
Either consistently use the default $_, or (consistently) don't. So, either
while (<$fh>) {
print;
# same as posted
}
or
while (my $line = <$fh>) {
# ...
next unless $line =~ s/^(.*?):\s*//;
# ...
foreach my $field (split ' ', $line) {
# ...
}
}
There is quite a bit more that can be improved in the code, but that would take us elsewhere.
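For reference, here is how the second variant might look when written out in full. This is a sketch rather than the poster's final code: the parse_config name is invented, and a one-line in-memory sample replaces the file given on the command line.

```perl
use strict;
use warnings;

# Parse lines of the form "name: key=value key=value ..." into a hash of
# hashes, using an explicit $line everywhere instead of relying on $_.
sub parse_config {
    my ($fh) = @_;
    my %HoH;
    while (my $line = <$fh>) {
        chomp $line;
        # Strip and capture the leading "name:"; skip lines without one.
        next unless $line =~ s/^(.*?):\s*//;
        my $who = $1;
        for my $field (split ' ', $line) {
            my ($key, $value) = split /=/, $field, 2;
            $HoH{$who}{$key} = $value;
        }
    }
    return \%HoH;
}

# Hypothetical sample taken from the input shown in the question.
my $sample = 'event_a1_x1: email1=xxx@gmail.com email2=yyy@gmail.com email1_cnt=3' . "\n";
open(my $fh, '<', \$sample) or die "Could not open in-memory file: $!";
my $config = parse_config($fh);
close $fh;

print "$config->{event_a1_x1}{email1_cnt}\n";
```

Declaring $who, $key and $value inside the loop, rather than at file scope, also keeps each variable limited to the iteration that uses it.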

Add a line after every string match

I have a sample file here http://pastebin.com/m5m40nGF
What I want to do is add a line after every instance of protein_id.
protein_id always has the same pattern:
TAB-TAB-TAB-protein_id-TAB-gnl|CorradiLab|M715_#SOME_NUMBER
What I need to do is to add this after every line of protein_id:
TAB-TAB-TAB-transcript_id-TAB-gnl|CorradiLab|M715_mRNA_#SOME_NUMBER
The catch is that #SOME_NUMBER has to stay the same.
In the first case, it would look like this:
94 1476 CDS
protein_id gnl|CorradiLab|M715_ECU01_0190
transcript_id gnl|CorradiLab|M715_mRNA_ECU01_0190
product serine hydroxymethyltransferase
label serine hydroxymethyltransferase
Thanks! Adrian
I tried a perl solution, but I get an error.
open(IN, $in); while(<IN>){
print $_;
if ($_ ~= /gnl\|CorradiLab\|/) {
$_ =~ s/tprotein_id/transcript_id/;
print $_;
}
}
Error:
syntax error at test.pl line 3, near "$_ ~"
syntax error at test.pl line 7, near "}"
Execution of test.pl aborted due to compilation errors.
The following perl script worked
my $in=shift;
open(IN, $in); while(<IN>){
print $_;
if ($_ =~ /gnl\|CorradiLab\|/) {
my $tmp = $_;
$tmp =~ s/protein_id/transcript_id/;
print $tmp;
}
}
Offering an update on existing answer because I feel it can be improved further:
Generally - the precise problem in the OP is this line:
if ($_ ~= /gnl\|CorradiLab\|/) {
Because you've got ~= not =~. That's what syntax error at test.pl line 3, near "$_ ~" is trying to tell you.
I would suggest a few improvements to this:
my $in=shift;
open(IN, $in); while(<IN>){
print $_;
if ($_ =~ /gnl\|CorradiLab\|/) {
my $tmp = $_;
$tmp =~ s/protein_id/transcript_id/;
print $tmp;
}
}
while ( my $tmp = <IN> ) { reads each line straight into $tmp, which removes the need to copy $_.
A 3-argument open with a lexical filehandle is preferable, e.g. open ( my $in, "<", $input_filename ) or die $!; (you should also test whether the open worked, as the or die does here).
An explicit open may well be unnecessary if you're just reading a filename from the command line. Using <> reads either the files named on the command line (opening and processing each) or STDIN, which makes your script a bit more versatile.
Thus I would rewrite as:
#!/usr/bin/perl
use strict;
use warnings;
while ( my $line = <> ) {
print $line;
if ( $line =~ /gnl\|CorradiLab\|/ ) {
$line =~ s/protein_id/transcript_id/;
print $line;
}
}
Or alternatively:
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
print;
if (m/gnl\|CorradiLab\|/) {
s/protein_id/transcript_id/;
print;
}
}
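Note that the scripts above only swap protein_id for transcript_id, while the question also asks for mRNA_ to be inserted after M715_. A possible sketch that does both follows. It assumes the line layout is exactly as the question describes (three tabs, protein_id, a tab, then the gnl|CorradiLab|M715_ identifier); the transcript_line helper and the two-line sample are hypothetical.

```perl
use strict;
use warnings;

# Build the transcript_id line for a protein_id line, or return nothing
# for any other line.
sub transcript_line {
    my ($line) = @_;
    return unless $line =~ /^\t\t\tprotein_id\tgnl\|CorradiLab\|M715_/;
    (my $copy = $line) =~ s/protein_id/transcript_id/;
    $copy =~ s/M715_/M715_mRNA_/;    # the rest of the identifier is untouched
    return $copy;
}

# Hypothetical sample standing in for the real file.
my $sample = "\t\t\tprotein_id\tgnl|CorradiLab|M715_ECU01_0190\n"
           . "\t\t\tproduct\tserine hydroxymethyltransferase\n";
open(my $in, '<', \$sample) or die "Could not open in-memory file: $!";
while (my $line = <$in>) {
    print $line;
    my $extra = transcript_line($line);
    print $extra if defined $extra;
}
close $in;
```

To process real files, the in-memory handle can be replaced with while ( my $line = <> ), as in the rewrites above.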

Add counter to if statement

How can I add a counter to this statement.
# go through each reference file
for my $file (@reference_files)
{
open my $ref, "<", $file or die "Can't open reference file '$file': $!";
while (my $line = <$ref>)
{
chomp $line;
my ($scaffold, undef, $type, $org_snp, $new_snp, undef, undef, undef, $info) = split /\t/, $line;
next if not $scaffold =~ /^KB/;
next if not $type =~ /^GENE/i;
my ($transcript_id, $gene_name, $auto) = split /[;][ ]/, $info;
$gene_name = $1 if $gene_name =~ /["]([^"]*)["]/;
if (my $matching_genes = $genes{$scaffold})
{
say join "\t", $gene_name, $_ for values %$matching_genes;
}
}
say "###";
}
I would like the script to additionally count all $matching_genes. Is there a way to incorporate this? I've been unsuccessful with standard counters (i.e. $i++) as it's pulling all values in the hash.
You can declare a counter variable at the top, initialized to 0, before your for loop:
my $counter = 0;
# go through each reference file
for my $file (@reference_files)
# ... Rest of your code ...
Then, you can increment $counter inside of the if statement where $matching_genes is assigned:
if (my $matching_genes = $genes{$scaffold})
{
$counter++;
say join "\t", $gene_name, $_ for values %$matching_genes;
}
my $count=0;
# go through each reference file
for my $file (@reference_files)
{
open my $ref, "<", $file or die "Can't open reference file '$file': $!";
while (my $line = <$ref>)
{
chomp $line;
my ($scaffold, undef, $type, $org_snp, $new_snp, undef, undef, undef, $info) = split /\t/, $line;
next if not $scaffold =~ /^KB/;
next if not $type =~ /^GENE/i;
my ($transcript_id, $gene_name, $auto) = split /[;][ ]/, $info;
$gene_name = $1 if $gene_name =~ /["]([^"]*)["]/;
if (my $matching_genes = $genes{$scaffold})
{
say join "\t", $gene_name, $_ for values %$matching_genes;
$count += scalar(keys %$matching_genes);
}
}
say "###";
}
print "total: $count\n";
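The two answers count different things: the first counts how many lines had a match, the second adds up how many genes matched. A small sketch with made-up data shows the difference; %genes, @scaffolds and both counter names are hypothetical stand-ins for the real script's data.

```perl
use strict;
use warnings;

# Hypothetical lookup table: scaffold => { gene id => description }.
my %genes = (
    KB1 => { g1 => 'gene one', g2 => 'gene two' },
    KB2 => { g3 => 'gene three' },
);

# Hypothetical scaffolds, one per reference line that passed the filters.
my @scaffolds = qw(KB1 KB2 KB9 KB1);

my ($lines_matched, $genes_matched) = (0, 0);
for my $scaffold (@scaffolds) {
    if (my $matching_genes = $genes{$scaffold}) {
        $lines_matched++;                                   # one per matching line
        $genes_matched += scalar(keys %$matching_genes);    # one per matching gene
    }
}

print "lines: $lines_matched, genes: $genes_matched\n";
```

Be careful to write += here: $count =+ ... parses as a plain assignment of a positive value, so the total would be overwritten on every match instead of accumulated.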

How to extract the last element of a string and use it to grow an array inside a loop

I have a dataset like this:
10001;02/07/98;TRIO;PI;M^12/12/59^F^^SP^09/12/55
;;;;;M1|F1|SP1;11;10;12;10;12;11;1.82;D16S539
;;;;;M1|F1|SP1;8;8;8;8;10;8;3.45;D7S820
;;;;;M1|F1|SP1;14;12;12;11;14;11;1.57;D13S317
;;;;;M1|F1|SP1;12;12;13;12;13;8;3.27;D5S818
;;;;;M1|F1|SP1;12;12;12;12;12;8;1.51;CSF1PO
;;;;;M1|F1|SP1;8;11;11;11;11;8;1.79;TPOX
;;;;;M1|F1|SP1;6;9;9;6;8;6;1.31;TH01
I'm trying to extract the last element of every line that does not start with a number, i.e. all lines except the first one. I want to put these values inside an array called @markers.
I'm trying that by the following code:
#!usr/bin/perl
use warnings;
use strict;
open FILE, 'test' || die $!;
while (my $line = <FILE>) {
my @fields = (split /;/), $line;
if ($line !~ m/^[0-9]+/) {
my @markers = splice @fields, 0, @fields - 1;
}
}
But that does not work. Can anyone help please?
Thanks
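You create a new variable named @markers on every pass of the loop, so nothing accumulates.
my @fields = (split /;/), $line; means (my @fields = (split /;/, $_)), $line;. You meant my @fields = (split /;/, $line);
'test' || die $! is the same as just 'test': || binds more tightly than the open call, and the string 'test' is always true, so die is never reached even when open fails.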
use strict;
use warnings;
open my $FILE, '<', 'test'
    or die $!;
my @markers;
while (<$FILE>) {
    chomp;
    next if /^\s*\z/; # Skip blank lines.
    my @fields = split /;/;
    push @markers, $fields[-1]
        if $fields[0] eq '';
}
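The precedence point is easy to check in isolation. The sketch below uses a 3-argument open on a path that is assumed not to exist; the path and the dies helper are hypothetical, chosen only to make the comparison self-contained.

```perl
use strict;
use warnings;

# Hypothetical path that is assumed not to exist.
my $missing = '/no/such/dir/test';

# Return true if the given code reference dies when called.
sub dies {
    my ($code) = @_;
    return !eval { $code->(); 1 };
}

# "||" binds to the filename: open(..., ($missing || die ...)).
# $missing is a true value, so die is never evaluated.
my $pipes_died = dies(sub { open my $fh, '<', $missing || die "open failed: $!\n" });

# "or" has lower precedence than the open call, so it tests open's result.
my $or_died = dies(sub { open my $fh, '<', $missing or die "open failed: $!\n" });

print "with ||: ", ($pipes_died ? "died" : "failed silently"), "\n";
print "with or: ", ($or_died    ? "died" : "failed silently"), "\n";
```

Wrapping the arguments in parentheses, as in open(my $fh, '<', $missing) || die ..., also works, because the parentheses end the argument list before || is applied.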
You aren't using function split() correctly. I have fixed it in the code below and printed the values:
#!/usr/bin/perl
use warnings;
use strict;
open FILE, 'test' or die $!;
while (my $line = <FILE>) {
    my @fields = split( /;/, $line);
    if ($line !~ m/^[0-9]+/) {
        print "$fields[-1]";
        # my @markers = splice @fields, 0, @fields - 1;
    }
}

Using perl, how do I search a text file for _NN (at the end of a word) and print the word in front?

This gives the whole line:
#!/usr/bin/perl
$file = 'output.txt';
open(txt, $file);
while($line = <txt>) {
print "$line" if $line =~ /_NN/;
}
close(txt);
#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
binmode(STDOUT, ":utf8") || die;
my $file = "output.txt";
open(TEXT, "< :utf8", $file) || die "Can't open $file: $!";
while(<TEXT>) {
print "$1\n" while /(\w+)_NN\b/g;
}
close(TEXT) || die "Can't close $file: $!";
Your answer script reads a bit awkwardly, and has a couple of potential errors. I'd rewrite the main logic loop like so:
foreach my $line (grep { /expend_VB/ } @sentences) {
    my @nouns = grep { /_NN/ } split /\s+/, $line;
    foreach my $word (@nouns) {
        $word =~ s/_NN//;
        print "$word\n";
    }
    print "$line\n" if scalar(@nouns);
}
You need to put the my declaration inside the loop - otherwise it will persist longer than you want it to, and could conceivably cause problems later.
foreach is a more common perl idiom for iterating over a list.
print "$1" if $line =~ /(\S+)_NN/;
#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
my $search_key = "expend"; ## CHANGE "..." to <>
open(my $tag_corpus, '<', "ch13tagged.txt") or die $!;
my @sentences = <$tag_corpus>; # This breaks up each line into list
my @words;
for (my $i=0; $i <= @sentences; $i++) {
    if ( defined( $sentences[$i] ) and $sentences[$i] =~ /($search_key)_VB.*/i) {
        @words = split /\s/,$sentences[$i]; ## \s is a whitespace
        for (my $j=0; $j <= @words; $j++) {
            #FILTER if word is noun:
            if ( defined( $words[$j] ) and $words[$j] =~ /_NN/) {
                #PRINT word and sentence:
                print "**",split(/_\S+/,$words[$j]),"**", "\n";
                print split(/_\S+/,$sentences[$i]), "\n"
            }
        } ## put print sentences here to print each sentence after all the nouns inside
    }
}
close $tag_corpus || die "Can't close $tag_corpus: $!";