How can I avoid warnings in Perl? - perl

I have a small piece of code for printing the contents in a text file like this,
use strict;
use warnings;
open (FILE, "2.txt") || die "$!\n";
my $var = <FILE>;
while ($var ne "")
{
print "$var";
$var = <FILE>;
}
Text file is,
line 1
line 2
line 3
After running the code I get a warning like this:
line 1
line 2
line 3
Use of uninitialized value $var in string ne at del.pl line 10, <FILE> line 3.
How can I get rid of this warning?

The common idiom for reading from a file is this:
open my $fh, '<', $file or die $!;
while (defined(my $line = <$fh>)) {
print $line; # $line already ends with a newline
}
Although the while loop implicitly tests for whether the result of the assignment is defined, it's better to do the test explicitly for clarity.
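As a minimal sketch of that idiom, the snippet below reads from an in-memory filehandle (an assumption made here so it runs without 2.txt on disk; the sample text mirrors the question). The loop ends when the read returns undef at end of file, so no comparison is ever made against an undefined value:

```perl
use strict;
use warnings;

# In-memory "file" standing in for 2.txt (assumption for illustration only).
my $data = "line 1\nline 2\nline 3\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @lines;
while (defined(my $line = <$fh>)) {
    push @lines, $line;   # each $line still carries its trailing newline
    print $line;
}
close $fh;
```

The original warning came from comparing the undef returned at end of file with "" using ne; testing definedness avoids that comparison entirely.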

I always use:
while(<FILE>) {
print $_;
}
No such problems...

The quickest fix is probably to replace
while ($var ne "")
with
while (defined $var)


How to combine the code so that the output is on the same line?

Can you help me combine these two programs so that the output is displayed in rows with two columns? The first column is for $1 from code 1 and the second column is for $1 from code 2.
Kindly help me to solve this. Thank you :)
This is my code 1.
#!/usr/local/bin/perl
#!/usr/bin/perl
use strict ;
use warnings ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
or die "gunzip failed: $GunzipError\n";
open (FILE, '<',"$output") or die "Cannot open $output\n";
while (<FILE>) {
my $line = $_;
chomp ($line);
if ($line=~ m/^\s+Timing Path Group \'(\S+)\'/) {
$line = $1;
print ("$1\n");
}
}
close (FILE);
This is my code 2.
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
or die "gunzip failed: $GunzipError\n";
open (FILE, '<',"$output") or die "Cannot open $output\n";
while (<FILE>) {
my $line = $_;
chomp ($line);
if ($line=~ m/^\s+Levels of Logic:\s+(\S+)/) {
$line = $1;
print ("$1\n");
}
}
close (FILE);
This is my output for code 1, which contains 26 lines of data:
**async_default**
**clock_gating_default**
Ddia_link_clk
Ddib_link_clk
Ddic_link_clk
Ddid_link_clk
FEEDTHROUGH
INPUTS
Lclk
OUTPUTS
VISA_HIP_visa_tcss_2000
ckpll_npk_npkclk
clstr_fscan_scanclk_pulsegen
clstr_fscan_scanclk_pulsegen_notdiv
clstr_fscan_scanclk_wavegen
idvfreqA
idvfreqB
psf5_primclk
sb_nondet4tclk
sb_nondetl2tclk
sb_nondett2lclk
sbclk_nondet
sbclk_sa_det
stfclk_scan
tap4tclk
tapclk
The output of code 2 also has the same number of lines.
paste is useful for this: assuming your shell is bash, then using process substitutions
paste <(perl script1.pl) <(perl script2.pl)
That emits columns separated by a tab character. For prettier output, you can pipe the output of paste to column
paste <(perl script1.pl) <(perl script2.pl) | column -t -s $'\t'
And with this, you don't need to try to "merge" your perl programs.
To combine the two scripts and to output two items of data on the same line, you need to hold on until the end of the file (or until you have both data items) and then output them at once. So you need to combine both loops into one:
#!/usr/bin/perl
use strict ;
use warnings ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
or die "gunzip failed: $GunzipError\n";
open (FILE, '<',"$output") or die "Cannot open $output\n";
my( $levels, $timing );
while (<FILE>) {
my $line = $_;
chomp ($line);
if ($line=~ m/^\s+Levels of Logic:\s+(\S+)/) {
$levels = $1;
}
if ($line=~ m/^\s+Timing Path Group \'(\S+)\'/) {
$timing = $1;
}
}
print "$levels, $timing\n";
close (FILE);
You still haven't given us one vital piece of information: what does the input data look like? Most importantly, are the two pieces of information you're looking for on the same line?
[Update: Looking more closely at your regexes, I see it's not possible for both pieces of information to be on the same line, as they are both supposed to be the first item on the line. It would be helpful if you were clearer about that in your question.]
I think this will do the right thing, no matter what the answer to your question is. I've also added the improvements I suggested in my answer to your previous question:
#!/usr/bin/perl
use strict ;
use warnings ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $zipped = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $unzipped = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $zipped => $unzipped
or die "gunzip failed: $GunzipError\n";
open (my $fh, '<', $unzipped) or die "Cannot open '$unzipped': $!\n";
my ($levels, $timing);
while (<$fh>) {
chomp;
if (m/^\s+Levels of Logic:\s+(\S+)/) {
$levels = $1;
}
if (m/^\s+Timing Path Group \'(\S+)\'/) {
$timing = $1;
}
# If we have both values, then print them out and
# set the variables to 'undef' for the next iteration.
# Test with defined() so that a value of 0 still counts.
if (defined $levels and defined $timing) {
print "$levels, $timing\n";
undef $levels;
undef $timing;
}
}
close ($fh);
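Since the real report layout was never shown, here is a small self-contained sketch of the same single-loop approach run against a made-up excerpt. The sample text and the tab-separated output are assumptions; only the two regexes come from the question. It puts the path group in the first column and the levels of logic in the second:

```perl
use strict;
use warnings;

# Hypothetical report excerpt; the real .rpt layout is not shown in the question.
my $report = <<'END';
  Timing Path Group 'Lclk'
  Levels of Logic:      12
  Timing Path Group 'tapclk'
  Levels of Logic:      0
END

open my $fh, '<', \$report or die "Cannot open in-memory report: $!";

my ($timing, $levels);
my @rows;
while (my $line = <$fh>) {
    $timing = $1 if $line =~ /^\s+Timing Path Group '(\S+)'/;
    $levels = $1 if $line =~ /^\s+Levels of Logic:\s+(\S+)/;

    # Emit a row once both halves are known, then reset for the next group.
    if (defined $timing and defined $levels) {
        push @rows, "$timing\t$levels";
        ($timing, $levels) = ();
    }
}
close $fh;

print "$_\n" for @rows;
```

The second group has 0 levels of logic, which is why the check uses defined rather than plain truth.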

Add a line after every string match

I have a sample file here http://pastebin.com/m5m40nGF
What I want to do is add a line after every instance of protein_id.
protein_id always has the same pattern:
TAB-TAB-TAB-protein_id-TAB-gnl|CorradiLab|M715_#SOME_NUMBER
What I need to do is to add this after every line of protein_id:
TAB-TAB-TAB-transcript_id-TAB-gnl|CorradiLab|M715_mRNA_#SOME_NUMBER
The catch is that #SOME_NUMBER has to stay the same.
In the first case, it would look like this:
94 1476 CDS
protein_id gnl|CorradiLab|M715_ECU01_0190
transcript_id gnl|CorradiLab|M715_mRNA_ECU01_0190
product serine hydroxymethyltransferase
label serine hydroxymethyltransferase
Thanks! Adrian
I tried a perl solution, but I get an error.
open(IN, $in); while(<IN>){
print $_;
if ($_ ~= /gnl\|CorradiLab\|/) {
$_ =~ s/tprotein_id/transcript_id/;
print $_;
}
}
Error:
syntax error at test.pl line 3, near "$_ ~"
syntax error at test.pl line 7, near "}"
Execution of test.pl aborted due to compilation errors.
The following perl script worked
my $in=shift;
open(IN, $in); while(<IN>){
print $_;
if ($_ =~ /gnl\|CorradiLab\|/) {
my $tmp = $_;
$tmp =~ s/protein_id/transcript_id/;
print $tmp;
}
}
Offering an update on existing answer because I feel it can be improved further:
Generally - the precise problem in the OP is this line:
if ($_ ~= /gnl\|CorradiLab\|/) {
Because you've got ~= not =~. That's what syntax error at test.pl line 3, near "$_ ~" is trying to tell you.
I would offer that improving on:
my $in=shift;
open(IN, $in); while(<IN>){
print $_;
if ($_ =~ /gnl\|CorradiLab\|/) {
my $tmp = $_;
$tmp =~ s/protein_id/transcript_id/;
print $tmp;
}
}
while ( my $tmp = <IN> ) { skips the need to assign $_.
3 argument open with lexical filehandle is preferable. E.g. open ( my $in, "<", "$input_filename" ) or die $!; (You should test whether the open worked too)
Explicit open may well be unnecessary if you're just reading a filename from command line. Using <> either reads filenames (opening and processing) or STDIN, which means your script becomes a bit more versatile.
Thus I would rewrite as:
#!/usr/bin/perl
use strict;
use warnings;
while ( my $line = <> ) {
print $line;
if ( $line =~ /gnl\|CorradiLab\|/ ) {
$line =~ s/protein_id/transcript_id/;
print $line;
}
}
Or alternatively:
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
print;
if (m/gnl\|CorradiLab\|/) {
s/protein_id/transcript_id/;
print;
}
}
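Note that the question also asks for mRNA_ to be inserted after M715_, which a plain protein_id to transcript_id substitution does not do. A hedged sketch of one way to handle that is below; the sample lines are built from the question's description, and the exact tab layout is an assumption:

```perl
use strict;
use warnings;

# Sample lines modelled on the question's description (three tabs, key, tab, value).
my @input = (
    "94\t1476\tCDS\n",
    "\t\t\tprotein_id\tgnl|CorradiLab|M715_ECU01_0190\n",
    "\t\t\tproduct\tserine hydroxymethyltransferase\n",
);

my @output;
for my $line (@input) {
    push @output, $line;
    # Capture everything after M715_ so the identifier is reused unchanged.
    if ($line =~ /^\t\t\tprotein_id\tgnl\|CorradiLab\|M715_(\S+)/) {
        push @output, "\t\t\ttranscript_id\tgnl|CorradiLab|M715_mRNA_$1\n";
    }
}
print @output;
```

To run this against a real file, the for loop over @input could be replaced with while (my $line = <>) and the pushes with print.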

Perl: comparing words in two files

This is my current script to try and compare the words in file_all.txt to the ones in file2.txt. It should print out any of the words in file_all that are not in file2.
I need to format these as one word per line, but that's not the more pressing issue.
I am new to Perl ... I get C and Python more but this is being a bit tricky, I know my variable assignment is off.
use strict;
use warnings;
my $file2 = "file_all.txt"; %I know my assignment here is wrong
my $file1 = "file2.txt";
open my $file2, '<', 'file2' or die "Couldn't open file2: $!";
while ( my $line = <$file2> ) {
++$file2{$line};
}
open my $file1, '<', 'file1' or die "Couldn't open file1: $!";
while ( my $line = <$file1> ) {
print $line unless $file2{$line};
}
EDIT: OH, it should ignore case... like Pie is the same as PIE when comparing. and remove apostrophes
These are the errors I am getting:
"my" variable $file2 masks earlier declaration in same scope at absent.pl line 9.
"my" variable $file1 masks earlier declaration in same scope at absent.pl line 14.
Global symbol "%file2" requires explicit package name at absent.pl line 11.
Global symbol "%file2" requires explicit package name at absent.pl line 16.
Execution of absent.pl aborted due to compilation errors.
Your error messages:
"my" variable $file2 masks earlier declaration in same scope at absent.pl line 9.
"my" variable $file1 masks earlier declaration in same scope at absent.pl line 14.
Global symbol "%file2" requires explicit package name at absent.pl line 11.
Global symbol "%file2" requires explicit package name at absent.pl line 16.
Execution of absent.pl aborted due to compilation errors.
You are assigning a file name to $file2, and then later you are using open my $file2 ... The use of my $file2 in the second case masks the use in the first case. Then, in the body of the while loop, you pretend there is a hash table %file2, but you haven't declared it at all.
You should use more descriptive variable names to avoid conceptual confusion.
For example:
my @filenames = qw(file_all.txt file2.txt);
Using variables with integer suffixes is a code smell.
Then, factor common tasks to subroutines. In this case, what you need are: 1) A function that takes a filename and returns a table of words in that file, and 2) A function that takes a filename, and a lookup table, and prints words that are in the file, but do not appear in the lookup table.
#!/usr/bin/env perl
use strict;
use warnings;
use Carp qw( croak );
my @filenames = qw(file_all.txt file2.txt);
print "$_\n" for @{ words_notseen(
$filenames[0],
words_from_file($filenames[1])
)};
sub words_from_file {
my $filename = shift;
my %words;
open my $fh, '<', $filename
or croak "Cannot open '$filename': $!";
while (my $line = <$fh>) {
$words{ lc $_ } = 1 for split ' ', $line;
}
close $fh
or croak "Failed to close '$filename': $!";
return \%words;
}
sub words_notseen {
my $filename = shift;
my $lookup = shift;
my %words;
open my $fh, '<', $filename
or croak "Cannot open '$filename': $!";
while (my $line = <$fh>) {
for my $word (split ' ', $line) {
unless (exists $lookup->{lc $word}) { # lookup keys were stored lowercased
$words{ $word } = 1;
}
}
}
return [ keys %words ];
}
You are almost there.
The % sigil denotes a hash. You can't store a file name in a hash, you need a scalar for that.
my $file2 = 'file_all.txt';
my $file1 = 'file2.txt';
You need a hash to count the occurrences.
my %count;
To open a file, specify its name - it's stored in the scalar, do you remember?
open my $FH, '<', $file2 or die "Can't open $file2: $!";
Then, process the file line by line:
while (my $line = <$FH> ) {
chomp $line; # Remove newline if present.
++$count{lc $line}; # Store the lowercased string.
}
Then, open the second file, process it line by line, use lc again to get the lowercased string.
To remove apostrophes, use a substitution:
$line =~ s/'//g; # Replace ' by nothing globally (i.e. everywhere).
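Putting those steps together, a minimal sketch might look like the following. The word lists are hypothetical stand-ins for the two files, and normalise is a helper name invented here, not part of the original answer:

```perl
use strict;
use warnings;

# Lowercase a word and strip apostrophes so "Don't" and "dont" compare equal.
sub normalise {
    my ($word) = @_;
    $word = lc $word;
    $word =~ s/'//g;
    return $word;
}

# Hypothetical contents standing in for file2.txt and file_all.txt.
my @known = ("Pie\n", "don't\n");
my @all   = ("PIE\n", "Dont\n", "cake\n");

my %seen;
for my $line (@known) {
    chomp(my $word = $line);
    $seen{ normalise($word) } = 1;
}

my @missing;
for my $line (@all) {
    chomp(my $word = $line);
    push @missing, $word unless $seen{ normalise($word) };
}
print "$_\n" for @missing;
```

Normalising both sides with the same function keeps the comparison rule in one place.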
As you mentioned in your question: It should print out any of the words in file_all that are not in file2
The small piece of code below does this, assuming the two files are meant to be compared line by line:
#!/usr/bin/perl
use strict;
use warnings;
my ($file1, $file2) = qw(file_all.txt file2.txt);
open my $fh1, '<', $file1 or die "Can't open $file1: $!";
open my $fh2, '<', $file2 or die "Can't open $file2: $!";
while (<$fh1>)
{
last if eof($fh2);
my $compline = <$fh2>;
chomp($_, $compline);
if ($_ ne $compline)
{
print "$_\n";
}
}
file_all.txt:
ab
cd
ee
ef
gh
df
file2.txt:
zz
yy
ee
ef
pp
df
Output:
ab
cd
gh
The issue is the following two lines:
my %file2 = "file_all.txt";
my %file1 = "file2.txt";
Here you are assigning a single value, called a SCALAR in Perl, to a Hash (denoted by the % sigil). Hashes consist of key/value pairs, conventionally written with the fat comma operator (=>), e.g.
my %hash = ( key => 'value' );
Hashes expect an even number of arguments because they must be given both a key and a value. You currently only give each Hash a single value, thus this error is thrown.
To assign a value to a SCALAR, you use the $ sigil:
my $file2 = "file_all.txt";
my $file1 = "file2.txt";

How to search and replace using hash with Perl

I'm new to Perl and I'm afraid I am stuck and wanted to ask if someone might be able to help me.
I have a file with two columns (tab separated) of oldname and newname.
I would like to use the oldname as key and newname as value and store it as a hash.
Then I would like to open a different file (gff file) and replace all the oldnames in there with the newnames and write it to another file.
I have given it my best try but am getting a lot of errors.
If you could let me know what I am doing wrong, I would greatly appreciate it.
Here are how the two files look:
oldname newname(SFXXXX) file:
genemark-scaffold00013-abinit-gene-0.18 SF130001
augustus-scaffold00013-abinit-gene-1.24 SF130002
genemark-scaffold00013-abinit-gene-1.65 SF130003
file to search and replace in (an example of one of the lines):
scaffold00013 maker gene 258253 258759 . - . ID=maker-scaffold00013-augustus-gene-2.187;Name=maker-scaffold00013-augustus-gene-2.187;
Here is my attempt:
#!/usr/local/bin/perl
use warnings;
use strict;
my $hashfile = $ARGV[0];
my $gfffile = $ARGV[1];
my %names;
my $oldname;
my $newname;
if (!defined $hashfile) {
die "Usage: $0 hash_file gff_file\n";
}
if (!defined $gfffile) {
die "Usage: $0 hash_file gff_file\n";
}
###save hashfile with two columns, oldname and newname, into a hash with oldname as key and newname as value.
open(HFILE, $hashfile) or die "Cannot open $hashfile\n";
while (my $line = <HFILE>) {
chomp($line);
my ($oldname, $newname) = split /\t/;
$names{$oldname} = $newname;
}
close HFILE;
###open gff file and replace all oldnames with newnames from %names.
open(GFILE, $gfffile) or die "Cannot open $gfffile\n";
while (my $line2 = <GFILE>) {
chomp($line2);
eval "$line2 =~ s/$oldname/$names{oldname}/g";
open(OUT, ">SFrenamed.gff") or die "Cannot open SFrenamed.gff: $!";
print OUT "$line2\n";
close OUT;
}
close GFILE;
Thank you!
Your main problem is that you aren't splitting the $line variable. split /\t/ splits $_ by default, and you haven't put anything in there.
This program builds the hash, and then constructs a regex from all the keys by sorting them in descending order of length and joining them with the | regex alternation operator. The sorting is necessary so that the longest of all possible choices is selected if there are any alternatives.
Every occurrence of the regex is replaced by the corresponding new name in each line of the input file, and the output written to the new file.
use strict;
use warnings;
die "Usage: $0 hash_file gff_file\n" if @ARGV < 2;
my ($hashfile, $gfffile) = @ARGV;
open(my $hfile, '<', $hashfile) or die "Cannot open $hashfile: $!";
my %names;
while (my $line = <$hfile>) {
chomp($line);
my ($oldname, $newname) = split /\t/, $line;
$names{$oldname} = $newname;
}
close $hfile;
my $regex = join '|', map { quotemeta($_) } sort { length $b <=> length $a } keys %names; # escape '.' and '-' in the names
$regex = qr/$regex/;
open(my $gfile, '<', $gfffile) or die "Cannot open $gfffile: $!";
open(my $out, '>', 'SFrenamed.gff') or die "Cannot open SFrenamed.gff: $!";
while (my $line = <$gfile>) {
chomp($line);
$line =~ s/($regex)/$names{$1}/g;
print $out $line, "\n";
}
close $out;
close $gfile;
Why are you using an eval? And $oldname is going to be undefined in the second while loop, because in the first while loop you redeclare both variables in that inner scope (even if you used the outer-scope variables, they would hold only the very last values that you processed, which wouldn't be helpful).
Take out the my $oldname and my $newname at the top of your script; they are useless.
Take out the entire eval line. You need to repeat the regex for each thing you want to replace. Try something like:
$line2 =~ s/\Q$_\E/$names{$_}/g for keys %names;
Also see Borodin's answer. He made one big regex instead of a loop, and caught your lack of the second argument to split.
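To see why the old names should be escaped before they go into a pattern, here is a small sketch of the single-regex approach on in-memory data. The two mappings come from the question; the GFF attribute string is a hypothetical example that uses one of them:

```perl
use strict;
use warnings;

my %names = (
    'genemark-scaffold00013-abinit-gene-0.18' => 'SF130001',
    'augustus-scaffold00013-abinit-gene-1.24' => 'SF130002',
);

# Longest keys first, each escaped so '.' and '-' match literally.
my $alternation = join '|',
    map { quotemeta($_) }
    sort { length $b <=> length $a } keys %names;
my $regex = qr/($alternation)/;

# Hypothetical GFF attribute string using one of the old names.
my $line = 'ID=genemark-scaffold00013-abinit-gene-0.18;Name=genemark-scaffold00013-abinit-gene-0.18;';
(my $renamed = $line) =~ s/$regex/$names{$1}/g;
print "$renamed\n";
```

Without quotemeta, the "." in a name such as gene-0.18 would match any character, so the pattern could match text that is not a key in %names.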

How to read specific lines from file and store in an array using perl?

How can I read/store uncommented lines from a file into an array?
file.txt looks like below
request abcd uniquename "zxsder,azxdfgt"
request abcd uniquename1 "nbgfdcbv.bbhgfrtyujk"
request abcd uniquename2 "nbcvdferr,nscdfertrgr"
#request abcd uniquename3 "kdgetgsvs,jdgdvnhur"
#request abcd uniquename4 "hvgsfeyeuee,bccafaderryrun"
#request abcd uniquename5 "bccsfeueiew,bdvdfacxsfeyeueiei"
Now I have to read/store the uncommented lines (the first 3 lines in this file) into an array. Is it possible to do this by pattern matching on a string or with a regex? If so, how can I do it?
The code below stores all the lines in an array.
open (F, "test.txt") || die "Could not open test.txt: $!\n";
@test = <F>;
close F;
print @test;
How can I do it for only the uncommented lines?
If you know your comments will contain # at the beginning you can use
next if $_ =~ m/^#/
Or use whatever variable you have to read each line instead of $_
This matches # signs at the beginning of the line.
As far as adding the others to an array, you can use push (@arr, $_)
#!/usr/bin/perl
# Should always include these
use strict;
use warnings;
my @lines; # Hold the lines you want
open (my $file, '<', 'test.txt') or die $!; # Open the file for reading
while (my $line = <$file>)
{
next if $line =~ m/^#/; # Skip the line if it is a comment;
push (@lines, $line); # otherwise add it to the array.
}
close $file;
foreach (@lines) # Print the values that we got
{
print $_; # each line still ends with its own newline
}
You could do:
perl -lne 'push @ary,$_ unless /^#/;END{print join "\n",@ary}' file.txt
This skips any line that begins with #. Otherwise the line is added to an array for later use.
The smallest change to your original program would probably be:
open (F, "test.txt") || die "Could not open test.txt: $!\n";
@test = grep { $_ !~ /^#/ } <F>;
close F;
print @test;
But I'd highly recommend rewriting that slightly to use current best practices.
# Safety net
use strict;
use warnings;
# Lexical filehandle, three-arg open
open (my $fh, '<', 'test.txt') || die "Could not open test.txt: $!\n";
# Declare @test.
# Don't explicitly close filehandle (closed automatically as $fh goes out of scope)
my @test = grep { $_ !~ /^#/ } <$fh>;
print @test;
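If the file might also contain indented comments or blank lines (an assumption; the sample in the question has neither), the grep condition can be widened a little. A sketch using an in-memory stand-in for the file:

```perl
use strict;
use warnings;

# In-memory stand-in for file.txt; the indented comment and blank line are hypothetical.
my $data = <<'END';
request abcd uniquename "zxsder,azxdfgt"
#request abcd uniquename3 "kdgetgsvs,jdgdvnhur"
   # indented comment

request abcd uniquename1 "nbgfdcbv.bbhgfrtyujk"
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

# Keep lines that contain something other than whitespace and are not comments.
chomp(my @lines = grep { /\S/ and !/^\s*#/ } <$fh>);
close $fh;

print "$_\n" for @lines;
```

Because chomp is applied to the whole list, the array holds the lines without trailing newlines, which is often more convenient for later processing.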