Perl: Name "main::IN" used only once, but it is actually used

I am writing a short Perl script that reads in a file, tmp.txt:
1 gene_id "XLOC_000001"; gene_name "DDX11L1"; oId
1 gene_id "XLOC_000001"; gene_name "DDX11L1"; oId
1 gene_id "XLOC_000001"; gene_name "DDX11L1"; oId
1 gene_id "XLOC_000001"; gene_name "DDX11L1"; oId
My Perl program, convert.pl, is:
use warnings;
use strict;
use autodie; # die if io problem with file
my $line;
my ($xloc, $gene, $ens);
open (IN, "tmp.txt")
or die ("open 'tmp.txt' failed, $!\n");
while ($line = <IN>) {
($xloc, $gene) = ($line =~ /gene_id "([^"]+)".*gene_name "([^"]+)"/);
print("$xloc $gene\n");
}
close (IN)
or warn $! ? "ERROR 1" : "ERROR 2";
It outputs:
Name "main::IN" used only once: possible typo at ./convert.pl line 8.
XLOC_000001 DDX11L1
XLOC_000001 DDX11L1
XLOC_000001 DDX11L1
XLOC_000001 DDX11L1
I used IN, so I don't understand the Name "main::IN" used... warning. Why is it complaining?

This is mentioned in the BUGS section of the autodie documentation:
"Used only once" warnings can be generated when autodie or Fatal is used with package filehandles (eg, FILE). Scalar filehandles are strongly recommended instead.
use diagnostics; says:
Name "main::IN" used only once: possible typo at test.pl line 9 (#1)
(W once) Typographical errors often show up as unique variable names.
If you had a good reason for having a unique name, then just mention
it again somehow to suppress the message. The our declaration is also
provided for this purpose.
NOTE: This warning detects package symbols that have been used only
once. This means lexical variables will never trigger this warning.
It also means that all of the package variables $c, @c, %c, as well as
*c, &c, sub c{}, c(), and c (the filehandle or format) are considered
the same; if a program uses $c only once but also uses any of the
others it will not trigger this warning. Symbols beginning with an
underscore and symbols using special identifiers (q.v. perldata) are
exempt from this warning.
So if you use a lexical filehandle, the warning does not appear:
use warnings;
use strict;
use autodie; # die if io problem with file
use diagnostics;
my $line;
my ($xloc, $gene, $ens);
open (my $in, "<", "tmp.txt")
or die ("open 'tmp.txt' failed, $!\n");
while ($line = <$in>) {
($xloc, $gene) = ($line =~ /gene_id "([^"]+)".*gene_name "([^"]+)"/);
print("$xloc $gene\n");
}
close ($in)
or warn $! ? "ERROR 1" : "ERROR 2";

I'm pretty sure this is because of autodie.
I don't know exactly why, but if you remove it, it goes away.
If you read perldoc autodie you'll see:
BUGS
"Used only once" warnings can be generated when autodie or Fatal is used with package filehandles (eg, FILE). Scalar filehandles are strongly recommended instead.
I'd suggest that's because of how the or die is being handled, compared to autodie trying to handle it.
However I'd also suggest it would be much better style to use a 3 argument open:
open ( my $input, '<', 'tmp.txt');
And use either autodie or an explicit or die, not both. I must confess, I'm not really sure which of the two would take effect if the open did fail.
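On the question of which one takes effect: as far as I understand autodie, a failed open throws its exception from inside the call itself, so a trailing or die is never evaluated. A minimal sketch to check this, assuming a path that does not exist on the machine (the path and the variable names are made up for illustration):

```perl
use strict;
use warnings;
use autodie;

# Hypothetical path, assumed not to exist on this machine.
my $missing = '/nonexistent/dir/tmp.txt';

my $handled_by = 'nobody';
eval {
    # autodie raises its exception inside open(), before "or die" can run.
    open(my $in, '<', $missing)
        or die "manual die\n";
    1;
} or do {
    my $err = $@;
    # autodie throws an autodie::exception object; a manual die gives a plain string.
    $handled_by = ref($err) && $err->isa('autodie::exception') ? 'autodie' : 'or die';
};

print "open failure was reported by: $handled_by\n";
```

If the object check reports autodie, the explicit or die is redundant in a script that already has use autodie.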

Understanding lexical scoping of "use open ..." of Perl

use open qw( :encoding(UTF-8) :std );
The statement above should be effective only within its lexical scope and should not affect anything outside its scope. But I have observed the following.
$ cat data
€
#1
$ perl -e '
open (my $fh, "<:encoding(UTF-8)", "data");
print($_) while <$fh>;'
Wide character in print at -e line 1, <$fh> line 1.
€
The Wide character ... warning is perfect here. But
#2
$ perl
my ($fh, $row);
{
use open qw( :encoding(UTF-8) :std );
open ($fh, "<", "data");
}
$row = <$fh>;
chomp($row);
printf("%s (0x%X)", $row, ord($row));
€ (0x20AC)
This does not show the wide character warning! Here is what I think is going on:
We are using the open pragma to set the I/O streams to UTF-8, including STDOUT.
We open the file inside the same scope, so it reads the character as a multibyte character.
But we print outside the scope. The print statement should show the "Wide character" warning, but it does not. Why?
Now look at the following, a little variation
#3
my ($fh, $row);
{
use open qw( :encoding(UTF-8) :std );
}
open ($fh, "<", "data");
$row = <$fh>;
chomp($row);
printf("%s (0x%X)", $row, ord($row));
⬠(0xE2)
This time, since the open statement is outside the lexical scope, open opened the file in non-UTF-8 mode.
Does this mean use open qw( :encoding(UTF-8) :std ); statement changes the STDOUT globally but STDIN within lexical scope?
You aren't using STDIN. You're opening a file with an explicit encoding (except for your last example) and reading from that.
The use open qw(:std ...) affects the standard file handles, but you're only using standard output. When you don't use that and print UTF-8 data to standard output, you get the warning.
In your last example, you don't read the data with an explicit encoding, so when you print it to standard output, it's already corrupted.
That's the trick of encodings no matter what they are. Every part of the process has to be correct.
If you want use open to affect all file handles, you have to import it differently. There are several examples in the top of the documentation.
Unfortunately, the open qw(:std) pragma does not seem to behave as a lexical pragma, since it changes the I/O layers associated with the standard handles STDIN, STDOUT and STDERR globally. Even code earlier in the source file is affected, since the use statement takes effect at compile time. So the following
say join ":", PerlIO::get_layers(\*STDIN);
{
use open qw( :encoding(UTF-8) :std );
}
prints (on my Linux platform):
unix:perlio:encoding(utf-8-strict):utf8
whereas without the use open qw( :encoding(UTF-8) :std ) it would just print
unix:perlio.
A way to not affect the global STDOUT for example is to duplicate the handle within a lexical scope and then add IO layers to the duplicate handle within that scope:
use feature qw(say);
use strict;
use warnings;
use utf8;
my $str = "€";
say join ":", PerlIO::get_layers(\*STDOUT);
{
open ( my $out, '>&STDOUT' ) or die "Could not duplicate stdout: $!";
binmode $out, ':encoding(UTF-8)';
say $out $str;
}
say join ":", PerlIO::get_layers(\*STDOUT);
say $str;
with output:
unix:perlio
€
unix:perlio
Wide character in say at ./p.pl line 16.
€

Perl wrongly complaining about Name "main::FILE" used only once

I simplified my program to the following trivial snippet and I'm still getting the message
Name "main::FILE" used only once: possible typo...
#!/usr/bin/perl -w
use strict;
use autodie qw(open close);
foreach my $f (@ARGV) {
    local $/;
    open FILE, "<", $f;
    local $_ = <FILE>; # <--- HERE
    close FILE;
    print $_;
}
which obviously isn't true as it gets used three times. For whatever reason, only the marked occurrence counts.
I am aware of nicer ways to open a file (using a $filehandle), but it doesn't pay off for a short script, does it? So how can I get rid of the wrong warning?
According to the documentation for autodie:
BUGS
"Used only once" warnings can be generated when autodie or Fatal is used with package filehandles (eg, FILE ). Scalar filehandles are strongly recommended instead.
I get the warning on Perl 5.10.1, but not 5.16.3, so there may be something else going on as well.
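For comparison, here is a sketch of the same slurp loop with a lexical filehandle. Lexical variables are never checked by the "used only once" test, so the warning cannot be triggered this way. The sub name slurp_file is made up for illustration and is not part of the original script:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie qw(open close);

# slurp_file is a hypothetical helper name, not part of the original script.
sub slurp_file {
    my ($path) = @_;
    local $/;                  # undefined record separator: read the whole file at once
    open my $fh, '<', $path;   # autodie throws an exception on failure
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Print the contents of every file named on the command line.
print slurp_file($_) for @ARGV;
```

The extra cost over the bareword version is one my and one sigil per use, which seems small even for a short script.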

Perl script to parse a text file and match a string

I'm editing my question to add more details
The script executes the command and redirects the output to a text file.
The script then parses the text file to match the following string " Standard 1.1.1.1"
The output in the text file is :
Host Configuration
------------------
Profile Hostname
-------- ---------
standard 1.1.1.1
standard 1.1.1.2
The code works if I search for either 1.1.1.1 or standard alone. When I search for standard 1.1.1.1 together, the script below fails.
This is the error that I get: "Unable to find string: standard 172.25.44.241 at testtest.pl"
#!/usr/bin/perl
use Net::SSH::Expect;
use strict;
use warnings;
use autodie;
open (HOSTRULES, ">hostrules.txt") || die "could not open output file";
my $hos = $ssh->exec(" I typed the command here ");
print HOSTRULES ($hos);
close(HOSTRULES);
sub find_string
{
    my ($file, $string) = @_;
    open my $fh, '<', $file;
    while (<$fh>) {
        return 1 if /\Q$string/;
    }
    die "Unable to find string: $string";
}
find_string('hostrules.txt', 'standard 1.1.1.1');
Perhaps write a function:
use strict;
use warnings;
use autodie;
sub find_string {
    my ($file, $string) = @_;
    open my $fh, '<', $file;
    while (<$fh>) {
        return 1 if /\Q$string/;
    }
    die "Unable to find string: $string";
}
find_string('output.txt', 'object-cache enabled');
Or just slurp the entire file:
use strict;
use warnings;
use autodie;
my $data = do {
open my $fh, '<', 'output.txt';
local $/;
<$fh>;
};
die "Unable to find string" if $data !~ /object-cache enabled/;
You're scanning a file for a particular string. If that string is not found in that file, you want an error thrown. Sounds like a job for grep.
use strict;
use warnings;
use feature qw(say);
use autodie;
use constant {
    OUTPUT_FILE   => 'output.txt',
    NEEDED_STRING => "object-cache enabled",
};
open my $out_fh, "<", OUTPUT_FILE;
my @output_lines = <$out_fh>;
close $out_fh;
chomp @output_lines;
grep { /@{[NEEDED_STRING]}/ } @output_lines or
    die qq(ERROR! ERROR! ERROR!); # Or whatever you want
The die command will end the program and exit with a non-zero exit code. The error will be printed on STDERR.
I don't know why, but using qr(object-cache enabled), and then grep { NEEDED_STRING } didn't seem to work. Using @{[...]} allows you to interpolate constants.
Instead of constants, you might want to be able to pass in the error string and the name of the file using GetOptions.
I used the old-fashioned <...> file handling instead of IO::File, but that's because I'm an old fogy who learned Perl back in the 20th century before it was cool. You can use IO::File, which is probably better and more modern.
ADDENDUM
Any reason for slurping the entire file in memory? - Leonardo Herrera
As long as the file is reasonably sized (say 100,000 lines or so), reading the entire file into memory shouldn't be that bad. However, you could use a loop:
use strict;
use warnings;
use feature qw(say);
use autodie;
use constant {
    OUTPUT_FILE   => 'output.txt',
    NEEDED_STRING => qr(object-cache enabled),
};
open my $out_fh, "<", OUTPUT_FILE;
my $output_string_found; # Flag to see if output string is found
while ( my $line = <$out_fh> ) {
    if ( $line =~ NEEDED_STRING ) {
        $output_string_found = "Yup!";
        last; # We found the string. No more looping.
    }
}
die qq(ERROR, ERROR, ERROR) unless $output_string_found;
This will work with the constant NEEDED_STRING defined as a quoted regexp.
perl -ne '/object-cache enabled/ and $found++; END{ print "Object cache disabled\n" unless $found}' < input_file
This just reads the file a line at a time; if we find the key phrase, we increment $found. At the end, after we've read the whole file, we print the message unless we found the phrase.
If the message is insufficient, you can exit 1 unless $found instead.
I suggest this because there are two things to learn from this:
Perl provides good tools for doing basic filtering and data munging right at the command line.
Sometimes a simpler approach gets a solution out better and faster.
This absolutely isn't the perfect solution for every possible data extraction problem, but for this particular one, it's just what you need.
The -ne option flags tell Perl to set up a while loop to read all of standard input a line at a time, and to take any code following it and run it into the middle of that loop, resulting in a 'run this pattern match on each line in the file' program in a single command line.
END blocks can occur anywhere and are always run at the end of the program only, so defining it inside the while loop generated by -n is perfectly fine. When the program runs out of lines, we fall out the bottom of the while loop and run out of program, so Perl ends the program, triggering the execution of the END block to print (or not) the warning.
If the file you are searching contained a string that indicated the cache was disabled (the condition you want to catch), you could go even shorter:
perl -ne '/object-cache disabled/ and die "Object cache disabled\n"' < input_file
The program would scan the file only until it saw the indication that the cache was disabled, and would exit abnormally at that point.
First, why are you using Net::SSH::Expect? Are you executing a remote command? If not, all you need to execute a program and wait for its completion is system.
system("cmd > file.txt") == 0 or die "Couldn't execute: $?"; # system returns 0 on success
Second, it appears that what fails is your regular expression. You are searching for the literal string standard 1.1.1.1, but in your sample text the wanted string appears to contain either tabs or several spaces instead of a single space. Try passing a pattern that allows for this. Note that find_string quotes its argument with \Q, so the \Q has to be removed from the regex for \s+ to be treated as a pattern:
find_string('hostrules.txt', 'standard\s+1\.1\.1\.1'); # note '\s+' here; needs /$string/ rather than /\Q$string/
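One way to make a pattern-based search work is to pass a compiled regex rather than a string, so that nothing is quoted with \Q. A minimal sketch follows; find_pattern is a hypothetical name, and the sample data is written to a temporary file only so that the example runs on its own:

```perl
use strict;
use warnings;
use autodie;
use File::Temp qw(tempfile);

# find_pattern is a hypothetical variant of find_string that accepts a qr// object.
sub find_pattern {
    my ($file, $pattern) = @_;
    open my $fh, '<', $file;
    while (my $line = <$fh>) {
        return 1 if $line =~ $pattern;
    }
    return 0;
}

# Sample data modelled on the output shown in the question.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "Profile    Hostname\n",
                "--------   ---------\n",
                "standard   1.1.1.1\n",
                "standard   1.1.1.2\n";
close $tmp_fh;

# \s+ tolerates any run of spaces or tabs; the dots are escaped so they match literally.
my $found = find_pattern($tmp_name, qr/standard\s+1\.1\.1\.1\b/);
print $found ? "found\n" : "not found\n";
```

Returning 0 instead of dying leaves the decision about how to report a miss to the caller; the original die can be restored if a hard failure is preferred.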

Replace a number incrementally in same line using Perl

I have some XML data like this
<!--Q1: some text--><!--Q1: some text--><!--Q1: some text-->
I want to replace this query number in order like so
<!--Q1: some text--><!--Q2: some text--><!--Q3: some text-->..
I wrote this Perl script
#!/usr/bin/perl -w
$b=1;
use strict;
open(FILE, "<text.xml") || die "File not found";
my @lines = <FILE>;
close(FILE);
my @newlines;
while<> {
    $_ =~ s/<!--Q[0-9]{1,2}/<!--Q$b/g;
    $b++;
    push(@newlines,$_);
}
open(FILE, ">text.xml") || die "File not found";
print FILE @newlines;
but it only makes one replacement in each line.
My text:
<!--Q2: text-->
<!--Q3: text--><!--Q8: text-->
<!--Q10: text-->
output
<!--Q1: text-->
<!--Q2: text--><!--Q2: text-->
<!--Q3: text-->
There are many problems with your program
You must always use strict and use warnings as the first lines of your program
You should use lexical file handles (scalar variables) instead of global names
You should use the three-parameter form of open, and include the built-in variable $! in the die string if open fails
You should never use $a or $b as variable names. They don't help to document the program at all, and they are used internally by perl so you can't rely on their contents
You have read the entirety of the file into @lines, and then expect there to be more to read in your while loop. You have already reached end of file, so the loop is never entered
It is pointless to test for exactly one or two digits following <!--Q. If there is an occurrence of three or more digits then the regex will still match, but only the first two digits will be replaced
There is no reason to push the modified lines to an array and print them all later. Just print each one as you change it
Use this instead. Version 10.0 of Perl 5 is required for the \K construct in the regex. It has been around since 2007, so if you are behind with your updates then you should really get that fixed.
use strict;
use warnings;
use 5.010;
open my $in, '<', 'text.xml' or die $!;
open my $out, '>', 'newtext.xml' or die $!;
my $n = 0;
while (<$in>) {
s/<!--Q\K\d+/++$n/ge;
print $out $_;
}
output
<!--Q1: text-->
<!--Q2: text--><!--Q3: text-->
<!--Q4: text-->
Update
If you don't have version 10 of Perl 5 available (and you really should - it is six years old and a major update) then you can write the regular expression like this
s/(<!--Q)\d+/$1.++$n/ge;
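To see the substitution in isolation, here is a small sketch that runs the same counter-based replacement over an in-memory copy of the sample text rather than a file (the variable names $text and $renumbered are illustrative):

```perl
use strict;
use warnings;

# In-memory copy of the sample input from the question.
my $text = join '',
    "<!--Q2: text-->\n",
    "<!--Q3: text--><!--Q8: text-->\n",
    "<!--Q10: text-->\n";

my $n = 0;
# /e evaluates the replacement as code, so the counter advances on every match;
# /g applies it to every match, including several on one line.
(my $renumbered = $text) =~ s/(<!--Q)\d+/$1 . ++$n/ge;

print $renumbered;
```

Because the counter lives outside the substitution, numbering continues across lines instead of restarting on each one.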

Perl: Opening File

I am trying to open the file received as an argument.
When I store the argument in a global variable, open works successfully.
But if I declare the variable with my, open fails to open the file.
What is the reason?
#use strict;
use warnings;
#my $FILE=$ARGV[0]; #open Fails to open the file $FILE
$FILE=$ARGV[0]; #Works Fine with Global $FILE
open(FILE)
or
die "\n ". "Cannot Open the file specified :ERROR: $!". "\n";
Unary open works only on package (global) variables. This is documented on the manpage.
A better way to open a file for reading would be:
my $filename = $ARGV[0]; # store the 1st argument into the variable
open my $fh, '<', $filename or die $!; # open the file using lexically scoped filehandle
print <$fh>; # print file contents
P.S. always use strict and warnings while debugging your Perl scripts.
It's all in perldoc -f open:
If EXPR is omitted, the scalar variable of the same name as
the FILEHANDLE contains the filename. (Note that lexical
variables--those declared with "my"--will not work for this
purpose; so if you're using "my", specify EXPR in your call
to open.)
Note that this isn't a very good way to specify the file name. As you can see, it has a hard constraint on the variable type it's in, and either the global variable it requires or the global filehandle it opens are usually best avoided.
Using a lexical filehandle keeps its scope in control, and handles closing automatically:
open my $fh, '<', "filename" or die "string involving $!";
And if you're taking that file name from the command line, you could possibly do away with that open or any handle altogether, and use the plain <> operator to read from command-line arguments or STDIN. (see comments for more on this)
use strict;
use warnings;
my $file_name = shift @ARGV;
open(my $file, '<', $file_name) or die $!;
…
close($file);
Always use strict and warnings. If either of them complains, fix the code, do not comment out the pragmas. You can also use autodie to avoid the explicit or die after open, see autodie.
From Perl's docs for open()
If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE contains the filename. (Note that lexical variables--those declared with my--will not work for this purpose; so if you're using my, specify EXPR in your call to open.)