How to pass and read a File handle to a Perl subroutine? - perl

I want a Perl module that reads from the special file handle, <STDIN>, and passes this to a subroutine. You will understand what I mean when you see my code.
Here is how it was before:
#!/usr/bin/perl
use strict; use warnings;
use lib '/usr/local/custom_pm'
package Read_FH
sub read_file {
my ($filein) = #_;
open FILEIN, $filein or die "could not open $filein for read\n";
# reads each line of the file text one by one
while(<FILEIN>){
# do something
}
close FILEIN;
Right now the subroutine takes a file name (stored in $filein) as an argument, opens the file with a file handle, and reads each line of the file one by one using the fine handle.
Instead, I want get the file name from <STDIN>, store it inside a variable, then pass this variable into a subroutine as an argument.
From the main program:
$file = <STDIN>;
$variable = read_file($file);
The subroutine for the module is below:
#!/usr/bin/perl
use strict; use warnings;
use lib '/usr/local/custom_pm'
package Read_FH
# subroutine that parses the file
sub read_file {
my ($file)= #_;
# !!! Should I open $file here with a file handle? !!!!
# read each line of the file
while($file){
# do something
}
Does anyone know how I can do this? I appreciate any suggestions.

It is a good idea in general to use lexical filehandlers. That is a lexical variable containing the file handler instead of a bareword.
You can pass it around like any other variables. If you use read_file from File::Slurp you do not need a seperate file handler, it slurps the content into a variable.
As it is also good practice to close opened file handles as soon as possible this should be the preferred way if you realy only need to get the complete file content.
With File::Slurp:
use strict;
use warnings;
use autodie;
use File::Slurp;
sub my_slurp {
my ($fname) = #_;
my $content = read_file($fname);
print $content; # or do something else with $content
return 1;
}
my $filename = <STDIN>;
my_slurp($filename);
exit 0;
Without extra modules:
use strict;
use warnings;
use autodie;
sub my_handle {
my ($handle) = #_;
my $content = '';
## slurp mode
{
local $/;
$content = <$handle>
}
## or line wise
#while (my $line = <$handle>){
# $content .= $line;
#}
print $content; # or do something else with $content
return 1;
}
my $filename = <STDIN>;
open my $fh, '<', $filename;
my_handle($fh); # pass the handle around
close $fh;
exit 0;

I agree with #mugen kenichi, his solution is a better way to do it than building your own. It's often a good idea to use stuff the community has tested. Anyway, here are the changes you can make to your own program to make it do what you want.
#/usr/bin/perl
use strict; use warnings;
package Read_FH;
sub read_file {
my $filein = <STDIN>;
chomp $filein; # Remove the newline at the end
open my $fh, '<', $filein or die "could not open $filein for read\n";
# reads each line of the file text one by one
my $content = '';
while (<$fh>) {
# do something
$content .= $_;
}
close $fh;
return $content;
}
# This part only for illustration
package main;
print Read_FH::read_file();
If I run it, it looks like this:
simbabque#geektour:~/scratch$ cat test
this is a
testfile
with blank lines.
simbabque#geektour:~/scratch$ perl test.pl
test
this is a
testfile
with blank lines.

Related

How to open a file that has a special character in it such as $?

Seems fairly simple but with the "$" in the name causes the name to split. I tried escaping the character out but when I try to open the file I get GLOB().
my $path = 'C:\dir\name$.txt';
open my $file, '<', $path || die
print "file = $file\n";
It should open the file so I can traverse the entries.
It has nothing to do with the "$". Just follow standard file handling procedure.
use strict;
use warnings;
my $path = 'C:\dir\name$.txt';
open my $file_handle, '<', $path or die "Can't open $path: $!";
# read and print the file line by line
while (my $line = <$file_handle>) {
# the <> in scalar context gets one line from the file
print $line;
}
# reset the handle
seek $file_handle, 0, 0;
# read the whole file at once, print it
{
# enclose in a block to localize the $/
# $/ is the line separator, so when it's set to undef,
# it reads the whole file
local $/ = undef;
my $file_content = <$file_handle>;
print $file_content;
}
Consider using the CPAN modules File::Slurper or Path::Tiny which will handle the exact details of using open and readline, checking for errors, and encoding if appropriate (most text files are encoded to UTF-8).
use strict;
use warnings;
use File::Slurper 'read_text';
my $file_content = read_text $path;
use Path::Tiny 'path';
my $file_content = path($path)->slurp_utf8;
If it's a data file, use read_binary or slurp_raw.

Perl search file recursive for string and replace

in terminal with perl how can I search all php files starting recursive from current working directory for a single or multiline pattern like:
<script>var a=''; * hamoorabi.com * </script>
Read like: find all between <script>var a=''; and </script> but only if contains hamoorabi.com and replace it with an empty string (remove it).
As it´s javascript code there can be a bunch of unescaped characters inside the search string.
From a unix or cwygin prompt:
$ find . | grep .php | xargs ./xx1.pl
Where perl script xx1.pl is :
#!/usr/bin/perl
use strict;
use warnings;
undef $/;
for (#ARGV) {
open(FILE,$_);
my $content = <FILE>;
close(FILE);
my $beginning = '<script>var a=\'\'';
my $end = '</script>';
my $containing = 'hamoorabi.com';
#$content =~ s/(<script>.*?)(hamoorabi.com)(.*?<\/script>)/$1$3/sg;
#much better regex provided by ysth
$content =~ s/\Q$beginning\E(?:(?!\Q$end\E).)*\Q$containing\E.*?\Q$end\E//gs;
open(FILE,">$_");
print FILE $content;
close(FILE);
}
Use File::Find to walk a directory tree looking for files.
use strict;
use warnings;
use File::Find;
sub wanted {
# Ignore anything that isn't a file
return unless -f;
# Ignore anything without a .php extension
return unless /\.php$/;
# Your filename is in $_. Your current directory is the
# one which contains the current file.
# Do what you need to do.
open my $fh, '<', $_ or die $!;
...;
}
I hope this is also helps use further more.
use strict;
use warnings;
String replacement Regex forms
my $str = "<script>var a=''\; * hamoorabi.com * </script>";
$str=~s#<script[^>]*>((?:(?!<\/script>).)*)<\/script># my $script=$&;
$script=~s/^(.*)hamoorabi\.com(.*)$/$1$2/g;
($script);
#esg;
print $str;
Replace the content on the same file using Tie::File
my #array;
use Tie::File;
tie #array, 'Tie::File', "Input.php" || die "blabla";
my $len = join "\n", #array;
#array = split/\n/, $len;
untie #array;

My perl script isn't working, I have a feeling it's the grep command

I'm trying for search in the one file for instances of the
number and post if the other file contains those numbers
#!/usr/bin/perl
open(file, "textIds.txt"); #
#file = <file>; #file looking into
# close file; #
while(<>){
$temp = $_;
$temp =~ tr/|/\t/; #puts tab between name and id
#arrayTemp = split("\t", $temp);
#found=grep{/$arrayTemp[1]/} <file>;
if (defined $found[0]){
#if (grep{/$arrayTemp[1]/} <file>){
print $_;
}
#found=();
}
print "\n";
close file;
#the input file lines have the format of
#John|7791 154
#Smith|5432 290
#Conor|6590 897
#And in the file the format is
#5432
#7791
#6590
#23140
There are some issues in your script.
Always include use strict; and use warnings;.
This would have told you about odd things in your script in advance.
Never use barewords as filehandles as they are global identifiers. Use three-parameter-open
instead: open( my $fh, '<', 'testIds.txt');
use autodie; or check whether the opening worked.
You read and store testIds.txt into the array #file but later on (in your grep) you are
again trying to read from that file(handle) (with <file>). As #PaulL said, this will always
give undef (false) because the file was already read.
Replacing | with tabs and then splitting at tabs is not neccessary. You can split at the
tabs and pipes at the same time as well (assuming "John|7791 154" is really "John|7791\t154").
Your talking about "input file" and "in file" without exactly telling which is which.
I assume your "textIds.txt" is the one with only the numbers and the other input file is the
one read from STDIN (with the |'s in it).
With this in mind your script could be written as:
#!/usr/bin/perl
use strict;
use warnings;
# Open 'textIds.txt' and slurp it into the array #file:
open( my $fh, '<', 'textIds.txt') or die "cannot open file: $!\n";
my #file = <$fh>;
close($fh);
# iterate over STDIN and compare with lines from 'textIds.txt':
while( my $line = <>) {
# split "John|7791\t154" into ("John", "7791", "154"):
my ($name, $number1, $number2) = split(/\||\t/, $line);
# compare $number1 to each member of #file and print if found:
if ( grep( /$number1/, #file) ) {
print $line;
}
}

foreach and special variable $_ not behaving as expected

I'm learning Perl and wrote a small script to open perl files and remove the comments
# Will remove this comment
my $name = ""; # Will not remove this comment
#!/usr/bin/perl -w <- wont remove this special comment
The name of files to be edited are passed as arguments via terminal
die "You need to a give atleast one file-name as an arguement\n" unless (#ARGV);
foreach (#ARGV) {
$^I = "";
(-w && open FILE, $_) || die "Oops: $!";
/^\s*#[^!]/ || print while(<>);
close FILE;
print "Done! Please see file: $_\n";
}
Now when I ran it via Terminal:
perl removeComments file1.pl file2.pl file3.pl
I got the output:
Done! Please see file:
This script is working EXACTLY as I'm expecting but
Issue 1 : Why $_ didn't print the name of the file?
Issue 2 : Since the loop runs for 3 times, why Done! Please see file: was printed only once?
How you would write this script in as few lines as possible?
Please comment on my code as well, if you have time.
Thank you.
The while stores the lines read by the diamond operator <> into $_, so you're writing over the variable that stores the file name.
On the other hand, you open the file with open but don't actually use the handle to read; it uses the empty diamond operator instead. The empty diamond operator makes an implicit loop over files in #ARGV, removing file names as it goes, so the foreach runs only once.
To fix the second issue you could use while(<FILE>), or rewrite the loop to take advantage of the implicit loop in <> and write the entire program as:
$^I = "";
/^\s*#[^!]/ || print while(<>);
Here's a more readable approach.
#!/usr/bin/perl
# always!!
use warnings;
use strict;
use autodie;
use File::Copy;
# die with some usage message
die "usage: $0 [ files ]\n" if #ARGV < 1;
for my $filename (#ARGV) {
# create tmp file name that we are going to write to
my $new_filename = "$filename\.new";
# open $filename for reading and $new_filename for writing
open my $fh, "<", $filename;
open my $new_fh, ">", $new_filename;
# Iterate over each line in the original file: $filename,
# if our regex matches, we bail out. Otherwise we print the line to
# our temporary file.
while(my $line = <$fh>) {
next if $line =~ /^\s*#[^!]/;
print $new_fh $line;
}
close $fh;
close $new_fh;
# use File::Copy's move function to rename our files.
move($filename, "$filename\.bak");
move($new_filename, $filename);
print "Done! Please see file: $filename\n";
}
Sample output:
$ ./test.pl a.pl b.pl
Done! Please see file: a.pl
Done! Please see file: b.pl
$ cat a.pl
#!/usr/bin/perl
print "I don't do much\n"; # comments dont' belong here anyways
exit;
print "errrrrr";
$ cat a.pl.bak
#!/usr/bin/perl
# this doesn't do much
print "I don't do much\n"; # comments dont' belong here anyways
exit;
print "errrrrr";
Its not safe to use multiple loops and try to get the right $_. The while Loop is killing your $_. Try to give your files specific names inside that loop. You can do this with so:
foreach my $filename(#ARGV) {
$^I = "";
(-w && open my $FILE,'<', $filename) || die "Oops: $!";
/^\s*#[^!]/ || print while(<$FILE>);
close FILE;
print "Done! Please see file: $filename\n";
}
or that way:
foreach (#ARGV) {
my $filename = $_;
$^I = "";
(-w && open my $FILE,'<', $filename) || die "Oops: $!";
/^\s*#[^!]/ || print while(<$FILE>);
close FILE;
print "Done! Please see file: $filename\n";
}
Please never use barewords for filehandles and do use a 3-argument open.
open my $FILE, '<', $filename — good
open FILE $filename — bad
Simpler solution: Don't use $_.
When Perl was first written, it was conceived as a replacement for Awk and shell, and Perl heavily borrowed from that syntax. Perl also for readability created the special variable $_ which allowed you to use various commands without having to create variables:
while ( <INPUT> ) {
next if /foo/;
print OUTPUT;
}
The problem is that if everything is using $_, then everything will effact $_ in many unpleasant side effects.
Now, Perl is a much more sophisticated language, and has things like locally scoped variables (hint: You don't use local to create these variables -- that merely gives _package variables (aka global variables) a local value.)
Since you're learning Perl, you might as well learn Perl correctly. The problem is that there are too many books that are still based on Perl 3.x. Find a book or web page that incorporates modern practice.
In your program, $_ switches from the file name to the line in the file and back to the next file. It's what's confusing you. If you used named variables, you could distinguished between files and lines.
I've rewritten your program using more modern syntax, but your same logic:
use strict;
use warnings;
use autodie;
use feature qw(say);
if ( not $ARGV[0] ) {
die "You need to give at least one file name as an argument\n";
}
for my $file ( #ARGV ) {
# Remove suffix and copy file over
if ( $file =~ /\..+?$/ ) {
die qq(File "$file" doesn't have a suffix);
}
my ( $output_file = $file ) =~ s/\..+?$/./; #Remove suffix for output
open my $input_fh, "<", $file;
open my $output_fh, ">", $output_file;
while ( my $line = <$input_fh> ) {
print {$output_fh} $line unless /^\s*#[^!]/;
}
close $input_fh;
close $output_fh;
}
This is a bit more typing than your version of the program, but it's easier to see what's going on and maintain.

how print out the output in a new file perl

How can I print in a directory the output of the variable $newFile ? How can I use 'cp' to do that ?
After modifications, my code looks like this :
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use File::Copy 'cp';
# binmode(STDOUT, ":utf8") ;
# warn Dumper \#repertoire;
my #rep= glob('/home/test/Bureau/Perl/Test/*'); # output to copy in this dir
foreach my $file (#rep)
{
open(IN, $file) or die "Can't read file '$file' [$!]\n";
while (<IN>)
{
my ($firstCol, $secondCol) = split(/","/, $_);
$firstCol =~ s/http:\/\//_/g;
$secondCol =~ s/\(.+\)/ /ig;
my $LCsecondCol = lc($secondCol);
chomp($secondCol);
chomp($LCsecondCol);
my $newFile = "$firstCol:($secondCol|$LCsecondCol);";
$newFile =~ s/=//g;
print "$newFile\n";
}
close(IN);
}
Your program is a long way off even compiling. You should pay attention to these details
With use strict in place, as it should be, you must declare all of your variables at their point of first use. The variables #files, $file, and $newFile are undeclared so your program won't compile
glob in scalar context returns the next file name that matches the pattern, and is meant for use in a while loop. To get all of the files that match the pattern you should assign to an array, and from the commented-out warn statement it looks like your code used to be that way
You should use lexical file handles and the three-parameter form of open. Well done for checking the status of the open and putting $! in your die string
Your $file =~ ... line looks like it should be a substitution, and the parenthesis at the end should be a semicolon
You have used File::Copy but then use system to copy your files. You should avoid shelling out wherever convenient, and since File::Copy provides a cp function you should use it
Something closer to a working version of your code would look like this
use strict;
use warnings;
use File::Copy 'cp';
while (my $fileName = glob '/home/test/Bureau/Infobox/*.csv') {
my #files = do {
open my $in, '<', $fileName or die "Can't read file '$fileName' [$!]\n";
print "$fileName\n" ;
<$in>;
};
foreach my $file (#files) {
my $newFile = $file =~ s/(\x{0625}\x{0646}\b.+?)\./[[ ]]/gr;
cp $file, $newFile;
}
}