I'm trying to read from two files, and generate output in a third. I first wanted to edit the first one on the go but I didn't find a suitable method save for arrays.
My problem is that the third file (output) is empty whenever I uncomment the "_ref_param_handling" function. BUT the following is what puzzles me the most: if I run a very basic UNIX `cat` system call on the output file at the end (see code below), it works just fine. If I open the filehandle just before and close it right after editing (around my print FILEHANDLE LIST), it also works fine.
I undoubtedly am missing something here. Apart from a problem between my keyboard and my chair, what is it? A filehandle conflict? A scope problem?
Every variable is declared and has the value I want it to have.
Edit (not applicable anymore).
Using IO::File on the three files didn't change anything.
Edit 2 : New full subroutine code
My code works (except when my ref already exists, but that's because of the "append" mode, I think) but there might be some mistakes and unperlish ways of coding (sorry, Monks). I do, however, use strict and warnings!
sub _ref_edit($) {
    my $manda_def = "$dir/manda_def.list";
    my $newrefhandle;
    my $ref = $_[0];
    (my $refout = $ref) =~ s/empty//;
    my $refhandle;
    my $parname = '';
    my $parvalue = '';
    my @val;
    _printMan;
    my $flush = readline STDIN; # Wait for <enter>
    # If one or both of the ref. and the default values are missing
    if ( !( -e $manda_def && -e $ref ) ) {
        die "Cannot find $ref and/or $manda_def";
    }
    # Open needed files (ref & default)
    open( $refhandle, "<", $ref ) or die "Cannot open ref $ref : $!";
    open( $newrefhandle, ">>", $refout )
        or die "Cannot open new ref $refout : $!";
    # Read each line
    while ( my $refline = <$refhandle> ) {
        # If the line read is an editable macro
        if ( $refline =~ /^define\({{(.+)}},\s+{{.*__VALUE__.*}}\)/ ){
            $parname = $1; # $1 = parameter name captured in regexp
            # Prompt user
            $parvalue = _ref_param_handling( $parname, $manda_def );
            # Substitution in ref
            $refline =~ s/__VALUE__/$parvalue/;
            # Param not specified and no default value
            $parvalue eq '' ? $refline=~s/__COM__/#/ : $refline=~s/__COM__//;
        }
        print $newrefhandle $refline;
    }
    close $newrefhandle;
    close $refhandle;
    return $refout;
} # End ref edit
The _ref_param_handling subroutine is still:
    open( $mde, '<', $_[1] )
        or die "Cannot open mandatory/default list $_[1] : $!";
    # Read default/mandatory file list
    while (<$mde>) {
        ( $name, $manda, $default, $match, $descript ) = split( /\s+/, $_, 5 );
        next if ( $name !~ $ref_param ); # If param read differs from parname
        (SOME IF/ELSE)
    } # End while <MDE>
    close $mde;
    return $input;
}
Extract from manda_def file :
NAME          Mandatory?  Default  Match    Comm.
PORT          y           NULL     ^\d+$    Database port
PROJECT       y           NULL     \w{1,5}  Project name
SERVER        y           NULL     \w+      Server name
modemRouting  n           NULL     .+
modlib        y           bin      .+
modules       y           sms      .+
Extract from ref_file :
define({{PORT}}, {{__VALUE__}})dnl
define({{PROJECT}}, {{__VALUE__}})dnl
define({{SERVER}}, {{__VALUE__}})dnl
define({{modemRouting}}, {{__COM__{{$0}} '__VALUE__'}})dnl
define({{modlib}}, {{__COM__{{$0}} '__VALUE__'}})dnl
define({{modules}}, {{__COM__{{$0}} '__VALUE__'}})dnl
Any help appreciated.
It is unclear what initialises $refhandle, $newrefhandle and $mde. The values they already hold affect the behaviour of open, i.e. whether it closes an existing filehandle before opening a new one.
I would suggest that you start using the IO::File interface to open/write to files, as this makes the job of filehandle management much easier, and will avoid any inadvertent closes. Something like...
use IO::File;
my $refhandle = IO::File->new("< $ref") or die "open() - $!";
$refhandle->print(...);
As far as editing files in place goes, this is a common pattern I use to achieve it, making use of the -i behaviour of perl.
sub edit_file
{
    my ($filename) = @_;
    # you can re-create the one-liner above by localizing @ARGV as the list of
    # files the <> will process, and localizing $^I as the name of the backup file.
    local (@ARGV) = ($filename);
    local($^I) = '.bak';
    while (<>)
    {
        s/original string/new string/g;
    }
    continue
    {
        print;
    }
}
Try opening the second filehandle for input outside the loop and passing a reference to it to the subroutine _ref_param_handling. Use the seek function to rewind the file to the start. If your file is not too large, you can also store its content in an array and access that instead of looping over the same contents.
EDIT:
Here is a small example to support what I was trying to say above:
#!/usr/bin/perl -w
sub test
{
    my $fh_to_read = $_[0] ;
    my $fh_to_write = $_[1] ;
    while(<$fh_to_read>)
    {
        print $fh_to_write $_ ;
    }
    seek($fh_to_read,0,0) ;
}
open(FH1,"<dummy1");
open(FH2,"<dummy2");
open(FH3,">dummy3");
while(<FH2>)
{
    print FH3 "$_" ;
    test(\*FH1,\*FH3);
}
Info about perl references
From what I gather, your script wants to convert a file in the following form:
define({{VAR1}}, {{__VALUE__}})
define({{VAR2}}, {{__VALUE__}})
define({{VAR3}}, {{__VALUE__}})
define({{VAR4}}, {{__VALUE__}})
to something like this:
define({{VAR1}}, {{}})
define({{VAR2}}, {{VALUE2}})
define({{VAR3}}, {{VALUE3}})
define({{VAR4}}, {{}})
The following works. I don't know what manda_def means, and also I didn't bother to create an actual variable replacement function.
#!/usr/bin/perl
use strict;
use warnings;
sub work {
    my ($ref, $newref, $manda_def) = @_;
    # Open needed files (ref & default)
    open(my $refhandle, '<', $ref) or die "Cannot open ref $ref : $!";
    open(my $newrefhandle, '>', $newref) or die "Cannot open new ref $newref: $!";
    # Read each line
    while (my $refline = <$refhandle>) {
        # if the line read is an editable macro
        if ($refline =~ /^define\({{(.+)}},\s+{{.*__VALUE__.*}}\)/){
            my $parvalue = _ref_param_handling($1, $manda_def); # manda_def?
            # Substitution in ref
            $refline =~ s/__VALUE__/$parvalue/;
            # Param not specified and no default value
            $refline =~ s/__COM__/#/ if $parvalue eq '';
        }
        print $newrefhandle $refline;
    }
    close $newrefhandle;
    close $refhandle;
    return $newref;
}
sub _ref_param_handling {
    my %parms = (VAR2 => 'VALUE2', VAR3 => 'VALUE3');
    return $parms{$_[0]} if exists $parms{$_[0]};
    return ''; # unknown parameter: substitute an empty value
}
work('ref.txt', 'newref.txt', 'manda.txt');
Guys, I seriously consider hanging myself with my wireless mouse.
My script never failed. I just didn't run it through to the end (it's actually a very long parameter list). The output is only written once the filehandle is closed (or so I guess)...
/me *cries*
I've spent 24 hours on this...
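For anyone hitting the same symptom: output to a file is normally buffered, so a short write may not reach the disk until the handle is flushed or closed. A minimal sketch of that effect, assuming a temporary file and a helper name (bytes_on_disk) that are made up for illustration:

```perl
use strict;
use warnings;
use IO::Handle;               # gives lexical handles the autoflush method
use File::Temp qw(tempfile);

# Hypothetical helper: write one short line to a temporary file and report
# how many bytes are on disk before and after the handle is closed.
sub bytes_on_disk {
    my ($autoflush) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    $fh->autoflush(1) if $autoflush;         # flush after every print
    print {$fh} "define({{PORT}}, {{5432}})dnl\n";
    my $before_close = (stat $path)[7];      # size while still open
    close $fh or die "Cannot close $path: $!";
    my $after_close = (stat $path)[7];       # size once closed
    return ($before_close, $after_close);
}

printf "buffered:  %d bytes before close, %d after\n", bytes_on_disk(0);
printf "autoflush: %d bytes before close, %d after\n", bytes_on_disk(1);
```

With buffering left on, the file looks empty until close; with autoflush enabled, each print is visible immediately, which is handy when a script waits on STDIN between writes.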
Related
I have a log file containing many /begin CHECK ... /end CHECK blocks like the ones below:
/begin CHECK
Var_AAA
"Description AAA"
DATATYPE UBYTE
Max_Value 255.
ADDRESS 0xFF0011
/end CHECK
/begin CHECK
Var_BBB
"Description BBB"
DATATYPE UBYTE
Max_Value 255.
ADDRESS 0xFF0022
/end CHECK
...
I want to extract the variable name and its address, then write to a new file like this
Name Address
Var_AAA => 0xFF0011
Var_BBB => 0xFF0022
My idea is to use ($start, $keyword, $end) to detect each block and extract only the data after the keyword:
#!/usr/bin/perl
use strict;
use warnings;
my $input = 'input.log';
my $output = 'output.out';
my ( $start, $keyword, $end ) = ( '^\/begin CHECK\n\n', 'ADDRESS ', '\/end CHECK' );
my @block;
# open input file for reading
open( my $in, '<', $input ) or die "Cannot open file '$input' for reading: $!";
# open destination file for writing
open( my $out, '>', $output ) or die "Cannot open file '$output' for writing: $!";
print( "copying variable name and it's address from $input to $output \n" );
while ( $in ) { #For each line of input
    if ( /$start/i .. /$end/i ) { #Block matching
        push @block, $_;
    }
    if ( /$end/i ) {
        for ( @block ) {
            if ( /\s+ $keyword/ ) {
                print $out join( '', @block );
                last;
            }
        }
        @block = ();
    }
    close $in or die "Cannot close file '$input': $!";
}
close $out or die "Cannot close file '$output': $!";
But I get nothing after execution. Can anyone suggest an approach, ideally with a sample?
Most everything looks good but it's your start regex that's causing the first problem:
'^\/begin CHECK\n\n'
You are reading lines from the file but then looking for two newlines in a row. That's never going to match because a line ends with exactly one newline (unless you change $/, but that's a different topic). If you want to match the end of a line, you can use the $ (or \z) anchor:
'^\/begin CHECK$'
Here's the program I pared down. You can adjust it to do all the rest of the stuff that you need to do:
use v5.10;
use strict;
use warnings;
use Data::Dumper;
my ($start, $keyword, $end) = (qr{^/begin CHECK$}, qr(^ADDRESS ), qr(^/end CHECK));
while (<DATA>) #For each line of input
{
    state @block;
    chomp;
    if (/$start/i .. /$end/i) #Block matching
    {
        push @block, $_ unless /^\s*$/;
    }
    if( /$end/i )
    {
        print Dumper( \@block );
        @block = ();
    }
}
After that, you're not reading the data. You need to put the filehandle inside <> (the line input operator):
while ( <$in> )
The file handles will close themselves at the end of the program automatically. If you want to close them yourself that's fine but don't do that until you are done. Don't close $in until the while is finished.
Using the command prompt in Windows (the same logic applies on macOS or Unix, with the quoting adjusted for the shell), you can do:
perl -wpe "$/='/end CHECK';s/^.*?(Var_\S+).*?(ADDRESS \S+).*$/$1 => $2\n/s" "your_file.txt" > "new.txt"
First we set the record separator: $/ = "/end CHECK".
We then pick only the first Var_ and the first ADDRESS, deleting everything else in single-line mode (the /s flag, where the dot also matches the line break \n): s/^.*?(Var_\S+).*?(ADDRESS \S+).*$/$1 => $2\n/s.
We then write the results into a new file with > "new.txt".
Be sure to use -w, -p and -e: -e executes the code, -p prints each record after processing, and -w enables warnings.
If you leave out the > "new.txt" part, the result is printed to the terminal so that you can inspect it. If you include it, just open new.txt and everything will be printed there.
Here are some of the issues with your code
You have while ($in) instead of while ( <$in> ), so your program never reads from the input file
You close your input file handle inside the while read loop, so you can only ever read one record
Your $start regex pattern is '^\/begin CHECK\n\n'. Once interpolated into the match, it requires two consecutive newlines, but each line you read contains only one, so the pattern can never match
Your test if (/\s+ $keyword/) looks for multiple space characters of any sort, followed by a space, followed by ADDRESS—the contents of $keyword. There are no occurrences of ADDRESS preceded by whitespace anywhere in your data
You have also written far too much without testing anything. You should start by writing your read loop on its own and make sure that the data is coming in correctly before proceeding by adding two or three lines of code at a time between tests. Writing 90% of the functionality before testing is a very bad approach.
In future, to help you address problems like this, I would point you to the excellent resources linked on the Stack Overflow Perl tag information page
The only slightly obscure thing here is that the range operator /$start/i .. /$end/i returns a useful value; I have copied it into $status. The first time the operator matches, the result will be 1; the second time 2 etc. The last time is different because it is a string that uses engineering notation like 9E0, so it still evaluates to the correct count but you can check for the last match using /E/. I've used == 1 and /E/ to avoid pushing the begin and end lines onto @block
I don't think there's anything else overly complex here that you can't find described in the Perl language reference
use strict;
use warnings;
use autodie; # Handle bad IO status automatically
use List::Util 'max';
my ($input, $output) = qw/ input.log output.txt /;
open my $in_fh, '<', $input;
my ( @block, @vars );
while ( <$in_fh> ) {
    my $status = m{^/begin CHECK}i .. m{^/end CHECK}i;
    if ( $status =~ /E/ ) { # End line
        @block = grep /\S/, @block;
        chomp @block;
        my $var = $block[0];
        my $addr;
        for ( @block ) {
            if ( /^ADDRESS\s+(0x\w+)/ ) {
                $addr = $1;
                last;
            }
        }
        push @vars, [ $var, $addr ];
        @block = ();
    }
    elsif ( $status ) {
        push @block, $_ unless $status == 1;
    }
}
# Format and generate the output
open my $out_fh, '>', $output;
my $w = max map { length $_->[0] } @vars;
printf $out_fh "%-*s => %s\n", $w, @$_ for [qw/ Name Address / ], @vars;
close $out_fh;
output
Name => Address
Var_AAA => 0xFF0011
Var_BBB => 0xFF0022
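The return value of the range operator described above can be checked in isolation. This is only a sketch: range_states is a made-up helper name and the sample lines are invented to resemble one CHECK block.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value of the scalar range (flip-flop)
# operator for each line, stringified so the "E0" suffix is visible.
sub range_states {
    my @states;
    for my $line (@_) {
        my $status = ($line =~ m{^/begin CHECK}) .. ($line =~ m{^/end CHECK});
        push @states, "$status";
    }
    return @states;
}

my @lines = ('junk', '/begin CHECK', 'Var_AAA', 'ADDRESS 0xFF0011',
             '/end CHECK', 'junk');
print join(', ', map { "'$_'" } range_states(@lines)), "\n";
```

Lines outside a block give an empty (false) value, lines inside give a running count, and the closing line gives the count with E0 appended, which is what the /E/ test relies on.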
Update
For what it's worth, I would have written something like this. It produces the same output as above
use strict;
use warnings;
use autodie; # Handle bad IO status automatically
use List::Util 'max';
my ($input, $output) = qw/ input.log output.txt /;
my $data = do {
    open my $in_fh, '<', $input;
    local $/;
    <$in_fh>;
};
my @vars;
while ( $data =~ m{^/begin CHECK$(.+?)^/end CHECK$}gms ) {
    my $block = $1;
    next unless $block =~ m{(\w+).+?ADDRESS\s+(0x\w+)}ms;
    push @vars, [ $1, $2 ];
}
open my $out_fh, '>', $output;
my $w = max map { length $_->[0] } @vars;
printf $out_fh "%-*s => %s\n", $w, @$_ for [qw/ Name Address / ], @vars;
close $out_fh;
I have a text file which lists a service, device and a filter, here I list 3 examples only:
service1 device04 filter9
service2 device01 filter2
service2 device10 filter11
I have written a Perl script that iterates through the file and should then print device=device filter=filter to a file named after the service it belongs to. If a service appears on more than one line, the script should add all of its devices to the same file, separated by semicolons. Looking at the above example, I then need a result of:
service1.txt
device=device04 filter=filter9
service2.txt
device=device01 filter=filter2 ; device=device10 filter=filter11
Here is my code:
use strict;
use warnings qw(all);
open INPUT, "<", "file.txt" or die $!;
my @Input = <INPUT>;
foreach my $item(@Input) {
    my ($serv, $device, $filter) = split(/ /, $item);
    chomp ($serv, $device, $filter);
    push my @arr, "device==$device & filter==$filter";
    open OUTPUT, ">>", "$serv.txt" or die $!;
    print OUTPUT join(" ; ", @arr);
    close OUTPUT;
}
The problem I am having is that both service1.txt and service2.txt are created, but my results are all wrong, see my current result:
service1.txt
device==device04 filter==filter9
service2.txt
device==device04 filter==filter9 ; device==device01 filter==filter2device==device04 filter==filter9 ; device==device01 filter==filter2 ; device==device10 filter==filter11
I apologise, I know this is something stupid, but it has been a really long night and my brain cannot function properly I believe.
For each service to have its own file where data for it accumulates you need to distinguish for each line what file to print it to.
Then open a new service-file when a service without one is encountered, feasible since there aren't so many as clarified in a comment. This can be organized by a hash service => filehandle.
use warnings;
use strict;
use feature 'say';
my $file = shift @ARGV || 'data.txt';
my %handle;
open my $fh, '<', $file or die "Can't open $file: $!";
while (<$fh>) {
    my ($serv, $device, $filter) = split;
    if (exists $handle{$serv}) {
        print { $handle{$serv} } " ; device==$device & filter==$filter";
    }
    else {
        open my $fh_out, '>', "$serv.txt" or do {
            warn "Can't open $serv.txt: $!";
            next;
        };
        print $fh_out "device==$device & filter==$filter";
        $handle{$serv} = $fh_out;
    }
}
say $_ '' for values %handle; # terminate the line in each file
close $_ for values %handle;
For clarity the code prints almost the same thing in both cases, which surely can be made cleaner. This was tested only with the provided sample data and produces the desired output.
Note that when a filehandle need be evaluated we need { }. See this post, for example.
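A small sketch of that block syntax, using in-memory files so that nothing is written to disk. The %buffer hash is an assumption added for illustration and is not part of the answer above.

```perl
use strict;
use warnings;

my %handle;   # service name => filehandle
my %buffer;   # service name => in-memory "file" contents (illustration only)

for my $serv (qw(service1 service2)) {
    $buffer{$serv} = '';
    open $handle{$serv}, '>', \$buffer{$serv}
        or die "Cannot open in-memory file for $serv: $!";
}

# The handle is the result of an expression (a hash lookup), so it must be
# wrapped in a block; a bare "print $handle{service1} ..." does not parse.
print { $handle{service1} } "device==device04 & filter==filter9";
print { $handle{service2} } "device==device01 & filter==filter2";

close $_ for values %handle;
print "$_: $buffer{$_}\n" for sort keys %buffer;
```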
Comments on the original code (addressed in the code above)
Use lexical filehandles (my $fh) instead of typeglobs (FH)
Don't read the whole file at once unless there is a specific reason for that
split has nice defaults, split ' ', $_, where ' ' splits on whitespace and discards leading and trailing space as well. (And then there is no need to chomp in this case.)
Another option is to first collect data for each service, just as OP attempts, but again use a hash (service => arrayref/string with data) and print at the end. But I don't see a reason to not print as you go, since you'd need the same logic to decide when ; need be added.
Your code looks pretty perl4-ish, but that's not a problem. As MrTux has pointed out, you are confusing collection and fanning out of your data. I have refactored this to use a hash as intermediate container with the service name as keys. Please note that this will not accumulate results across multiple calls (as it uses ">" and not ">>").
use strict;
use warnings qw(all);
use File::Slurp qw/read_file/;
my @Input = read_file('file.txt', chomp => 1);
my %store = (); # Global container
# Capture
foreach my $item(@Input) {
    my ($serv, $device, $filter) = split(/ /, $item);
    push @{$store{$serv}}, "device==$device & filter==$filter";
}
# Write out for each service file
foreach my $k(keys %store) {
    open(my $OUTPUT, ">", "$k.txt") or die $!;
    print $OUTPUT join(" ; ", @{$store{$k}});
    close( $OUTPUT );
}
I am a beginner with Perl and I want to merge the content of two text files.
I have read some similar questions and answers on this forum, but I still cannot resolve my issues
The first file has the original ID and the recoded ID of each individual (in the first and fourth columns)
The second file has the recoded ID and some information on some of the individuals (in the first and second columns).
I want to create an output file with the original, recoded and information of these individuals.
This is the perl script I have created so far, which is not working.
If anyone could help it would be very much appreciated.
use warnings;
use strict;
use diagnostics;
use vars qw( @fields1 $recoded $original $IDF @fields2);
my %columns1;
open (FILE1, "<file1.txt") || die "$!\n Couldn't open file1.txt\n";
while ($_ = <FILE1>)
{
    chomp;
    @fields1=split /\s+/, $_;
    my $recoded = $fields1[0];
    my $original = $fields1[3];
    my %columns1 = (
        $recoded => $original
    );
};
open (FILE2, "<file2.txt") || die "$!\n Couldnt open file2.txt \n";
for ($_ = <FILE2>)
{
    chomp;
    @fields2=split /\s+/, $_;
    my $IDF= $fields2[0];
    my $F=$fields2[1];
    my %columns2 = (
        $F => $IDF
    );
};
close FILE1;
close FILE2;
open (FILE3, ">output.txt") ||die "output problem\n";
for (keys %columns1) {
    if (exists ($columns2{$_}){
        print FILE3 "$_ $columns1{$_}\n"
    };
}
close FILE3;
One problem is with scoping. In your first loop, you have a my in front of %columns1, which makes it local to the loop body, so it is no longer in scope once you leave the loop. The %columns1 declared outside the loop therefore never has any values set (which is what I suspect you want). For the assignment, it would be easier to write $columns1{$recoded} = $original; which assigns the value to that key of the hash.
In the second loop you need to declare %columns2 outside of the loop and possibly use the same kind of assignment.
For the third loop, in the print you just need to add $columns2{$_} to the front of the string to be printed to get the original ID printed before the recoded ID.
Scope:
The problem is with scope of the hash variables you have defined. The scope of the variable is limited to the loop inside which the variable has been defined.
In your code, %columns1 and %columns2 are used outside the loops that fill them, so they should be declared outside those loops.
Compilation error : braces not closed properly
Also, in the "if exists" part, the opening and closing parentheses do not balance.
Here is your code with the required corrections made:
use warnings;
use strict;
use diagnostics;
use vars qw( @fields1 $recoded $original $IDF @fields2);
my (%columns1, %columns2);
open (FILE1, "<file1.txt") || die "$!\n Couldn't open file1.txt\n";
while ($_ = <FILE1>)
{
    chomp;
    @fields1=split /\s+/, $_;
    my $recoded = $fields1[0];
    my $original = $fields1[3];
    # Add one entry per line instead of overwriting the whole hash
    $columns1{$recoded} = $original;
}
open (FILE2, "<file2.txt") || die "$!\n Couldnt open file2.txt \n";
while ($_ = <FILE2>) # a for loop here would read only the first line
{
    chomp;
    @fields2=split /\s+/, $_;
    my $IDF= $fields2[0];
    my $F=$fields2[1];
    # Keyed by the ID so that the exists test below can find it
    $columns2{$IDF} = $F;
}
close FILE1;
close FILE2;
open (FILE3, ">output.txt") ||die "output problem\n";
for (keys %columns1) {
    print FILE3 "$_ $columns1{$_} \n" if exists $columns2{$_};
}
close FILE3;
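The hash-based join can also be sketched on its own with lexical filehandles. The column layout below follows the question's description (original ID in column 1 and recoded ID in column 4 of the first file; recoded ID and information in columns 1 and 2 of the second), but the sample data, the merge_ids name and the output order are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for file1.txt and file2.txt
my $file1 = "orig1 x y rec1\norig2 x y rec2\norig3 x y rec3\n";
my $file2 = "rec1 0.12\nrec3 0.30\n";

# Hypothetical helper: returns "original recoded info" lines for every
# recoded ID that appears in both inputs.
sub merge_ids {
    my ($text1, $text2) = @_;

    my %original_for;                      # recoded ID => original ID
    open my $in1, '<', \$text1 or die "Cannot read first input: $!";
    while (my $line = <$in1>) {
        my @fields = split ' ', $line;
        next unless @fields >= 4;          # skip short or blank lines
        $original_for{ $fields[3] } = $fields[0];
    }
    close $in1;

    my @merged;
    open my $in2, '<', \$text2 or die "Cannot read second input: $!";
    while (my $line = <$in2>) {
        my ($recoded, $info) = split ' ', $line;
        next unless defined $info && exists $original_for{$recoded};
        push @merged, "$original_for{$recoded} $recoded $info";
    }
    close $in2;
    return @merged;
}

print "$_\n" for merge_ids($file1, $file2);
```

To run it against real files, replace the in-memory strings with ordinary three-argument opens on the file names.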
I wrote a perl script to count the occurrences of a character in a file.
So far this is what I have got,
#!/usr/bin/perl -w
use warnings;
no warnings ('uninitialized', 'substr');
my $lines_ref;
my @lines;
my $count;
sub countModule()
{
    my $file = "/test";
    open my $fh, "<",$file or die "could not open $file: $!";
    my @contents = $fh;
    my @filtered = grep (/\// ,@contents);
    return \@filtered;
}
@lines = countModule();
#@lines = $lines_ref;
$count = @lines;
print "###########\n $count \n###########\n";
My test file looks like this:
10.0.0.1/24
192.168.10.0/24
172.16.30.1/24
I am basically trying to count the number of instances of "/"
This is the output that I get:
###########
1
###########
I am getting 1 instead of 3, which is the number of occurrences.
Still learning perl, so any help will be appreciated..Thank you!!
Here are a few points about your code
You should always use strict at the top of your program, and only use no warnings for special reasons in a limited scope. There is no general reason why a working Perl program should need to disable warnings globally
Declare your variables close to their first point of use. The style of declaring everything at the top of the file is unnecessary and is a legacy of C
Never use prototypes in your code. They are available for very special purposes and shouldn't be used for the vast majority of Perl code. sub countModule() { ... } insists that countModule may never be called with any parameters and isn't necessary or useful. The definition should be just sub countModule { ... }
A big well done! for using a lexical file handle, the three-parameter form of open, and putting $! in your die string
my @contents = $fh will just set @contents to a single-element list containing just the filehandle. To read the whole file into the array you need my @contents = <$fh>
You can avoid escaping slashes in a regular expression if you use a different delimiter. To do that you need to use the m operator explicitly, like my @filtered = grep m|/|, @contents
You return an array reference but assign the returned value to an array, so @lines = countModule() sets @lines to a single-element list containing just the array reference. You should either return a list with return @filtered or dereference the return value on assignment with @lines = @{ countModule() }
If all you need to do is to print the number of lines in the file that contain a slash character then you could write something like this
use strict;
use warnings;
my $count;
sub countModule {
    my $file = '/test'; # declared so the die message compiles under strict
    open my $fh, '<', $file or die "Could not open $file: $!";
    return [ grep m|/|, <$fh> ];
}
my $lines = countModule;
$count = @$lines;
print "###########\n $count \n###########\n";
Close, but a few issues:
use strict;
use warnings;
sub countModule
{
    my $file = "/test";
    open my $fh, "<",$file or die "could not open $file: $!";
    my @contents = <$fh>; # The <> brackets are used to read from $fh.
    my @filtered = grep (/\// ,@contents);
    return @filtered; # Remove the reference.
}
my @lines = countModule();
my $count = scalar @lines; # 'scalar' is not required, but lends clarity.
print "###########\n $count \n###########\n";
Each of the changes I made to your code is annotated with a #comment explaining what was done.
Now in list context your subroutine will return the filtered lines. In scalar context it will return a count of how many lines were filtered.
You did also mention finding the occurrences of a character (despite everything in your script being line-oriented). Perhaps your counter sub would look like this:
sub file_tallies{
    my $file = '/test';
    open my $fh, '<', $file or die $!;
    my $count;
    my $lines;
    while( <$fh> ) {
        $lines++;
        $count += $_ =~ tr[\/][\/];
    }
    return ( $lines, $count );
}
my( $line_count, $slash_count ) = file_tallies();
In list context,
return \@filtered;
returns a list with one element -- a reference to the named array @filtered. Maybe you wanted to return the list itself
return @filtered;
Here's some simpler code:
sub countMatches {
    my ($file, $c) = @_; # Pass parameters
    local $/;
    undef $/; # Slurp input
    open my $fh, "<",$file or die "could not open $file: $!";
    my $s = <$fh>; # The <> brackets are used to read from $fh.
    close $fh;
    my $ptn = quotemeta($c); # So we can match strings like ".*" verbatim
    my @hits = $s =~ m/($ptn)/g;
    0 + @hits
}
print countMatches ("/test", '/') . "\n";
The code pushes Perl beyond the very basics, but not too much. Salient points:
By undeffing $/ you can read the input into one string. If you're counting occurrences of a string in a file, and not occurrences of lines that contain the string, this is usually easier to do.
m/(...)/g will find all the hits, but if you want to count strings like "." you need to quote the meta characters in them.
Store the results in an array to evaluate m// in list context.
Using the array in numeric context (0 + @hits) gives the number of items in it.
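Two other common counting idioms, shown here as a sketch on an in-memory string rather than a file. The helper names count_slashes and count_matches are made up for illustration.

```perl
use strict;
use warnings;

# Count a single literal character: tr with an empty replacement list
# leaves the string unchanged and returns the number of matches.
sub count_slashes {
    my ($text) = @_;
    return ($text =~ tr{/}{});
}

# Count any literal string: assigning the global match to an empty list
# forces list context, and the scalar assignment yields the match count.
sub count_matches {
    my ($text, $needle) = @_;
    my $count = () = $text =~ /\Q$needle\E/g;   # \Q...\E quotes metacharacters
    return $count;
}

my $sample = "10.0.0.1/24\n192.168.10.0/24\n172.16.30.1/24\n";
printf "slashes: %d, dots: %d\n",
    count_slashes($sample), count_matches($sample, '.');
```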
I am writing a comparefiles subroutine in Perl that reads a line of text from one file (f1) and then searches for it in another (f2) in the normal O(n^2) way.
sub comparefiles {
    my($f1, $f2) = @_;
    while(<f1>) {
        # reset f2 to the beginning of the file
        while(<f2>) {
        }
    }
}
sub someother {
    open (one, "<one.out");
    open (two, "<two.out");
    &comparefiles(&one, &two);
}
I have two questions
How do I pass the file handles to the subroutine? In the above code, I have used them as scalars. Is that the correct way?
How do I reset the file pointer f2 to the beginning of the file at the position marked in the comment above?
First of all, start every script with:
use strict;
use warnings;
Use lexical filehandles and the three-argument form of open, and test the result:
open my $fh1 , '<' , $filename1 or die "can't open '$filename1' for reading : $!";
Then you can pass the filehandles to the sub :
comparefiles($fh1, $fh2);
To rewind the file use the seek function (perldoc -f seek)
seek $fh, 0, 0;
If the files are small enough to fit in memory, you might consider storing the lines in a hash, which would prevent the need for O(n^2) searching.
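A minimal sketch of that hash approach, assuming exact line-for-line matching and using in-memory files as stand-ins for one.out and two.out. The find_lines helper is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical helper: for each line read from $fh1, return the number of
# the first identical line in $fh2, or -1 if there is none. The second
# file is read once into a hash, so no rewinding or nested loop is needed.
sub find_lines {
    my ($fh1, $fh2) = @_;

    my %line_number_of;                    # line text => first line number
    while (my $line = <$fh2>) {
        $line_number_of{$line} //= $.;     # keep the first occurrence only
    }

    my @result;
    while (my $line = <$fh1>) {
        push @result, $line_number_of{$line} // -1;
    }
    return @result;
}

my ($one, $two) = ("b\nx\na\n", "a\nb\nc\nb\n");
open my $fh1, '<', \$one or die "Cannot open first input: $!";
open my $fh2, '<', \$two or die "Cannot open second input: $!";
print join(' ', find_lines($fh1, $fh2)), "\n";
```

The trade-off is memory: every distinct line of the second file is held in the hash, in exchange for a single pass over each file.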
Within the framework of your existing approach, I would advise against nesting your file reading loops -- perhaps on aesthetic grounds if nothing else. Instead, put the inner loop in a subroutine.
use strict;
use warnings;
# Works for 2 or more files.
analyze_files(@ARGV);
sub analyze_files {
    my @file_names = @_;
    my @handles = map { open my $h, '<', $_ or die "Cannot open $_: $!"; $h } @_;
    my $fh = shift @handles;
    while (my $line = <$fh>) {
        my @line_numbers = map { find_in_file($_, $line) } @handles;
        print join("\t", @line_numbers, $line);
    }
}
# Takes a file handle and a line to hunt for.
# Returns line number if the line is found.
sub find_in_file {
    my ($fh, $find_this) = @_;
    seek $fh, 0, 0;
    while (my $line = <$fh>){
        return $. if $line eq $find_this;
    }
    return -1; # Not found.
}