Perl editing a file

I'm trying to open a file, search for a specific string in the file to begin my search from, and then perform a replacement on a string later in the file. For example, my file looks like:
Test Old
Hello World
Old
Data
Begin_search_here
New Data
Old Data
New Data
I want to open the file, begin my search from "Begin_search_here" and then replace the next instance of the word "Old" with "New". My code is shown below and I'm correctly finding the string, but for some reason I'm not writing in the correct location.
open(FILE, "+<$filename") || die "problem opening file";
my $search = 0;
while(my $line = <FILE>)
{
    if($line =~ m/Begin_search_here/)
    {
        $search = 1;
    }
    if($search == 1 && $line =~ m/Old/)
    {
        $line =~ s/Old/New/;
        print FILE $line;
    }
}
close FILE;

Here ya go:
local $^I = '.bak';
local @ARGV = ($filename);
local $_;
my $replaced = 0;
while (<>) {
    if (!$replaced && /Begin_search_here/ .. $replaced) {
        $replaced = s/Old/New/;
    }
    print;
}
Explanation:
Setting the $^I variable enables inplace editing, just as if you had run perl with the -i flag. The original file will be saved under the same name with the extension ".bak" appended; replace ".bak" with "" if you don't want a backup made.
@ARGV is set to the list of files to do inplace editing on; here just your single file named in the variable $filename.
$_ is localized to prevent overwriting this commonly-used variable in the event this code snippet occurs in a subroutine.
The flip-flop operator .. is used to figure out what part of the file to perform substitutions in. It will be false until the first time a line matching the pattern Begin_search_here is encountered, and then will remain true until the first time a substitution occurs (as recorded in the variable $replaced), when it will turn off.

You would probably be best served by opening the input file in read mode (open( my $fh, '<', $file ) or die ...;), and writing the modified text to a temporary output file, then copying the temporary file overtop of the input file when you're done doing your processing.
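For instance, here is a minimal sketch of that approach, tailored to the question's marker and replacement (the filename is hypothetical):
my $filename = 'data.txt';   # hypothetical input file
open my $in,  '<', $filename       or die "Can't read $filename: $!";
open my $out, '>', "$filename.tmp" or die "Can't write $filename.tmp: $!";
my $seen_marker = 0;
my $replaced    = 0;
while ( my $line = <$in> ) {
    $seen_marker = 1 if $line =~ /Begin_search_here/;
    # replace only the first "Old" seen after the marker
    $replaced = 1 if $seen_marker && !$replaced && $line =~ s/Old/New/;
    print {$out} $line;
}
close $in;
close $out;
rename "$filename.tmp", $filename or die "Can't rename: $!";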

You are misusing the random-access file mode. By the time you update $line and say print FILE $line, the "cursor" of your filehandle is already positioned at the beginning of the next line. So the original line is left unchanged and the next line is overwritten instead.
Inplace editing (see perlrun) looks like it would be well suited for this problem.
Otherwise, you need to read up on the tell function to save your file position before you read a line and seek back to that position before you rewrite the line. Oh, and the data that you write must be exactly the same size as the data you are overwriting, or you will totally fubar your file -- see this question.
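For completeness, a minimal sketch of the tell/seek approach; it only avoids the fubar case because "Old" and "New" happen to be the same length (the $filename variable follows the question's code):
open( my $fh, '+<', $filename ) or die "Can't open $filename: $!";
my $search = 0;
my $pos = tell $fh;                # remember where each line starts
while ( my $line = <$fh> ) {
    $search = 1 if $line =~ /Begin_search_here/;
    if ( $search && $line =~ s/Old/New/ ) {
        seek $fh, $pos, 0;         # jump back to the start of this line
        print {$fh} $line;         # overwrite in place (same length!)
        last;
    }
    $pos = tell $fh;
}
close $fh;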

I have done a number of edits like this, so I came up with a generic (yet stripped-down) strategy:
use strict;
use warnings;
use English qw<$INPLACE_EDIT>;
use Params::Util qw<_CODE>;
local $INPLACE_EDIT = '.bak';
local @ARGV = '/path/to/file';
my @line_actions
    = ( qr/^Begin_search_here/
      , qr/^Old Data/ => sub { s/^Old/New/ }
      );
my $match = shift @line_actions;
while ( <> ) {
    if ( $match and /$match/ ) {
        if ( _CODE( $line_actions[0] )) {
            shift( @line_actions )->( $_ );
        }
        $match = shift @line_actions;
    }
    print;
}

This works. It will, as you specified, replace only one occurrence.
#!/usr/bin/perl -pi.bak
if (not $match_state) {
    if (/Begin_search_here/) {
        $match_state = "accepting";
    }
}
elsif ($match_state eq "accepting") {
    if (s/Old/New/) {
        $match_state = "done";
    }
}

Be very careful about editing a file in place. If the data you're replacing is a different length, you wreck the file. Also, if your program fails in the middle, you end up with a destroyed file.
Your best bet is to read in each line, process the line, and write each line to a new file. This will even allow you to run your program, examine the output, and if you have an error, fix it and rerun the program. Then, once everything is okay, add in the step to move your new file to the old name.
I've been using Perl since version 3.x, and I can't think of a single time I modified a file in place.
use strict;
use warnings;
open (INPUT, '<', $oldfile) or die qq(Can't open file "$oldfile" for reading);
open (OUTPUT, '>', "$oldfile.$$") or die qq(Can't open file "$oldfile.$$" for writing);
my $startFlag = 0;
while (my $line = <INPUT>) {
    if ($line =~ /Begin_search_here/) {
        $startFlag = 1;
    }
    if ($startFlag) {
        $line =~ s/Old/New/;
    }
    print OUTPUT $line;
}
close INPUT;
close OUTPUT;
#
# Only implement these two steps once you've tested your program
#
unlink $oldfile;
rename "$oldfile.$$", $oldfile;

Related

How can I know if diamond operator moved to the next file?

I have the following code in a file perl_script.pl:
while (my $line = <>) {
    chomp $line;
    # etc.
}
I call the script with more than 1 file e.g.
perl perl_script.pl file1.txt file2.txt
Is there a way to know if the $line is started to read from file2.txt etc?
The $ARGV variable contains the name of the current file when reading from <>. You can save the name and test it on every line to see whether it changed, updating it when it does.
If it is really just about getting to a specific file, as the question seems to say, then it's easier, since you can also use @ARGV, which contains the command-line arguments, to test directly for the needed name.
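A minimal sketch of that direct test, using the file2.txt name from the question:
while (<>) {
    if ($ARGV eq 'file2.txt') {
        # now reading lines from file2.txt
    }
}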
One other option is to use eof (the form without parentheses!) to test for end of file, so you'll know that the next file is coming in the next iteration -- you'll need a flag of some sort as well.
A variation on this is to explicitly close the filehandle at the end of each file so that $. gets reset for each new file, which normally doesn't happen for <>; then $. == 1 marks the first line of a newly opened file:
use v5.10;   # for say
while (<>) {
    if ($. == 1) { say "new file: $ARGV" }
}
continue {
    close ARGV if eof;
}
A useful trick which is documented in perldoc -f eof is the } continue { close ARGV if eof } idiom on a while (<>) loop. This causes $. (input line number) to be reset between files of the ARGV iteration, meaning that it will always be 1 on the first line of a given file.
There's the eof trick, but good luck explaining that to people. I usually find that I want to do something with the old filename too.
Depending on what you want to do, you can track the filename you're
working on so you can recognize when you change to a new file. That way
you know both names at the same time:
use v5.10;
my %line_count;
my $current_file = $ARGV[0];
while( <> ) {
    if( $ARGV ne $current_file ) {
        say "Change of file from $current_file to $ARGV";
        $current_file = $ARGV;
    }
    $line_count{$ARGV}++;
}
use Data::Dumper;
say Dumper( \%line_count );
Now you see when the file changes, and you can use $ARGV:
Change of file from cache-filler.pl to common.pl
Change of file from common.pl to wc.pl
Change of file from wc.pl to wordpress_posts.pl
$VAR1 = {
    'cache-filler.pl' => 102,
    'common.pl' => 13,
    'wordpress_posts.pl' => 214,
    'wc.pl' => 15
};
Depending on what I'm doing, I might not let the diamond operator do all the work. This gives me a lot more control over what's happening and how I can respond to things:
foreach my $arg ( @ARGV ) {
    next unless open my $fh, '<', $arg;
    while( <$fh> ) {
        ...
    }
}

How do I copy a CSV file, but skip the first line?

I want to write a script that takes a CSV file, deletes its first row and creates a new output csv file.
This is my code:
use Text::CSV_XS;
use strict;
use warnings;
my $csv = Text::CSV_XS->new({sep_char => ','});
my $file = $ARGV[0];
open(my $data, '<', $file) or die "Could not open '$file'\n";
my $csvout = Text::CSV_XS->new({binary => 1, eol => $/});
open my $OUTPUT, '>', 'file.csv' or die "Can't open file.csv\n";
my $tmp = 0;
while (my $line = <$data>) {
    # if ($tmp == 0)
    # {
    #     $tmp = 1;
    #     next;
    # }
    chomp $line;
    if ($csv->parse($line)) {
        my @fields = $csv->fields();
        $csvout->print($OUTPUT, \@fields);
    } else {
        warn "Line could not be parsed: $line\n";
    }
}
On the perl command line I write: c:\test.pl csv.csv and it doesn't create the file.csv output, but when I double click the script it creates a blank CSV file. What am I doing wrong?
Your program isn't ideally written, but I can't tell why it doesn't work if you pass the CSV file on the command line as you have described. Do you get the errors Could not open 'csv.csv' or Can't able to open file.csv? If not then the file must be created in your current directory. Perhaps you are looking in the wrong place?
If all you need to do is to drop the first line then there is no need to use a module to process the CSV data - you can handle it as a simple text file.
If the file is specified on the command line, as in c:\test.pl csv.csv, you can read from it without explicitly opening it using the <> operator.
This program reads the lines from the input file and prints them to the output only if the line counter (the $. variable) isn't equal to one.
use strict;
use warnings;
open my $out, '>', 'file.csv' or die $!;
while (my $line = <>) {
    print $out $line unless $. == 1;
}
Hmm, you don't need any modules for this task, since CSV (comma-separated value) files are simply text files: just open the file and iterate over its lines, writing all of them to the output except a particular one (e.g. the first). Such a task (skipping the first line) is so simple that it would probably be better done with a command-line one-liner than a dedicated script.
A quick search turns up numerous tutorials about Perl input/output operations; see e.g. this link for an example:
http://learn.perl.org/examples/read_write_file.html
PS. Perl scripts (programs) usually are not compiled into a binary file; they are of course compiled, but on the fly, which is why /usr/bin/perl is called an "interpreter" rather than a "compiler" like gcc or g++. I guess what you're looking for is an editor with syntax highlighting and other development goodies; you could try Eclipse with the Perl plugin for that (cross-platform).
http://www.eclipse.org/downloads/
http://www.epic-ide.org/download.php/
This one-liner:
user@localhost:~$ cat blabla.csv | perl -ne 'print $_ if $x++;'
skips the first line (it prints only when $x, incremented after each test, is greater than zero).
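An equivalent one-liner using the line-number variable $. (the same variable used in the answer above):
perl -ne 'print if $. > 1' blabla.csv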
You are missing your first (and only) argument due to Windows.
I think this question will help you: @ARGV is empty using ActivePerl in Windows 7

Read the last line of file with data in Perl

I have a text file to parse in Perl. I parse it from the start of the file and get the data that is needed.
After all that is done I want to read the last line in the file with data. The problem is that the last two lines are blank. So how do I get the last line that holds any data?
If the file is relatively short, just read on from where you finished getting the data, keeping the last non-blank line:
use autodie ':io';
open(my $fh, '<', 'file_to_read.txt');
# get the data that is needed, then:
my $last_non_blank_line;
while (my $line = readline $fh) {
    # choose one of the following two lines, depending what you meant
    if ( $line =~ /\S/ )   { $last_non_blank_line = $line }  # line isn't all whitespace
    # if ( $line !~ /^$/ ) { $last_non_blank_line = $line }  # line has no characters before the newline
}
If the file is longer, or you may have passed the last non-blank line in your initial data gathering step, reopen it and read from the end:
use File::ReadBackwards;
my $backwards = File::ReadBackwards->new( 'file_to_read.txt' );
my $last_non_blank_line;
do {
    $last_non_blank_line = $backwards->readline;
} until ! defined $last_non_blank_line || $last_non_blank_line =~ /\S/;
perl -e 'while (<>) { $last = $_ if /\S/ } print $last;' < my_file.txt
You can use the module File::ReadBackwards in the following way:
use File::ReadBackwards;
my $bw = File::ReadBackwards->new('filepath') or
    die "can't read file";
while( defined( my $log_line = $bw->readline ) ) {
    print $log_line;
    exit 0;
}
If the trailing lines are blank, just skip them by checking $log_line for something besides the newline (e.g. next unless $log_line =~ /\S/) before printing.
If the file is small, I would store it in an array and read from the end. If it's large, use the File::ReadBackwards module.
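A minimal sketch of the array approach, assuming the whole file fits comfortably in memory (the filename is reused from the answer above):
open my $fh, '<', 'file_to_read.txt' or die $!;
my @lines = <$fh>;   # slurp all lines into memory
close $fh;
# the last line containing a non-whitespace character
my ($last_line) = grep { /\S/ } reverse @lines;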
Here's my variant of a command-line Perl solution:
perl -ne 'END {print $last} $last= $_ if /\S/' file.txt
No one mentioned Path::Tiny. If the file size is relatively small you can do this:
use Path::Tiny;
my $file = path($file_name);
my ($last_line) = $file->lines({count => -1});
CPAN page.
Just remember that for a large file, as @ysth said, it's better to use File::ReadBackwards. The difference can be substantial.
Sometimes it is more comfortable for me to run shell commands from Perl code, so I'd prefer the following to resolve the case (plain tail -n 1 would return the trailing blank line, so blank lines are filtered out first):
$result = `grep . /path/file | tail -n 1`;

Perl nested loops

I have a text file with the following contents:
NW1 SN1 DEV1
NW2 SN1 DEV2
I wrote a Perl script to iterate over the file, but it is running only once. The code is:
open(INPUT1, "input.txt");
@input_array = <INPUT1>;
for($i = 0; $i < @input_array; $i++)
{
    my ($ser, $node, @dev) = split(/ +/, $input_array[$i]);
    for($x = 0; $x < @dev; $x++)
    {
        print("Hi");
    }
}
The script iterates over the first line but not over the second.
The code you posted could be improved, and brought up to more modern standards.
It uses a bareword filehandle INPUT1.
It doesn't use 3-arg open.
It doesn't use strict or warnings (see this question).
It doesn't check the return value of open or close. (That's what the autodie line is for in the following code.)
It uses C-style for loops when it doesn't need to.
It loads the entire file into memory even though it only deals with the file one line at a time.
use strict;
use warnings;
use autodie; # checks return value of open and close for us
# 3-arg open
open( my $in_fh, '<', 'input.txt' );
# don't read the file into memory until needed
while( <$in_fh> ){
    # using $_ simplified this line
    my ($ser, $node, @dev) = split;
    # no need to track the indices, just loop over the array
    for my $dev (@dev){
        print "Hi\n";
    }
}
close $in_fh;
If for some reason you really did need the indices of the #dev array it would be better to write it like this:
for my $x ( 0..$#dev ){
    ...
}
If you want to explicitly store the line into a variable of a different name, you would change the while loop to this:
while( my $line = <$in_fh> ){
    my ($ser, $node, @dev) = split / +/, $line;
    ...
}
You never enabled slurp mode for the <> operator. To suck in the complete file at once you should do this:
open(INPUT1, "input.txt");
undef $/;
@input_array = split(/\r?\n/, <INPUT1>);
close INPUT1;
or better yet like this:
open(INPUT1, "input.txt");
while(<INPUT1>) {
    chomp;
    my ($ser, $node, @dev) = split(/ +/, $_);
    for($x = 0; $x < @dev; $x++)
    {
        print("Hi");
    }
}
close INPUT1;

Need Perl inplace editing of files not on command line

I have a program that has a number of filenames configured internally. The program edits a bunch of configuration files associated with a database account, and then changes the database password for the database account.
The list of configuration files is associated with the name of the database account via an internal list. When I process these files, I have the following loop in my program:
BEGIN { $^I = '.oldPW'; }   # Enable in-place editing
...
foreach (@{$Services{$request}{'files'}})
{
    my $filename = $Services{$request}{'configDir'} . '/' . $_;
    print "Processing ${filename}\n";
    open CONFIGFILE, '+<', $filename or warn $!;
    while (<CONFIGFILE>)
    {
        s/$oldPass/$newPass/;
        print;
    }
    close CONFIGFILE;
}
The problem is, this writes the modified output to STDOUT, not CONFIGFILE. How do I get this to actually edit in place? Move the $^I inside the loop? Print CONFIGFILE? I'm stumped.
Update: I found what I was looking for on PerlMonks. You can use a local ARGV inside the loop to do inplace editing in the normal Perl way. The above loop now looks like:
foreach (@{$Services{$request}{'files'}})
{
    my $filename = $Services{$request}{'configDir'} . '/' . $_;
    print "Processing ${filename}\n";
    {
        local @ARGV = ($filename);
        while (<>)
        {
            s/$oldPass/$newPass/;
            print;
        }
    }
}
If it weren't for tacking the configDir on the beginning, I could just toss the whole list into the local @ARGV, but this is efficient enough.
Thanks for the helpful suggestions on Tie::File. I'd probably go that way if doing this over. The configuration files I'm editing are never more than a few KB in length, so a Tie wouldn't use too much memory.
Recent versions of File::Slurp provide the convenient functions edit_file and edit_file_lines. The inner part of your code would look like:
use File::Slurp qw(edit_file);
edit_file { s/$oldPass/$newPass/g } $filename;
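edit_file_lines applies the same kind of block to one line at a time in $_, which maps more directly onto your original line-by-line loop:
use File::Slurp qw(edit_file_lines);
edit_file_lines { s/$oldPass/$newPass/ } $filename;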
The $^I variable only operates on the sequence of filenames held in @ARGV when you use the empty <> construction. Maybe something like this would work:
BEGIN { $^I = '.oldPW'; }   # Enable in-place editing
...
local @ARGV = map {
    $Services{$request}{'configDir'} . '/' . $_
} @{$Services{$request}{'files'}};
while (<>) {
    s/$oldPass/$newPass/;
    print;   # with $^I set, <> selects ARGVOUT as the default output handle
}
but if it's not a simple script and you need @ARGV and STDOUT for other purposes, you're probably better off using something like Tie::File for this task:
use Tie::File;
foreach (@{$Services{$request}{'files'}})
{
    my $filename = $Services{$request}{'configDir'} . '/' . $_;
    # make the backup yourself
    system("cp $filename $filename.oldPW");   # also consider File::Copy
    my @array;
    tie @array, 'Tie::File', $filename;
    # now edit @array
    s/$oldPass/$newPass/ for @array;
    # untie to trigger rewriting the file
    untie @array;
}
Tie::File has already been mentioned, and is very simple. Avoiding the -i switch is probably a good idea for non-command-line scripts. If you're looking to avoid Tie::File, the standard solution is this:
1. Open a file for input.
2. Open a temp file for output.
3. Read a line from the input file.
4. Modify the line in whatever way you like.
5. Write the new line out to your temp file.
6. Loop to the next line, etc.
7. Close the input and output files.
8. Rename the input file to some backup name, such as appending .bak to the filename.
9. Rename the temporary output file to the original input filename.
This is essentially what goes on behind the scenes with the -i.bak switch anyway, but with added flexibility.
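A minimal sketch of those steps, using the $oldPass/$newPass substitution from this question (the filename is hypothetical):
my $filename = 'config.txt';   # hypothetical input file
open my $in,  '<', $filename       or die "Can't read $filename: $!";
open my $out, '>', "$filename.tmp" or die "Can't write $filename.tmp: $!";
while ( my $line = <$in> ) {
    $line =~ s/$oldPass/$newPass/;   # modify the line however you like
    print {$out} $line;
}
close $in;
close $out;
# keep a backup, then move the new file into place
rename $filename, "$filename.bak"  or die "Can't back up $filename: $!";
rename "$filename.tmp", $filename  or die "Can't replace $filename: $!";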