Script to remove first line from all the text files in a directory - perl

I'm trying to write a Perl script that reads all the text files in a directory and writes every line except the first to a separate file. If there are 3 files, I want the script to read all 3 and write 3 new files with the same lines minus the first. This is what I wrote. When I run the script, it executes with no errors but doesn't do the work it is supposed to. Can someone please look into it?
opendir (DIR, "dir\\") or die "$!";
my @files = grep {/*?\.txt/} readdir DIR;
close DIR;
my $count=0;
my $lc;
foreach my $file (@files) {
$count++;
open(FH,"dir\\$file") or die "$!";
$str="dir\\example_".$count.".txt";
open(FH2,">$str");
$lc=0;
while($line = <FH>){
if($lc!=0){
print FH2 $line;
}
$lc++;
}
close(FH);
close(FH2);
}
The second file doesn't exist yet; the script is supposed to create it.

Try changing these lines
opendir (DIR, "dir\\") or die "$!";
...
close DIR;
to
opendir (DIR, "dir") or die "$!";
...
closedir DIR;
I tried running your code locally and the only two issues I had were with the directory name containing the trailing slash and trying to use the filehandle close() function on a dirhandle.

If you have the list of files ...
foreach my $file ( @files ) {
open my $infile , '<' , "dir/$file" or die "$!" ;
open my $outfile , '>' , "dir/example_" . ++${counter} . '.txt' or die "$!" ;
<$infile>; # Skip first line.
while( <$infile> ) {
print $outfile $_ ;
}
}
The lexical filehandles will be closed automatically when going out of scope.

Not sure why you're using $count here, as that's going to just turn a list of files like:
01.txt
bob.txt
alice.txt
02.txt
into:
01_1.txt
bob_2.txt
alice_3.txt
02_4.txt
Keep in mind, @files isn't being sorted, so the names come back in whatever order the directory table stores them. If you were to delete and re-create the file 01.txt, it would move to the end of the list, re-ordering the whole set:
bob_1.txt
alice_2.txt
02_3.txt
01_4.txt
Since that wasn't really part of your original question, this does exactly what you asked to do:
#!/usr/bin/perl
while(<*.txt>) { # for every file in the *.txt glob from the current directory
open(IN, $_) or die ("Cannot open $_: $!"); # open file for reading
my @in = <IN>; # read the contents into an array
close(IN); # close the file handle
shift @in; # remove the first element from the array
open(OUT, ">$_.new") or die ("Cannot open $_.new: $!"); # open file for writing
print OUT @in; # write the contents of the array to the file
close(OUT); # close the file handle
}
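Pulling the suggestions above together, here is a minimal sketch that streams each file instead of loading it into an array and sorts the glob result so the order is predictable. The sub name strip_first_lines and the ".new" suffix are only illustrative choices, and it assumes the directory path contains no whitespace (glob splits its pattern on spaces).

```perl
use strict;
use warnings;

# Hypothetical helper: for every *.txt file in $dir, write "<name>.new"
# containing all lines except the first. Returns the list of files written.
sub strip_first_lines {
    my ($dir) = @_;
    my @written;
    for my $path ( sort glob("$dir/*.txt") ) {
        open my $in,  '<', $path       or die "Cannot open $path: $!";
        open my $out, '>', "$path.new" or die "Cannot open $path.new: $!";
        <$in>;                          # read and discard the first line
        print {$out} $_ while <$in>;    # copy the remaining lines unchanged
        close $out or die "Cannot close $path.new: $!";
        close $in;
        push @written, "$path.new";
    }
    return @written;
}
```

Calling strip_first_lines("dir") would leave the originals untouched and put the shortened copies next to them.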

Related

Search a word in file and replace in Perl

I want to replace the word "a" with "red" in a.txt. I want to edit the same file, so I tried this code, but it does not work. Where am I going wrong?
@files=glob("a.txt");
foreach my $file (@files)
{
open(IN,$file) or die $!;
<IN>;
while(<IN>)
{
$_=~s/a/red/g;
print IN $file;
}
close(IN)
}
I'd suggest it's probably easier to use perl in sed mode:
perl -i.bak -p -e 's/a/red/g' *.txt
-i is in-place edit. -i.bak saves the old file with a .bak extension; -i without an extension doesn't create a backup, which is often not a good idea.
-p creates a loop that iterates over all the files specified, one line at a time ($_), applying whatever code is specified by -e before printing that line. In this case, s/// applies a sed-style pattern replacement to $_, so this runs a search and replace over every .txt file.
Perl uses <ARGV> or <> to do some magic: it checks whether you specified files on your command line. If you did, it opens them and iterates over them. If you didn't, it reads from STDIN.
So you can also filter a pipe (no -i here, since there is no file to edit in place):
somecommand.sh | perl -p -e 's/a/red/g'
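For anyone who wants the same behaviour inside a script rather than a one-liner, here is a rough sketch of what -i.bak -p amounts to, using the special variable $^I together with @ARGV. The sub name replace_in_place and its arguments are made up for illustration, and the replacement is treated as a plain string.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every match of $pattern with $replacement in the
# given files, keeping a ".bak" copy of each original, like `perl -i.bak -p`.
sub replace_in_place {
    my ( $pattern, $replacement, @files ) = @_;
    return unless @files;    # with no files, <> would read STDIN instead
    local ( $^I, @ARGV ) = ( '.bak', @files );
    while (<>) {
        s/$pattern/$replacement/g;
        print;               # with $^I set, print goes to the rewritten file
    }
    return;
}
```

Something like replace_in_place('\ba\b', 'red', glob '*.txt') would then mirror the command line above, with word boundaries added so that only the standalone word is touched.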
In your code you are writing to the same filehandle that you opened for reading. Open the file again in write mode and then write to it.
Always use lexical filehandles and the three-argument form of open. Here is your modified code:
use warnings;
use strict;
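my @files = glob("a.txt");
foreach my $file (@files)
{
my @data; # collected per file, so lines from one file never leak into the next
open my $fhin, "<", $file or die $!;
<$fhin>; # discards the first line, as in the question; remove this to keep it
while(<$fhin>)
{
$_ =~ s/\ba\b/red/g;
push @data, $_;
}
close $fhin;
open my $fhw, ">", $file or die "Couldn't modify file: $!";
print $fhw @data;
close $fhw;
}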
Here is another way (read whole file in a scalar):
foreach my $file (glob "/path/to/dir/a.txt")
{
#read whole file in a scalar
my $data = do {
local $/ = undef;
open my $fh, "<", $file or die $!;
<$fh>;
};
$data =~ s/\ba\b/red/g; #replace a with red,
#modify the file
open my $fhw, ">", $file or die "Couldn't modify file: $!";
print $fhw $data;
close $fhw;
}

Search string with multiple words in the pattern

My program is trying to search a string from multiple files in a directory. The code searches for single patterns like perl but fails to search a long string like Status Code 1.
Can you please let me know how to search for strings with multiple words?
#!/usr/bin/perl
my @list = `find /home/ad -type f -mtime -1`;
# printf("Lsit is $list[1]\n");
foreach (@list) {
# print("Now is : $_");
open(FILE, $_);
$_ = <FILE>;
close(FILE);
unless ($_ =~ /perl/) { # works, but fails to find string "Status Code 1"
print "found\n";
my $filename = 'report.txt';
open(my $fh, '>>', $filename) or die "Could not open file '$filename' $!";
say $fh "My first report generated by perl";
close $fh;
} # end unless
} # end For
There are a number of problems with your code:
You must always use strict and use warnings at the top of every Perl program. There is little point in declaring anything with my without strict in place
The lines returned by the find command will have a newline at the end which must be removed before Perl can find the files
You should use lexical file handles (my $fh instead of FILE) and the three-parameter form of open as you do with your output file
$_ = <FILE> reads only the first line of the file into $_
unless ($_ =~ /perl/) is inverted logic, and there's no need to specify $_ as it is the default. You should write if ( /perl/ )
You can't use say unless you have use feature 'say' at the top of your program (or use 5.010, which adds all features available in Perl v5.10)
It is also best to avoid using shell commands as Perl is more than able to do anything that you can using command line utilities. In this case -f $file is a test that returns true if the file is a plain file, and -M $file returns the (floating point) number of days since the file's modification time
This is how I would write your program
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
for my $file ( glob '/home/ad/*' ) {
next unless -f $file and -M $file < 1; # modified less than a day ago, like find -mtime -1
open my $fh, '<', $file or die $!;
while ( <$fh> ) {
if ( /perl/ ) {
print "found\n";
my $filename = 'report.txt';
open my $out_fh, '>>', $filename or die "Could not open file '$filename': $!";
say $out_fh "My first report generated by perl";
close $out_fh;
last;
}
}
}
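As a small aside on the -f and -M file tests mentioned above, this is one way the file selection could be pulled out on its own. It is only a sketch: the sub name recent_plain_files is invented here, and unlike find it looks only at the top level of the directory and skips dotfiles.

```perl
use strict;
use warnings;

# Hypothetical helper: list the plain files directly inside $dir that were
# modified less than one day ago (no recursion into subdirectories).
sub recent_plain_files {
    my ($dir) = @_;
    return grep { -f $_ && -M $_ < 1 } glob("$dir/*");
}
```

Note that -M measures age relative to the time the script started, so files created while the script runs get a negative value and still count as recent.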
It should have matched, unless $_ contains the text in a different case.
Try this:
unless($_ =~ /Status\s+Code\s+1/i) {
Change
unless ($_ =~ /perl/) {
to:
unless ($_ =~ /(Status Code 1)/) {
I am certain the above works, except it's case sensitive.
Since you question it, I rewrote your script to make more sense of what you're trying to accomplish and to implement the above suggestion. Correct me if I am wrong, but you're trying to write a script that looks for "Status Code 1" in files last modified within the past day and prints each matching filename to a text file.
Anyways, below is what I recommend:
#!/usr/bin/perl
use strict;
use warnings;
my $output_file = 'report.txt';
my @list = `find /home/ad -type f -mtime -1`;
foreach my $filename (@list) {
print "PROCESSING: $filename";
open (INCOMING, "<$filename") || die "FATAL: Could not open '$filename' $!";
foreach my $line (<INCOMING>) {
if ($line =~ /(Status Code 1)/) {
open( FILE, ">>$output_file") or die "FATAL: Could not open '$output_file' $!";
print FILE sprintf ("%s\n", $filename);
close(FILE) || die "FATAL: Could not CLOSE '$output_file' $!";
# Bail when we get the first match
last;
}
}
close(INCOMING) || die "FATAL: Could not close '$filename' $!";
}
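The multi-word match discussed in this thread can also be built from the phrase itself, so that case and the amount of whitespace between words don't matter. The following is just a sketch under that assumption; file_contains_phrase is a made-up name, and a phrase such as "Status Code 1" would also match inside "Status Code 10".

```perl
use strict;
use warnings;

# Hypothetical helper: return true if any line of $path contains $phrase,
# ignoring case and allowing any run of whitespace between the words.
sub file_contains_phrase {
    my ( $path, $phrase ) = @_;
    # Escape each word, then join the words with \s+ to tolerate extra spaces.
    my $pattern = join '\s+', map { quotemeta($_) } split ' ', $phrase;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while ( my $line = <$fh> ) {
        return 1 if $line =~ /$pattern/i;    # stop at the first match
    }
    return 0;
}
```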

Perl - search and replace across multiple lines across multiple files in specified directory

At the moment this code replaces all occurrences of my matching string with my replacement string, but only for the file I specify on the command line. Is there a way to change this so that all .txt files, for example, in the same directory (the directory I specify) are processed without having to run this hundreds of times on individual files?
#!/usr/bin/perl
use warnings;
my $filename = $ARGV[0];
open(INFILE, "<", $filename) or die "Cannot open $ARGV[0]";
my(@fcont) = <INFILE>;
close INFILE;
open(FOUT,">$filename") || die("Cannot Open File");
foreach $line (@fcont) {
$line =~ s/\<br\/\>\n([[:space:]][[:space:]][[:space:]][[:space:]][A-Z])/\n$1/gm;
print FOUT $line;
}
close INFILE;
I have also tried this:
perl -p0007i -e 's/\<br\/\>\n([[:space:]][[:space:]][[:space:]][[:space:]][A-Z])/\n$1/m' *.txt
But I have noticed that it only changes the first occurrence of the matched pattern and ignores all the rest in the file.
I also have tried this, but it doesn't work in the sense that it just creates a blank file:
use v5.14;
use strict;
use warnings;
use DBI;
my $source_dir = "C:/Testing2";
# Store the handle in a variable.
opendir my $dirh, $source_dir or die "Unable to open directory: $!";
my @files = grep /\.txt$/i, readdir $dirh;
closedir $dirh;
# Stop script if there aren't any files in the list
die "No files found in $source_dir" unless @files;
foreach my $file (@files) {
say "Processing $source_dir/$file";
open my $in, '<', "$source_dir/$file" or die "Unable to open $source_dir/$file: $!\n";
open(FOUT,">$source_dir/$file") || die("Cannot Open File");
foreach my $line (@files) {
$line =~ s/\<br\/\>\n([[:space:]][[:space:]][[:space:]][[:space:]][A-Z])/\n$1/gm;
print FOUT $line;
}
close $in;
}
say "Status: Processing of complete";
Just wondering what am I missing from my code above? Thanks.
You could try the following:
opendir(DIR,"your_directory");
my @all_files = readdir(DIR);
closedir(DIR);
for (@all_files) .....
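To flesh that loop out a little, here is one possible sketch. It assumes the files are small enough to read whole, which a pattern spanning a newline needs anyway, and it reads each file completely before reopening it for writing, since opening with '>' truncates the file immediately. The sub name fix_br_in_dir is invented; the pattern is the one from the question with the four [[:space:]] classes written as {4}.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the question's multi-line substitution to every
# .txt file in $dir. Returns the number of files rewritten.
sub fix_br_in_dir {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Unable to open directory $dir: $!";
    my @files = grep { /\.txt$/i } readdir $dh;
    closedir $dh;

    for my $file (@files) {
        my $path = "$dir/$file";
        # Read the whole file first; opening it for writing truncates it.
        open my $in, '<', $path or die "Unable to read $path: $!";
        my $content = do { local $/; <$in> };
        close $in;

        # /g replaces every occurrence, not only the first one.
        $content =~ s{<br/>\n([[:space:]]{4}[A-Z])}{\n$1}g;

        open my $out, '>', $path or die "Unable to write $path: $!";
        print {$out} $content;
        close $out or die "Unable to close $path: $!";
    }
    return scalar @files;
}
```

The /g flag is also what the one-liner in the question is missing, which would explain why only the first occurrence per file was changed there.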

Perl loop through text files in a directory, locate file based on file name and print content

How do I loop through the files in a directory, locate a file based on its name, and print its content?
Please see below code:
files in directory:
1234.txt
345.txt
234.txt
Code:
opendir (DIR, "LOCATION")|| die "cant open directory\n";
my @DATA = grep {(!/^\./)} readdir (DIR);
while ( my $file = shift @DATA) {
open FILE, "LOCATION";
while (FILE){
if ($file eq "235") {
print $_;
}
}
}
This should do (untested):
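my $dir = "/path/to/dir";
opendir( DIR, $dir ) or die "Cannot open $dir: $!";
while ( defined( my $entry = readdir( DIR ) ) ) {
if ( $entry =~ /^\Q$filenameImLookingFor\E$/ ) {
# readdir returns bare names, so prepend the directory
open( FILE, '<', "$dir/$entry" ) or die "Cannot open $dir/$entry: $!";
my @lines = <FILE>;
close( FILE );
print( join( '', @lines ) );
}
}
closedir( DIR );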
The code in your question:
opendir (DIR, "LOCATION")|| die "cant open directory\n";
my @DATA = grep {(!/^\./)} readdir (DIR);
while ( my $file = shift @DATA) {
open FILE, "LOCATION";
while (FILE){
if ($file eq "235") {
print $_;
}
}
}
Will do this:
First it will handily find all files in directory "LOCATION" that do not begin with a period. Then it will iterate in a rather odd loop over each file name. The normal version of this loop would be:
for my $file (@DATA)
Then it will attempt to open the directory "LOCATION" again. This will likely fail, because "LOCATION" is a directory. Since you do not check the return value with die, this error will be silent.
What you probably want is to use
if ($file eq "235.txt") {
open my $fh, "<", "LOCATION/$file" or die $!;
print <$fh>;
}
This part:
while (FILE)
Is not actually checking the return value of readline(), it is checking whether the file handle is returning a true value. As near as I can tell on my system, it does return a true value even if the open failed. Which means of course that the loop will run indefinitely. What you probably meant was
while (<FILE>)
However, as explained earlier, this will only result in the error "readline() on unopened file handle FILE" since the open statement cannot open a directory.
Your check
if ($file eq "235")
Will never be true, since you said your file names had a .txt extension. You might instead do
if ($file eq "235.txt")
Which should work.
If you wanted to be clever, you could include your check directly in the grep:
my @files = grep { $_ eq "235.txt" } readdir DIR;
And since perl can use the <> diamond operator to print files listed in the @ARGV array, you can even do this
@ARGV = grep { m{(?:^|/)235\.txt$} } @ARGV; # match on the file name, ignoring any leading directory
print <>;
Assuming you call the script with:
perl script.pl dir/*.txt
This is, of course, just the long version of doing:
perl -pe0 235.txt
Which is the long version of
cat 235.txt
So, I get the feeling you are trying to do something other than what your code implies.
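If the goal really is just "find the file with this name in that directory and show it", a compact sketch along the lines discussed above might look like this. The sub name slurp_named_file is made up, and it returns the content rather than printing it so the caller can decide what to do with it.

```perl
use strict;
use warnings;

# Hypothetical helper: look for an entry called $name in $dir and return its
# contents as a string, or undef if no such plain file exists.
sub slurp_named_file {
    my ( $dir, $name ) = @_;
    opendir my $dh, $dir or die "Cannot open directory $dir: $!";
    my ($match) = grep { $_ eq $name } readdir $dh;
    closedir $dh;
    return unless defined $match && -f "$dir/$match";

    # readdir returns bare names, so the directory must be prepended.
    open my $fh, '<', "$dir/$match" or die "Cannot open $dir/$match: $!";
    local $/;    # slurp mode: read the whole file at once
    return scalar <$fh>;
}
```

With the files from the question, print slurp_named_file("LOCATION", "234.txt") would print that file, while a name without the .txt extension would find nothing.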

Read all files in a directory in perl

In the first part of my code, I read a file and store different parts of it in different files in a directory. In the second part, I want to read all the files in that directory that the first part created:
while(<file>){
#making files in directory Dir
}
opendir(Dir, $indirname) or die "cannot open directory $indirname";
@docs = grep(/\.txt$/,readdir(Dir));
foreach $d (@Dir) {
$rdir="$indirname/$d";
open (res,$rdir) or die "could not open $rdir";
while(<res>){
}
but with this code, the last line of the last file wont be read
As I don't know what you are doing in the line-reading loop and don't understand @docs and @Dir, I'll show code that 'works' for me:
use strict;
use warnings;
use English;
my $dir = './_tmp/readFID';
foreach my $fp (glob("$dir/*.txt")) {
printf "%s\n", $fp;
open my $fh, "<", $fp or die "can't read open '$fp': $OS_ERROR";
while (<$fh>) {
printf " %s", $_;
}
close $fh or die "can't read close '$fp': $OS_ERROR";
}
output:
./_tmp/readFID/123.txt
1
2
3
./_tmp/readFID/45.txt
4
5
./_tmp/readFID/678.txt
6
7
8
Perhaps you can spot a relevant difference to your script.
I modified the code slightly to just test the basic idea in a directory containing my perl programs and it does seem to work. You should be iterating through @docs instead of @Dir though (and I highly recommend using both the strict and warnings pragmas).
opendir(DIR, ".") or die "cannot open directory";
@docs = grep(/\.pl$/,readdir(DIR));
foreach $file (@docs) {
open (RES, $file) or die "could not open $file\n";
while(<RES>){
print "$_";
}
}
glob does what you want, without the open/close stuff. And once you stick a group of files into @ARGV the "diamond" operator works as normal.
@ARGV = <$indirname/*.txt>;
while ( <> ) {
...
}
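As a quick way to check whether every line, including a last line with no trailing newline, is actually being read, here is a small sketch built on the same glob plus diamond idea. The sub name count_lines_per_file is hypothetical, and it assumes the directory path contains no whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines of each .txt file in $dir using the
# diamond operator. Returns a hash of file name => line count.
sub count_lines_per_file {
    my ($dir) = @_;
    my %count;
    local @ARGV = glob("$dir/*.txt");
    return %count unless @ARGV;    # an empty @ARGV would make <> read STDIN
    while ( my $line = <> ) {
        $count{$ARGV}++;           # $ARGV holds the name of the current file
    }
    return %count;
}
```

If the counts match what wc -l plus any unterminated last line would give, the reading side is fine and the missing line is more likely lost when the files are written, for example because an output handle was not closed before reading.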