how to point this script to one folder for reading and another for writing - perl

I can't seem to get this script to read from one directory and write to another. Both directories exist. I've commented out what I tried. Funny thing is, it runs fine when I place it in the directory with the files to process. Here's the code:
use strict;
use warnings "all";
my $tmp;
my $dir = ".";
#my $dir = "Ask/Parsed/Html4/";
opendir(DIR, $dir) or die "Cannot open directory: $dir!\n";
my @files = readdir(DIR);
closedir(DIR);
open my $out, ">>output.txt" or die "Cannot open output.txt!\n";
#open my $out, ">>Ask/Parsed/Html5/output.txt" or die "Cannot open output.txt!\n";
foreach my $file (@files)
{
if($file =~ /html$/)
{
open my $in, "<$file" or die "Cannot open $file!\n";
undef $tmp;
while(<$in>)
{
$tmp .= $_;
}
print $out ">$file\n";
print $out "$tmp\n";
#print $out "===============";
close $in;
}
}
close $out;

The directories you use -- . and Ask/Parsed/Html4/ -- are relative paths, which means they are relative to your current working directory, and so it makes a difference where in the file system you are currently located when you run the script.
In addition, the files you are opening -- output.txt and $file -- have no path information, so Perl will look in your current working directory to find them.
There are a few ways to solve this.
You could cd to the directory where your files are before running the script, and open the directory as . as you currently do
You could achieve the same effect by calling chdir from within the script, which will change the current working directory and make the program ignore your location when you ran it
Or you could add an absolute directory path to the beginning of the file names, preferably using catfile from File::Spec::Functions
However I would choose to use glob -- which works in the same way as command-line filename expansion -- in preference to opendir / readdir as the resulting strings include the path (if one was specified in the parameter) and there is no need to separately filter the .html files.
I would also choose to undefine the input record separator $/ to read the whole file, rather than reading it line-by-line and concatenating them all.
Finally, if you are running version 10 or later of Perl 5 then it is simpler to use autodie rather than checking the success of every open, close, opendir, closedir, and so on. Note that autodie does not cover readline or readdir.
Something like this
use strict;
use warnings 'all';
use 5.010;
use autodie;
my $dir = '/path/to/Ask/Parsed/Html4';
my @html = glob "$dir/*.html";
open my $out, '>>', "$dir/output.txt";
for my $file (@html) {
my $contents = do {
open my $in, '<', $file;
local $/;
<$in>;
};
print $out "> $file\n";
print $out "$contents\n";
print $out "===============";
}
close $out;
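The catfile option mentioned above, reading from one directory and writing to another, might look roughly like this. It is only a sketch under assumptions: the two directories are temporary stand-ins created with File::Temp (in the question they would be Ask/Parsed/Html4 and Ask/Parsed/Html5), and the sample file names and contents are made up so the code runs on its own.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(catfile);
use File::Temp qw(tempdir);

# Hypothetical stand-ins for the input and output directories.
my $in_dir  = tempdir(CLEANUP => 1);
my $out_dir = tempdir(CLEANUP => 1);

# Create two sample input files so the sketch is self-contained.
for my $name (qw(a.html b.html)) {
    my $path = catfile($in_dir, $name);
    open my $fh, '>', $path or die "Cannot create $path: $!";
    print $fh "<p>$name</p>\n";
    close $fh or die "Cannot close $path: $!";
}

# The output file gets its own directory, independent of the current one.
my $out_path = catfile($out_dir, 'output.txt');
open my $out, '>>', $out_path or die "Cannot open $out_path: $!";

# glob returns each name with the input directory already attached.
my @html = glob catfile($in_dir, '*.html');
for my $file (@html) {
    open my $in, '<', $file or die "Cannot open $file: $!";
    my $contents = do { local $/; <$in> };   # slurp the whole file
    close $in;
    print $out "> $file\n$contents\n";
}
close $out or die "Cannot close $out_path: $!";

print "Wrote ", scalar(@html), " files to $out_path\n";
```

Because both paths are built explicitly, the result no longer depends on the directory the script is started from.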

It is likely trying to access the files relative to wherever you are calling the script from. If your files are located relative to the location of the script, use the following example to provide a full path:
use FindBin;
my $file = "$FindBin::Bin/Ask/Parsed/Html5/output.txt";
If your file is not relative to the script, provide the full path:
my $file = "/home/john.doe/Ask/Parsed/Html5/output.txt";

Note that readdir() only returns the file name. If you want to open the file, prepend the directory, e.g.:
open my $in, "<", "$dir/$file" or die "Cannot open $file!\n";
Note that best practice says you should be using the three parameter version of open, otherwise

Related

What order does Perl use by default to read all files in a directory?

I have .gz files inside a directory and I am reading them with Perl. Everything is OK, but what I don't understand is the order in which these files are being read. I can tell for sure that it is not alphabetical. So my question is: what order does Perl use by default to read files from a directory?
Below is a snippet of my code
# Open the source file
my $dir = "/home/myname/mydir";
# Open directory and loop through
opendir(DIR, $dir) or die $!;
while (my $file = readdir(DIR)) {
# We only want files
next unless (-f "$dir/$file");
# Use a regular expression to find files ending in .gz
next unless ($file =~ m/\.gz$/);
my $gzip_file = "./mydir/$file";
open ( my $gunzip_stream, "-|", "gzip -dc $gzip_file") or die $!;
while (my $line = <$gunzip_stream> ) {
print ("$line\n");
}
}
readdir returns the files in the same order as the system returns them. I'm not aware of any guarantee of order from any OS. I imagine different drives might even behave differently.
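If a predictable order matters, one common approach is to sort the names yourself after reading them. A minimal sketch, assuming a throwaway directory from File::Temp and made-up file names in place of /home/myname/mydir:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical directory with a few empty sample files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(c.gz a.gz b.gz notes.txt)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh or die "Cannot close $dir/$name: $!";
}

opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
# Keep plain files ending in .gz, then sort them by name.
my @gz_files = sort grep { /\.gz$/ && -f "$dir/$_" } readdir($dh);
closedir($dh);

print "$_\n" for @gz_files;   # prints a.gz, b.gz, c.gz, one per line
```

The default sort compares names as strings; pass a comparison block to sort if a different order (numeric, by modification time, and so on) is needed.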

How to extend a program that works for one file to act on every file in a directory

I wrote a program to check for misspellings or unused data in a text file. Now, I want to check all files in a directory using the same process.
Here are a few lines of the script that I run for the first file:
open MYFILE, 'checking1.txt' or die $!;
@arr_file = <MYFILE>;
close (MYFILE);
open FILE_1, '>text1' or die $!;
open FILE_2, '>Output' or die $!;
open FILE_3, '>Output2' or die $!;
open FILE_4, '>text2' or die $!;
for ($i = 0; $i <= $#arr_file; $i++) {
if ( $arr_file[$i-1] =~ /\s+\S+\_name\s+ (\S+)\;/ ) {
print FILE_1 "name : $i $1\n";
}
...
I used only one file, checking1.txt, to execute the script, but now I want to do the same process for all files in the all_file_directory
Use an array to store the file names and then loop over them. At the end of each iteration, rename the output files or copy them somewhere so that they are not overwritten in the next iteration.
my @files = qw(checking1.txt checking2.txt checking3.txt checking4.txt checking5.txt);
foreach my $filename (@files){
open (my $fh, "<", $filename) or die $!;
#perform operations on $filename using filehandle $fh
#rename output files
}
Now for the above to work you need to make sure the files are in the same directory. If not then:
Provide the absolute path to each file in the @files array
Traverse directory to find desired files
If you want to traverse the directory then see:
How do I read in the contents of a directory in Perl?
How can I recursively read out directories in Perl?
Also:
Use 3 args open
Always use strict; use warnings; in your Perl program
and give proper names to the variables. For example:
@arr_file = <MYFILE>;
should be written as
@lines = <MYFILE>;
If all your files are in the same directory, put the program inside that directory and run it from there.
To read the files in a directory, use glob:
while (my $filename =<*.txt>) # change the file extension whatever you want
{
open my $fh, "<" , $filename or die "Error opening $!\n";
#do your stuff here
}
Why not use File::Find? It makes changing files in directories very easy. Just supply the start directory.
It's not always the best choice, depends on your needs, but it's useful and easy almost every time I need to modify a lot of files all at once.
Another option is to just loop through the files, but in this case you'll have to supply the file names.
As mkHun pointed out a glob can be helpful.
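For reference, a File::Find traversal could be sketched as below. The directory layout and file names are assumptions created in a temporary directory purely for illustration; in practice the start directory would be something like all_file_directory.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical start directory with one nested subdirectory.
my $start_dir = tempdir(CLEANUP => 1);
make_path("$start_dir/sub");
for my $path ("$start_dir/checking1.txt",
              "$start_dir/sub/checking2.txt",
              "$start_dir/readme.md") {
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh or die "Cannot close $path: $!";
}

my @text_files;
find(
    sub {
        # Inside the callback, $_ is the bare file name and
        # $File::Find::name is the path including the directory.
        return unless -f $_ && /\.txt$/;
        push @text_files, $File::Find::name;
    },
    $start_dir,
);

@text_files = sort @text_files;
print "$_\n" for @text_files;
```

Unlike glob, find descends into subdirectories, so it picks up sub/checking2.txt as well.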

Perl multiple files read replacing string, write to multiple files

I have this code working, but only for one file with a specific name. How can I make it process every .vb file in the current folder and write the output to a file with _1 appended to the name?
#!/usr/bin/perl
use strict;
use warnings;
open my $fhIn, '<', 'file.vb' or die $!;
open my $fhOut, '>', 'file_1.vb' or die $!;
while (<$fhIn>) {
print $fhOut "'01/20/2016 Added \ngpFrmPosition = Me.Location\n" if /MessageBox/ and !/'/;
print $fhOut $_;
}
close $fhOut;
close $fhIn;
I might approach it like this. (This assumes the script is running in the same directory as the .vb files).
#!/usr/bin/perl
use strict;
use warnings;
# script running in same directory as the .vb files
for my $file (glob "*.vb") {
my $outfile = $file =~ s/(?=\.vb$)/_1/r;
print "$file $outfile\n"; # DEBUG
# open input and output files
# do the while loop
}
The print statement in the loop is for debug purposes - to see if you are creating the new file names correctly. You can delete it or comment it out when you are satisfied you have got the files you want.
Update: Put the glob in the for loop instead of reading it to an array.
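Filling in the two placeholder comments with the loop from the question gives something like the following. This is a sketch, not a tested drop-in: the temporary directory, the sample form.vb file, and the check that skips existing _1 files are assumptions added so the example can run safely on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical sandbox so the sketch does not touch real files.
my $dir = tempdir(CLEANUP => 1);
open my $sample, '>', "$dir/form.vb" or die "Cannot create sample: $!";
print $sample "Dim x\nMessageBox.Show(x)\n' MessageBox in a comment\n";
close $sample or die "Cannot close sample: $!";

my @written;
for my $file (glob "$dir/*.vb") {
    next if $file =~ /_1\.vb$/;               # assumption: skip output of an earlier run
    my $outfile = $file =~ s/(?=\.vb$)/_1/r;  # form.vb -> form_1.vb (needs Perl 5.14+)

    open my $fh_in,  '<', $file    or die "Cannot open $file: $!";
    open my $fh_out, '>', $outfile or die "Cannot open $outfile: $!";
    while (<$fh_in>) {
        # Same rule as the original script: add two lines before any
        # MessageBox line that does not contain a single quote.
        print $fh_out "'01/20/2016 Added \ngpFrmPosition = Me.Location\n"
            if /MessageBox/ and !/'/;
        print $fh_out $_;
    }
    close $fh_out or die "Cannot close $outfile: $!";
    close $fh_in;
    push @written, $outfile;
}
print "$_\n" for @written;
```

The list returned by glob is built once before the loop starts, so files created inside the loop are not picked up in the same run.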

Rename all .txt files in a directory and then open that file in perl

I need some help with file manipulations and need some expert advice.
It looks like I am making a silly mistake somewhere but I can't catch it.
I have a directory that contains files with a .txt suffix, for example file1.txt, file2.txt, file3.txt.
I want to add a revision string, say rev0, to each of those files and then open the modified files. For instance rev0_file1.txt, rev0_file2.txt, rev0_file3.txt.
I can append rev0, but my program fails to open the files.
Here is the relevant portion of my code
my $dir = "path to my directory";
my @::tmp = ();
opendir(my $DIR, "$dir") or die "Can't open directory, $!";
@::list = readdir($DIR);
@::list2 = sort grep(/^.*\.txt$/, @::list);
foreach (@::list2) {
my $new_file = "$::REV0" . "_" . "$_";
print "new file is $new_file\n";
push(@::tmp, "$new_file\n");
}
closedir($DIR);
foreach my $cur_file (<@::tmp>) {
$cur_file = $_;
print "Current file name is $cur_file\n"; # This debug print shows nothing
open($fh, '<', "$cur_file") or die "Can't open the file\n"; # Fails to open file;
}
Your problem is here:
foreach my $cur_file(<@::tmp>) {
$cur_file = $_;
You are using the loop variable $cur_file, but you overwrite it with $_, which is not used at all in this loop. To fix this, just remove the second line.
Your biggest issue is the fact you are using $cur_file in your loop for the file name, but then reassign it with $_ even though $_ won't have a value at that point. Also, as Borodin pointed out, $::REV0 was never defined.
You can use the move command from File::Copy to move the files, and you can use File::Find to find the files you want to move:
use strict;
use warnings;
use feature qw(say);
use autodie;
use File::Copy; # Provides the move command
use File::Find; # Gives you a way to find the files you want
use File::Basename; # Provides fileparse to split a path into directory and name
use constant {
DIRECTORY => '/path/to/directory',
PREFIX => 'rev0_',
};
my @files_to_rename;
find (
sub {
return unless /\.txt$/; # Only interested in ".txt" files
push @files_to_rename, $File::Find::name;
}, DIRECTORY );
for my $file ( @files_to_rename ) {
my ($name, $path) = fileparse($file);
my $new_name = $path . PREFIX . $name; # Prefix the file name, not the whole path
move $file, $new_name or die "Cannot rename $file: $!"; # autodie does not cover move
$file = $new_name; # Updates @files_to_rename with new name
open my $file_fh, "<", $new_name; # Open the file here?
...
close $file_fh;
}
for my $file ( @files_to_rename ) {
open my $file_fh, "<", $file; # Or, open the file here?
...
close $file_fh;
}
See how using Perl modules can make your task much easier? Perl comes with hundreds of pre-installed packages to handle zip files, tarballs, time, email, etc. You can find a list at the Perldoc page (make sure you select the version of Perl you're using!).
The $file = $new_name is actually changing the value of the file name right inside the @files_to_rename array. It's a little Perl trick. This way, your array refers to the file even though it has been renamed.
You have two choices where to open the file for reading: You can rename all of your files first, and then loop through once again to open each one, or you can open them after you rename them. I've shown both places.
Don't use $:: at all. This is very bad form since it overrides use strict; -- that is if you're using use strict to begin with. The standard is not to use package variables (aka global variables) unless you have to. Instead, you should use lexically scoped variables (aka local variables) defined with my.
One advantage of a my variable is that I don't really need the close command, since the variable falls out of scope with each iteration of the loop and disappears entirely once the loop is complete. When the variable that contains the file handle falls out of scope, the file handle is automatically closed.
Always include use strict;, use warnings at the top of EVERY script. And use autodie; anytime you're doing file or directory processing.
There is no reason why you should be prefixing your variables with :: so please simplify your code like the following:
use strict;
use warnings;
use autodie;
use File::Copy;
my $dir = "path to my directory";
chdir($dir); # Make easier by removing the need to prefix path information
foreach my $file (glob('*.txt')) {
my $newfile = 'rev0_'.$file;
copy($file, $newfile) or die "Can't copy $file -> $newfile: $!";
open my $fh, '<', $newfile;
# File processing
}
What you've attempted to store is the updated name of the file in @::tmp. The file hasn't been renamed, so it's little surprise that the code died because it couldn't find the renamed file.
Since it's just renaming, consider the following code:
use strict;
use warnings;
use File::Copy 'move';
for my $file ( glob( "file*.txt" ) ) {
move( $file, "rev0_$file" )
or die "Unable to rename '$file': $!";
}
From a command line/terminal, consider the rename utility if it is available:
$ rename file rev0_file file*.txt

Write to a file in Perl

Consider:
#!/usr/local/bin/perl
$files = "C:\\Users\\A\\workspace\\CCoverage\\backup.txt";
unlink ($files);
open (OUTFILE, '>>$files');
print OUTFILE "Something\n";
close (OUTFILE);
The above is a simple subroutine I wrote in Perl, but it doesn't seem to work. How can I make it work?
Variables are interpolated only in strings using double quotes ". If you use single quotes ' the $ will be interpreted as a literal dollar sign, so Perl tries to open a file named $files.
Try with ">>$files" instead of '>>$files'
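The difference between the two quoting styles can be seen in a small sketch (the file name backup.txt is just a placeholder):

```perl
use strict;
use warnings;

my $files = 'backup.txt';   # hypothetical file name

my $single = '>>$files';    # no interpolation: the literal text >>$files
my $double = ">>$files";    # interpolated: >>backup.txt

print "single-quoted: $single\n";
print "double-quoted: $double\n";
```

With the three-argument form of open the mode and the file name are separate arguments, so this quoting problem does not arise in the first place.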
Always use
use strict;
use warnings;
It will help to get some more warnings.
In any case also declare variables
my $files = "...";
You should also check the return value of open:
open OUTFILE, ">>$files"
or die "Error opening $files: $!";
Edit: As suggested in the comments, here is a version with the three-argument open and a couple of other possible improvements:
#!/usr/bin/perl
use strict;
use warnings;
# warn user (from perspective of caller)
use Carp;
# use nice English (or awk) names for ugly punctuation variables
use English qw(-no_match_vars);
# declare variables
my $files = 'example.txt';
# check if the file exists
if (-f $files) {
unlink $files
or croak "Cannot delete $files: $!";
}
# use a variable for the file handle
my $OUTFILE;
# use the three arguments version of open
# and check for errors
open $OUTFILE, '>>', $files
or croak "Cannot open $files: $OS_ERROR";
# you can check for errors (e.g., if after opening the disk gets full)
print { $OUTFILE } "Something\n"
or croak "Cannot write to $files: $OS_ERROR";
# check for errors
close $OUTFILE
or croak "Cannot close $files: $OS_ERROR";