List file in directory and store in array. That array can be access outside the loop - perl

I want to list the files in the folder and want to store in array. How can I make the array can be access outside the loop? I need that array to be outside as need to use it outside the array
This is the code:
use strict;
use warnings;
my $dirname = "../../../experiment/";
my $filelist;
my #file;
open ( OUTFILE, ">file1.txt" );
opendir ( DIR, $dirname ) || die "Error in opening dir $dirname\n";
while ( $filelist = readdir (DIR) ) {
next if ( $filelist =~ m/\.svh$/ );
next if ( $filelist =~ m/\.sv$/ );
next if ( $filelist =~ m/\.list$/ );
push #fileInDir, $_;
}
closedir(DIR);
print OUTFILE " ", #fileInDir;
close OUTFILE;
The error message is
Use of uninitialized value in print at file.pl line 49.

You're pushing the unitialized variable $_ onto the array rather than $filelist, which is misleadingly named (it's just one file name).
You can use:
use strict;
use warnings;
my $dirname = "../../../experiment/";
my $fname = "file1.txt";
my #files;
open my $out_fh, ">", $fname or die "Error in opening file $fname: $!";
opendir my $dir, $dirname or die "Error in opening dir $dirname: $!";
while (my $file = readdir($dir)) {
next if ($file =~ m/\.svh$/);
next if ($file =~ m/\.sv$/);
next if ($file =~ m/\.list$/);
push #files, $file;
}
print $out_fh join "\n", #files;
closedir $dir;
close $out_fh;

Related

To parse multiple files in Perl

Please correct my code, I cannot seem to open my file to parse.
The error is this line open(my $fh, $file) or die "Cannot open file, $!";
Cannot open file, No such file or directory at ./sample.pl line 28.
use strict;
my $dir = $ARGV[0];
my $dp_dpd = $ENV{'DP_DPD'};
my $log_dir = $ENV{'DP_LOG'};
my $xmlFlag = 0;
my #fileList = "";
my #not_proc_dir = `find $dp_dpd -type d -name "NotProcessed"`;
#print "#not_proc_dir\n";
foreach my $dir (#not_proc_dir) {
chomp ($dir);
#print "$dir\n";
opendir (DIR, $dir) or die "Couldn't open directory, $!";
while ( my $file = readdir DIR) {
next if $file =~ /^\.\.?$/;
next if (-d $file);
# print "$file\n";
next if $file eq "." or $file eq "..";
if ($file =~ /.xml$/ig) {
$xmlFlag = 1;
print "$file\n";
open(my $fh, $file) or die "Cannot open file, $!";
#fileList = <$fh>;
close $file;
}
}
closedir DIR;
}
Quoting readdir's documentation:
If you're planning to filetest the return values out of a readdir, you'd better prepend the directory in question. Otherwise, because we didn't chdir there, it would have been testing the wrong file.
Your open(my $fh, $file) should therefore be open my $fh, '<', "$dir/$file" (note how I also added '<' as well: you should always use 3-argument open).
Your next if (-d $file); is also wrong and should be next if -d "$dir/$file";
Some additional remarks on your code:
always add use warnings to your script (in addition to use strict, which you already have)
use lexical file/directory handle rather than global ones. That is, do opendir my $DH, $dir, rather than opendir DH, $dir.
properly indent your code (if ($file =~ /.xml$/ig) { is one level too deep; it makes it harder to read you code)
next if $file =~ /^\.\.?$/; and next if $file eq "." or $file eq ".."; are redundant (even though not technically equivalent); I'd suggest using only the latter.
the variable $dir defined in my $dir = $ARGV[0]; is never used.

Data driven perl script

I want to list file n folder in directory. Here are the list of the file in this directory.
Output1.sv
Output2.sv
Folder1
Folder2
file_a
file_b
file_c.sv
But some of them, i don't want it to be listed. The list of not included file, I list in input.txt like below. Note:some of them is file and some of them is folder
NOT_INCLUDED=file_a
NOT_INCLUDED=file_b
NOT_INCLUDED=file_c.sv
Here is the code.
#!/usr/intel/perl
use strict;
use warnings;
my $input_file = "INPUT.txt";
open ( OUTPUT, ">OUTPUT.txt" );
file_in_directory();
close OUTPUT;
sub file_in_directory {
my $path = "experiment/";
my #unsort_output;
my #not_included;
open ( INFILE, "<", $input_file);
while (<INFILE>){
if ( $_ =~ /NOT_INCLUDED/){
my #file = $_;
foreach my $file (#file) {
$file =~ s/NOT_INCLUDED=//;
push #not_included, $file;
}
}
}
close INFILE;
opendir ( DIR, $path ) || die "Error in opening dir $path\n";
while ( my $filelist = readdir (DIR) ) {
chomp $filelist;
next if ( $filelist =~ m/\.list$/ );
next if ( $filelist =~ m/\.swp$/ );
next if ( $filelist =~ s/\.//g);
foreach $_ (#not_included){
chomp $_;
my $not_included = "$_";
if ( $filelist eq $not_included ){
next;
}
push #unsort_output, $filelist;
}
closedir(DIR);
my #output = sort #unsort_output;
print OUTPUT #output;
}
The output that I want is to list all the file in that directory except the file list in input.txt 'NOT_INCLUDED'.
Output1.sv
Output2.sv
Folder1
Folder2
But the output that i get seem still included that unwanted file.
This part of the code makes no sense:
while ( my $filelist = readdir (DIR) ) {
...
foreach $_ (#not_included){
chomp $_;
my $not_included = "$_";
if ( $filelist eq $not_included ){
next;
} # (1)
push #unsort_output, $filelist; # (2)
}
This code contains three opening braces ({) but only two closing braces (}). If you try to run your code as-is, it fails with a syntax error.
The push line (marked (2)) is part of the foreach loop, but indented as if it were outside. Either it should be indented more (to line up with (1)), or you need to add a } before it. Neither alternative makes much sense:
If push is outside of the foreach loop, then the next statement (and the whole foreach loop) has no effect. It could just be deleted.
If push is inside the foreach loop, then every directory entry ($filelist) will be pushed multiple times, one for each line in #not_included (except for the names listed somewhere in #not_included; those will be pushed one time less).
There are several other problems. For example:
$filelist =~ s/\.//g removes all dots from the file name, transforming e.g. file_c.sv into file_csv. That means it will never match NOT_INCLUDED=file_c.sv in your input file.
Worse, the next if s/// part means the loop skips all files whose names contain dots, such as Output1.sv or Output2.sv.
Results are printed without separators, so you'll get something like
Folder1Folder1Folder1Folder2Folder2Folder2file_afile_afile_bfile_b in OUTPUT.txt.
Global variables are used for no reason, e.g. INFILE and DIR.
Here is how I would structure the code:
#!/usr/intel/perl
use strict;
use warnings;
my $input_file = 'INPUT.txt';
my %is_blacklisted;
{
open my $fh, '<', $input_file or die "$0: $input_file: $!\n";
while (my $line = readline $fh) {
chomp $line;
if ($line =~ s!\ANOT_INCLUDED=!!) {
$is_blacklisted{$line} = 1;
}
}
}
my $path = 'experiment';
my #results;
{
opendir my $dh, $path or die "$0: $path: $!\n";
while (my $entry = readdir $dh) {
next
if $entry eq '.' || $entry eq '..'
|| $entry =~ /\.list\z/
|| $entry =~ /\.swp\z/
|| $is_blacklisted{$entry};
push #results, $entry;
}
}
#results = sort #results;
my $output_file = 'OUTPUT.txt';
{
open my $fh, '>', $output_file or die "$0: $output_file: $!\n";
for my $result (#results) {
print $fh "$result\n";
}
}
The contents of INPUT.txt (more specifically, the parts after NOT_INCLUDED=) are read into a hash (%is_blacklisted). This allows easy lookup of entries.
Then we process the directory entries. We skip over . and .. (I assume you don't want those) as well as all files ending with *.list or *.swp (that was in your original code). We also skip any file that is blacklisted, i.e. that was specified as excluded in INPUT.txt. The remaining entries are collected in #results.
We sort our results and write them to OUTPUT.txt, one entry per line.
Not deviating too much from your code, here is the solution. Please find the comments:
#!/usr/intel/perl
use strict;
use warnings;
my $input_file = "INPUT.txt";
open ( OUTPUT, ">OUTPUT.txt" );
file_in_directory();
close OUTPUT;
sub file_in_directory {
my $path = "experiment/";
my #unsort_output;
my %not_included; # creating hash map insted of array for cleaner and faster implementaion.
open ( INFILE, "<", $input_file);
while (my $file = <INFILE>) {
if ($file =~ /NOT_INCLUDED/) {
$file =~ s/NOT_INCLUDED=//;
$not_included{$file}++; # create a quick hash map of (filename => 1, filename2 => 1)
}
}
close INFILE;
opendir ( DIR, $path ) || die "Error in opening dir $path\n";
while ( my $filelist = readdir (DIR) ) {
next if $filelist =~ /^\.\.?$/xms; # discard . and .. files
chomp $filelist;
next if ( $filelist =~ m/\.list$/ );
next if ( $filelist =~ m/\.swp$/ );
next if ( $filelist =~ s/\.//g);
if (defined $not_included{$filelist}) {
next;
}
else {
push #unsort_output, $filelist;
}
}
closedir(DIR); # earlier the closedir was inside of while loop. Which is wrong.
my #output = sort #unsort_output;
print OUTPUT join "\n", #output;
}

Not possible to get all content into array

#!/usr/bin/perl
use strict;
use warnings;
my $directory="/var/www/out-original";
my $filterstring=".csv";
my #files;
# Open the folder
opendir my $dir, $directory or die "couldn't open $directory: $!\n";
foreach my $filename (sort(readdir($dir))) {
if ($filename =~ m/$filterstring/) {
# print $filename;
# print "\n";
push (#files, $filename);
}
}
closedir $dir;
foreach my $file (#files) {
open $file or die $!;
my #array;
while ($file) {
#print $_;
# Push each line of the file into array
push (#array, $file);
}
}
Hello everyone,
thanks for helping me so fast with the question about why readdir list filenames in wrong order (Why does readdir() list the filenames in wrong order?).
Now i've got another one. With this snippet it is not possible to get all content of the file into an array?
Error Message:
Can't use string ("Report_01_2014.csv") as a symbol ref while "strict refs" in use at parse.pl line 22.
Thanks in advanced
Chris
I think you want to do
while( my $line = <$file> ){ # get next line, note the '<>'
push ( #array, $line );
}
Also, are you sure open $file or die $! does what you want?
This should be what the error message is telling you, I think.
You pushed a string in ( the filename ).
So I think you want
open my $fh, $file or die $!;
while ( my $line = <$fh> ){...}
close $fh;
Your problem is with open. If you get into the habit of using the three parameter version of open these problems go away.

perl + read multiple csv files + manipulate files + provide output_files

Apologies if this is a bit long winded, bu i really appreciate an answer here as i am having difficulty getting this to work.
Building on from this question here, i have this script that works on a csv file(orig.csv) and provides a csv file that i want(format.csv). What I want is to make this more generic and accept any number of '.csv' files and provide a 'output_csv' for each inputed file. Can anyone help?
#!/usr/bin/perl
use strict;
use warnings;
open my $orig_fh, '<', 'orig.csv' or die $!;
open my $format_fh, '>', 'format.csv' or die $!;
print $format_fh scalar <$orig_fh>; # Copy header line
my %data;
my #labels;
while (<$orig_fh>) {
chomp;
my #fields = split /,/, $_, -1;
my ($label, $max_val) = #fields[1,12];
if ( exists $data{$label} ) {
my $prev_max_val = $data{$label}[12] || 0;
$data{$label} = \#fields if $max_val and $max_val > $prev_max_val;
}
else {
$data{$label} = \#fields;
push #labels, $label;
}
}
for my $label (#labels) {
print $format_fh join(',', #{ $data{$label} }), "\n";
}
i was hoping to use this script from here but am having great difficulty putting the 2 together:
#!/usr/bin/perl
use strict;
use warnings;
#If you want to open a new output file for every input file
#Do it in your loop, not here.
#my $outfile = "KAC.pdb";
#open( my $fh, '>>', $outfile );
opendir( DIR, "/data/tmp" ) or die "$!";
my #files = readdir(DIR);
closedir DIR;
foreach my $file (#files) {
open( FH, "/data/tmp/$file" ) or die "$!";
my $outfile = "output_$file"; #Add a prefix (anything, doesn't have to say 'output')
open(my $fh, '>', $outfile);
while (<FH>) {
my ($line) = $_;
chomp($line);
if ( $line =~ m/KAC 50/ ) {
print $fh $_;
}
}
close($fh);
}
the script reads all the files in the directory and finds the line with this string 'KAC 50' and then appends that line to an output_$file for that inputfile. so there will be 1 output_$file for every inputfile that is read
issues with this script that I have noted and was looking to fix:
- it reads the '.' and '..' files in the directory and produces a
'output_.' and 'output_..' file
- it will also do the same with this script file.
I was also trying to make it dynamic by getting this script to work in any directory it is run in by adding this code:
use Cwd qw();
my $path = Cwd::cwd();
print "$path\n";
and
opendir( DIR, $path ) or die "$!"; # open the current directory
open( FH, "$path/$file" ) or die "$!"; #open the file
**EDIT::I have tried combining the versions but am getting errors.Advise greatly appreciated*
UserName#wabcl13 ~/Perl
$ perl formatfile_QforStackOverflow.pl
Parentheses missing around "my" list at formatfile_QforStackOverflow.pl line 13.
source dir -> /home/UserName/Perl
Can't use string ("/home/UserName/Perl/format_or"...) as a symbol ref while "strict refs" in use at formatfile_QforStackOverflow.pl line 28.
combined code::
use strict;
use warnings;
use autodie; # this is used for the multiple files part...
#START::Getting current working directory
use Cwd qw();
my $source_dir = Cwd::cwd();
#END::Getting current working directory
print "source dir -> $source_dir\n";
my $output_prefix = 'format_';
opendir my $dh, $source_dir; #Changing this to work on current directory; changing back
for my $file (readdir($dh)) {
next if $file !~ /\.csv$/;
next if $file =~ /^\Q$output_prefix\E/;
my $orig_file = "$source_dir/$file";
my $format_file = "$source_dir/$output_prefix$file";
# .... old processing code here ...
## Start:: This part works on one file edited for this script ##
#open my $orig_fh, '<', 'orig.csv' or die $!; #line 14 and 15 above already do this!!
#open my $format_fh, '>', 'format.csv' or die $!;
#print $format_fh scalar <$orig_fh>; # Copy header line #orig needs changeing
print $format_file scalar <$orig_file>; # Copy header line
my %data;
my #labels;
#while (<$orig_fh>) { #orig needs changing
while (<$orig_file>) {
chomp;
my #fields = split /,/, $_, -1;
my ($label, $max_val) = #fields[1,12];
if ( exists $data{$label} ) {
my $prev_max_val = $data{$label}[12] || 0;
$data{$label} = \#fields if $max_val and $max_val > $prev_max_val;
}
else {
$data{$label} = \#fields;
push #labels, $label;
}
}
for my $label (#labels) {
#print $format_fh join(',', #{ $data{$label} }), "\n"; #orig needs changing
print $format_file join(',', #{ $data{$label} }), "\n";
}
## END:: This part works on one file edited for this script ##
}
How do you plan on inputting the list of files to process and their preferred output destination? Maybe just have a fixed directory that you want to process all the cvs files, and prefix the result.
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
my $source_dir = '/some/dir/with/cvs/files';
my $output_prefix = 'format_';
opendir my $dh, $source_dir;
for my $file (readdir($dh)) {
next if $file !~ /\.csv$/;
next if $file =~ /^\Q$output_prefix\E/;
my $orig_file = "$source_dir/$file";
my $format_file = "$source_dir/$output_prefix$file";
.... old processing code here ...
}
Alternatively, you could just have an output directory instead of prefixing the files. Either way, this should get you on your way.

Find a specific word in a file

First, I want to search for a particular file in the directory and then in the file I need to search for a specific word. Here's what I have so far:
$show_bp = 'ShowBuildProcess';
$get_bs = 'GetBuildStatus';
opendir (DIR, $my_dir) or die $!;
while(my $file = readdir(DIR))
{
if($file=~/\.log/)
{
if($file=~/GetBuildStatus/)
{
Filenames will be like GetStatus.<number>.log, e.g. GetStatus.123456.log. I need to:
Find all .log files in the directory
Search for a file with filename starting with GetStatus
Search for filename with the lower numeric part
Search for a particular word in that file
Here is a possible solution for you:
First we look at the file pattern and also extract $1,which is the first regex match )in the brackets). If the file fits we open it and look through it line by line and look for a match to your /YourSearchPattern/:
#!/usr/bin/perl
use warnings;
use strict;
my $mydir = './test/';
opendir (DIR, $mydir) or die $!;
while(my $file = readdir(DIR)){
if ($file =~ /^GetStatus\.(\d+)\.log$/){
if ($1 >= 123456 || $1 < 345678){
open(my $fh,'<', $mydir . $file) or die "Cannot open file $file: $!\n";
while (<$fh>){
if ($_ =~ /YourSearchPattern/){
print $_;
}
}
close($fh);
}
}
}
When you look for the smallest sequence number of the files from your dir you can simply store them in an array and then sort them after those numbers:
...
opendir (DIR, $mydir) or die $!;
my #files;
while(my $file = readdir(DIR)){
if ($file =~ /^GetStatus\.(\d+)\.log$/){
push #files $file;
}
}
my #sortedfiles = sort { my ($anum,$bnum); $a =~ /^GetStatus\.(\d+)\.log$/; $anum = $1; $b =~ /^GetStatus\.(\d+)\.log$/; $bnum = $1; $anum <=> $bnum } #files;
print $sortedfiles[0] . " has the smallest sequence number!\n";