Modify Perl script to run for each file with specified extension in a given directory - perl

I need to modify a Perl script x937.pl to run on all files with extension .x937 within a specific directory. Currently, I use a separate script test.pl that calls my main script, and runs it for each file of that type. However, I need to combine both into one script.
Ideally, I would be able to specify a directory path in the script, and loop through all *.x937 files in that directory.
test.pl
#!/usr/bin/perl -w
use strict;
use Encode;
my #files = <*.x937>;
foreach my $file (#files) {
system('x937.pl', $file);
}
x937.pl
#!/usr/bin/perl -w
use strict;
use Encode;
use warnings;
my $tiff_flag = 0;
my $count = 0;
my $file = "output_$ARGV[0].txt";
unless ( open OPUT, '>' . $file ) {
die "Unable to create $file";
}
open FILE, '<:raw', $ARGV[0] or die "Error opening '$ARGV[0]' $!";
binmode( FILE ) or die 'Error setting binary mode on input file';
while ( read( FILE, $_, 4 ) ) {
my $rec_len = unpack( "N", $_ );
die "Bad record length: $rec_len" unless ( $rec_len > 0 );
read( FILE, $_, $rec_len );
if ( substr( $_, 0, 2 ) eq "\xF5\xF2" ) {
if ( $tiff_flag ) {
$count++;
open( TIFF, '>', 'output_' . $ARGV[0] . '_img' . sprintf( "%04d", $count ) . '.tiff' )
or die "Can't create image file $!";
binmode( TIFF ) or die 'Error setting binary mode on image file';
print TIFF substr( $_, 117 );
close TIFF;
}
$_ = substr( $_, 0, 117 );
}
print OPUT decode( 'cp1047', $_ ) . "\n";
}
close FILE;
close OPUT;

I think I managed to generate this correctly (on iPad, sat on the sofa) ... There could be some typos ; )
Usage: perl test_x397.pl <path>
test_x397.pl
#!/usr/bin/perl -w
use strict; use warnings;
use Encode;
my ($path) = #ARGV;
$path // die "No path specified";
(-e $path) or die "Path not found: $path";
(-d $path) or die "Not a directory: $path";
my #files = <$path/*.x937>;
foreach my $file (#files) {
process($file);
}
sub process {
my ($fname) = #_;
my ($dir, $file) = $fname =~ m{^(.*)/(.+)$};
my $tiff_flag = 0;
my $count = 0;
my $outfile = sprintf("%s/output_%s.txt", $dir, $file);
open (my $outfh, '>', $outfile) or die "Unable to create $outfile. $!";
open (my $infh, '<:raw', $file) or die "Error opening '$file'. $!";
my $buffer = undef;
while (read ($infh,$buffer,4)) {
my $rec_len = unpack("N", $buffer);
die "Bad record length: $rec_len" unless ($rec_len > 0);
read ($infh, $buffer, $rec_len);
if (substr($buffer, 0, 2) eq "\xF5\xF2") {
if ($tiff_flag) {
$count++;
my $tiff_filename = sprintf('%s/output_%s_img%04d.tiff', $dir, $file, $count);
open (my $tiffh, '>', $tiff_filename) or die "Can't create image file $!";
binmode($tiffh) or die 'Error setting binary mode on image file';
print $tiffh substr($buffer, 117);
close $tiffh;
}
$buffer = substr($buffer, 0, 117);
}
print $outfh decode ('cp1047', $buffer) . "\n";
}
close $infh;
close $outfh;
}
A few things to note:
Always use the three argument version of open
Using a scalar filehandle makes it easier to pass it around (not necessary in this example but good practice)
Don't modify $_. It can lead to nasty surprises in larger programs
You already used sprintf to make part of your tiff filename, so why not use it for the whole thing.

Related

How to split a file of 5000 lines in several files of 200 lines each with Perl?

I've a file of 5000 lines and I want several files of 200 lines each, and I tried this:
#!/usr/bin/perl
use strict;
use warnings;
my $targetfile = '200_lines.txt';
my $filename = '5000_lines.txt';
open ( my $DATA, '<', $filename ) or die "Could not open file '$filename': $!";
while ( my $line = <$DATA> ) {
my $counter++;
open (my $FILE, '>>', $targetfile ) or die "Could not open file '$targetfile': $!";
print $FILE $line;
close $FILE;
if ( $counter % 200 == 0
if ($. % 200 == 0) {
$targetfile =~ s/200/$counter/;
}
}
My $counter variable still at 1. I don't know why. So I got just one file named 200_lines.txt with 5000 lines.
You can use sprintf to generate new filenames after every 200 lines and use $. to keep track of line numbers in the file.
The below script will generate smaller chunks of files from the larger file with 200 lines each.
#!/usr/bin/perl
use strict;
use warnings;
open my $rfh,'<','file_5000' or die "unable to open file : $! \n";
my $filecount=0;
my $wfh;
while(<$rfh>){
if(($.-1) % 200 == 0){
close($wfh) if($wfh);
open($wfh, '>', sprintf("file%02d", ++$filecount)) or die $!;
}
print $wfh "$_";
}
close($rfh);
$counter is scoped within your while block. So it is reset each iteration.
You don't really need to do it like that, as you can test $. for the current line number
don't call your filehandle $DATA. There's a special filehandle called <DATA>.
How about:
#!/usr/bin/perl
use strict;
use warnings;
open( my $input, '<', '5000_lines.txt' ) or die $!;
open( my $output, '>', '200_lines.txt' ) or die $!;
while ( my $line = <$input> ) {
unless ( $. % 200 ) {
close($output);
open( $output, '>', int( $. / 200 + 2) . "00_lines.txt" ) or die $!;
}
print {$output} $line;
}
close($input);
close($output);
This creates files:
200_lines.txt
400_lines.txt
600_lines.txt
etc.
You have a bunch of errors in your code.
#!/usr/bin/perl
use strict;
use warnings;
# Creating a file with 5000 lines
my $filename = '5000_lines.txt';
open ( $DATA, '>', $filename ) or die "Could not open file '$filename': $!";
for (my $i=0;$i<5000;$i++){
print $DATA "$i\n";
}
close ( $DATA);
my $targetfile = '200_lines.txt';
open ( my $DATA, '<', $filename ) or die "Could not open file '$filename': $!";
my $counter = 0;
my $num = 0;
my $flag = 1;
while ( my $line = <$DATA> ) {
if ($flag == 1){
open (FILE, '>', $targetfile.'_'.$num ) or die "Could not open file '$targetfile.'_'.$num': $!";
}
print FILE $line;
$flag=0;
if ( $counter % 200 == 0){
$num = $counter/200;
close FILE;
$flag=1
}
$counter++;
}
close (FILE);
It will break the large file into smaller chunks of 200 lines each.

perl script only write one row to output file in perl

i wrote a script to open a file on web, and pull out all rows with wireless in the name. It writes the out put to a different file, but it only records one line in the output file, should be mulitipe lines.
#!\Perl64\eg\perl -w
use warnings;
use strict;
use LWP::Simple;
my $save = "C:\\wireless\\";
my $file = get 'http://dhcp_server.test.com/cgi-bin/dhcp_utilization_csv_region.pl?region=test';
open( FILE, '>', $save . 'DHCP_Utilization_test.csv' ) or die $!;
binmode FILE;
print FILE $file;
close(FILE);
open( F, "C:\\wireless\\DHCP_Utilization_test.csv" ) || die "can't opern file: $!";
my #file = <F>;
close(F);
my $line;
foreach $line (#file) {
chomp $line;
if ( $line =~ m/Wireless /g ) {
my ($ip, $rtr, $mask, $zip, $blc, $address, $city,
$state, $space, $country, $space2, $noc, $company, $extra,
$active, $used, $percent, $extra3, $nus, $construct
) = split( /,/, $line );
my $custom_directory = "C:\\wireless\\";
my $custom_filename = "wireless_DHCP.csv";
my $data = "$ip $mask $rtr $active $used $percent $nus $construct";
my $path = "$custom_directory\\$custom_filename";
open( my $handle, ">>", $path ) || die "can't open $path: $!";
binmode($handle); # for raw; else set the encoding
print $handle "$data\n";
close($handle) || die "can't close $path: $!";
}
}
I believe the problem is because you're on Windows, but then saving the file using :raw, and then reopening it using :crlf.
open( FILE, '>', $save . 'DHCP_Utilization_test.csv' ) or die $!;
binmode FILE;
print FILE $file;
close(FILE);
open( F, "C:\\wireless\\DHCP_Utilization_test.csv" ) || die "can't opern file: $!";
my #file = <F>;
close(F);
I therefore suspect that your #file array only contains one line for the entire file.
You can probably also tighten your code to something like the following:
#!\Perl64\eg\perl
use strict;
use warnings;
use autodie;
use LWP::Simple;
my $url = 'http://dhcp_server.test.com/cgi-bin/dhcp_utilization_csv_region.pl?region=test';
my $datafile = "C:\\wireless\\DHCP_Utilization_test.csv";
my $wireless = "C:\\wireless\\wireless_DHCP.csv";
getstore( $url, $datafile );
open my $infh, '<', $datafile;
open my $outfh, '>>', $wireless;
while (<$infh>) {
chomp;
next unless /Wireless /;
my ($ip, $rtr, $mask, $zip, $blc, $address, $city,
$state, $space, $country, $space2, $noc, $company, $extra,
$active, $used, $percent, $extra3, $nus, $construct
) = split /,/;
print $outfh "$ip $mask $rtr $active $used $percent $nus $construct\n";
}

How to read file in Perl and if it doesn't exist create it?

In Perl, I know this method :
open( my $in, "<", "inputs.txt" );
reads a file but it only does so if the file exists.
Doing the other way, the one with the +:
open( my $in, "+>", "inputs.txt" );
writes a file/truncates if it exists so I don't get the chance to read the file and store it in the program.
How do I read files in Perl considering if the file exists or not?
Okay, I've edited my code but still the file isn't being read. The problem is it doesn't enter the loop. Anything mischievous with my code?
open( my $in, "+>>", "inputs.txt" ) or die "Can't open inputs.txt : $!\n";
while (<$in>) {
print "Here!";
my #subjects = ();
my %information = ();
$information{"name"} = $_;
$information{"studNum"} = <$in>;
$information{"cNum"} = <$in>;
$information{"emailAdd"} = <$in>;
$information{"gwa"} = <$in>;
$information{"subjNum"} = <$in>;
for ( $i = 0; $i < $information{"subjNum"}; $i++ ) {
my %subject = ();
$subject{"courseNum"} = <$in>;
$subject{"courseUnt"} = <$in>;
$subject{"courseGrd"} = <$in>;
push #subjects, \%subject;
}
$information{"subj"} = \#subjects;
push #students, \%information;
}
print "FILE LOADED.\n";
close $in or die "Can't close inputs.txt : $!\n";
Use the proper test file operator:
use strict;
use warnings;
use autodie;
my $filename = 'inputs.txt';
unless(-e $filename) {
#Create the file if it doesn't exist
open my $fc, ">", $filename;
close $fc;
}
# Work with the file
open my $fh, "<", $filename;
while( my $line = <$fh> ) {
#...
}
close $fh;
But if the file is new (without contents), the while loop won't be processed. It's easier to read the file only if the test is fine:
if(-e $filename) {
# Work with the file
open my $fh, "<", $filename;
while( my $line = <$fh> ) {
#...
}
close $fh;
}
You can use +>> for read/append, creates the file if it doesn't exist but doesn't truncate it:
open(my $in,"+>>","inputs.txt");
First check whether the file exists or not. Check the sample code below :
#!/usr/bin/perl
use strict;
use warnings;
my $InputFile = $ARGV[0];
if ( -e $InputFile ) {
print "File Exists!";
open FH, "<$InputFile";
my #Content = <FH>;
open OUT, ">outfile.txt";
print OUT #Content;
close(FH);
close(OUT);
} else {
print "File Do not exists!! Create a new file";
open OUT, ">$InputFile";
print OUT "Hello World";
close(OUT);
}

perl + read multiple csv files + manipulate files + provide output_files

Apologies if this is a bit long winded, bu i really appreciate an answer here as i am having difficulty getting this to work.
Building on from this question here, i have this script that works on a csv file(orig.csv) and provides a csv file that i want(format.csv). What I want is to make this more generic and accept any number of '.csv' files and provide a 'output_csv' for each inputed file. Can anyone help?
#!/usr/bin/perl
use strict;
use warnings;
open my $orig_fh, '<', 'orig.csv' or die $!;
open my $format_fh, '>', 'format.csv' or die $!;
print $format_fh scalar <$orig_fh>; # Copy header line
my %data;
my #labels;
while (<$orig_fh>) {
chomp;
my #fields = split /,/, $_, -1;
my ($label, $max_val) = #fields[1,12];
if ( exists $data{$label} ) {
my $prev_max_val = $data{$label}[12] || 0;
$data{$label} = \#fields if $max_val and $max_val > $prev_max_val;
}
else {
$data{$label} = \#fields;
push #labels, $label;
}
}
for my $label (#labels) {
print $format_fh join(',', #{ $data{$label} }), "\n";
}
i was hoping to use this script from here but am having great difficulty putting the 2 together:
#!/usr/bin/perl
use strict;
use warnings;
#If you want to open a new output file for every input file
#Do it in your loop, not here.
#my $outfile = "KAC.pdb";
#open( my $fh, '>>', $outfile );
opendir( DIR, "/data/tmp" ) or die "$!";
my #files = readdir(DIR);
closedir DIR;
foreach my $file (#files) {
open( FH, "/data/tmp/$file" ) or die "$!";
my $outfile = "output_$file"; #Add a prefix (anything, doesn't have to say 'output')
open(my $fh, '>', $outfile);
while (<FH>) {
my ($line) = $_;
chomp($line);
if ( $line =~ m/KAC 50/ ) {
print $fh $_;
}
}
close($fh);
}
the script reads all the files in the directory and finds the line with this string 'KAC 50' and then appends that line to an output_$file for that inputfile. so there will be 1 output_$file for every inputfile that is read
issues with this script that I have noted and was looking to fix:
- it reads the '.' and '..' files in the directory and produces a
'output_.' and 'output_..' file
- it will also do the same with this script file.
I was also trying to make it dynamic by getting this script to work in any directory it is run in by adding this code:
use Cwd qw();
my $path = Cwd::cwd();
print "$path\n";
and
opendir( DIR, $path ) or die "$!"; # open the current directory
open( FH, "$path/$file" ) or die "$!"; #open the file
**EDIT::I have tried combining the versions but am getting errors.Advise greatly appreciated*
UserName#wabcl13 ~/Perl
$ perl formatfile_QforStackOverflow.pl
Parentheses missing around "my" list at formatfile_QforStackOverflow.pl line 13.
source dir -> /home/UserName/Perl
Can't use string ("/home/UserName/Perl/format_or"...) as a symbol ref while "strict refs" in use at formatfile_QforStackOverflow.pl line 28.
combined code::
use strict;
use warnings;
use autodie; # this is used for the multiple files part...
#START::Getting current working directory
use Cwd qw();
my $source_dir = Cwd::cwd();
#END::Getting current working directory
print "source dir -> $source_dir\n";
my $output_prefix = 'format_';
opendir my $dh, $source_dir; #Changing this to work on current directory; changing back
for my $file (readdir($dh)) {
next if $file !~ /\.csv$/;
next if $file =~ /^\Q$output_prefix\E/;
my $orig_file = "$source_dir/$file";
my $format_file = "$source_dir/$output_prefix$file";
# .... old processing code here ...
## Start:: This part works on one file edited for this script ##
#open my $orig_fh, '<', 'orig.csv' or die $!; #line 14 and 15 above already do this!!
#open my $format_fh, '>', 'format.csv' or die $!;
#print $format_fh scalar <$orig_fh>; # Copy header line #orig needs changeing
print $format_file scalar <$orig_file>; # Copy header line
my %data;
my #labels;
#while (<$orig_fh>) { #orig needs changing
while (<$orig_file>) {
chomp;
my #fields = split /,/, $_, -1;
my ($label, $max_val) = #fields[1,12];
if ( exists $data{$label} ) {
my $prev_max_val = $data{$label}[12] || 0;
$data{$label} = \#fields if $max_val and $max_val > $prev_max_val;
}
else {
$data{$label} = \#fields;
push #labels, $label;
}
}
for my $label (#labels) {
#print $format_fh join(',', #{ $data{$label} }), "\n"; #orig needs changing
print $format_file join(',', #{ $data{$label} }), "\n";
}
## END:: This part works on one file edited for this script ##
}
How do you plan on inputting the list of files to process and their preferred output destination? Maybe just have a fixed directory that you want to process all the cvs files, and prefix the result.
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
my $source_dir = '/some/dir/with/cvs/files';
my $output_prefix = 'format_';
opendir my $dh, $source_dir;
for my $file (readdir($dh)) {
next if $file !~ /\.csv$/;
next if $file =~ /^\Q$output_prefix\E/;
my $orig_file = "$source_dir/$file";
my $format_file = "$source_dir/$output_prefix$file";
.... old processing code here ...
}
Alternatively, you could just have an output directory instead of prefixing the files. Either way, this should get you on your way.

Perl text::csv to check then add field

I am reading an csv file and checking for text within a certain field.
If text exists I would like to keep the line and also add a field.
The following while loop in a program works fine, to save lines with text in col[17]:
while (my $row = $csv->getline($infh)) {
if ($row->[16] ne ""){
my $string = $csv->string;
print {$outfh} $string;
}
}
In another test program the following while loop also works fine, to add a field:
while (my $row = $csv->getline($infh)) {
splice #$row, 1, 0, "1GM";
$csv->print($outfh, $row);
}
I would like to combine these to work in the same program but I am having syntax nitemares.
Can anyone show some pointers to a perl novice?
Here is the full "test program":
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
use Cwd;
use File::Basename;
my $dirname = basename(getcwd);
my $input_file = $dirname . ".txt";
my $output_file = $dirname . '_extract.txt';
my $csv = Text::CSV->new({binary => 1, eol=> "\015\012"})
or die "Cannot use CSV: ".Text::CSV->error_diag ();
open my $infh, $input_file or die "Cannot open '$input_file': $!";
open my $outfh, '>>', $output_file or die "Cannot open '$output_file': $!";
while (my $row = $csv->getline($infh)) {
splice #$row, 1, 0, "1GM";
$csv->print($outfh, $row);
}
close $infh;
close $outfh;
$csv->eof or die "Processing of '$input_file' terminated prematurely";
Added:
I would like the test program to also include the test for the text, somehow using;
if ($row->[16] ne "")
From the other program - So that the extra column is only added to the rows I want to keep.
Cannot figure out how to do this...
Added 2/18: This Worked!, but does not seem very elegant...
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
use Cwd;
use File::Basename;
my $dirname = basename(getcwd);
my $input_file = $dirname . ".txt";
my $output_file = $dirname . '_extract.txt';
my $column = '1GM'; # Text for new column to insert if a match
my $csv = Text::CSV->new({binary => 1, eol=> "\015\012"})
or die "Cannot use CSV: ".Text::CSV->error_diag ();
open my $infh, $input_file or die "Cannot open '$input_file': $!";
open my $outfh, '>>', $output_file or die "Cannot open '$output_file': $!";
while (my $row = $csv->getline($infh)) {
if ($row->[16] ne "") # add/insert column if col[17] has text
{
splice #$row, 1, 0, "$column";
if ($row->[1] eq "$column") # print to file only rows with new column
{
$csv->print($outfh, $row);
}
}
}
close $infh;
close $outfh;
$csv->eof or die "Processing of '$input_file' terminated prematurely";
Added 2/18 5:40pm - Got it..This works!!
while (my $row = $csv->getline($infh)) {
splice #$row, 1, 0, "$column"; # add column
if ($row->[17] ne "") # print if test is true
{
$csv->print($outfh, $row);
}
}