I am trying to copy the content of three separate .vect files into one. I want to do this for all 5,000 files in the $fromdir directory.
When I run this program it generates just a single modified .vect file in the output directory. If I include the close(DATA) calls after individual while loops inside the foreach loop, I get the same behavior: a single output file in the output directory instead of the wanted 5,000 files.
I have done some reading, and at first thought I may not be opening the files. But if I print($vectfile) in the foreach loop every file name in the directory is printed.
My second thought was that it was how I was closing the files, but I get the same behavior whether I close the file handles inside or outside the foreach loop.
My final thought was maybe I don't have write permission to the file or directory, but I don't know how to change this.
How can I get this loop to run all 5,000 times and not just once?
use strict;
use warnings;
use feature qw(say);
my $dir = "D:\\Downloads";
# And M3.1 and P3.1
my $subfolder = "A0.1";
my $fromdir = $dir . "\\" . $subfolder;
my @files = <$fromdir/*vect>;
# Top of file
my $readfiletop = "C:\\Users\\Owner\\Documents\\MoreKnotVis\\ScriptsForAdditionalDataSets\\VectFileHeader.vect";
# Bottom of file
my $readfilebottom = "C:\\Users\\Owner\\Documents\\MoreKnotVis\\ScriptsForAdditionalDataSets\\VectFileCloser.vect";
foreach my $vectfile ( @files ) {
say("$vectfile");
my $count = 0;
my $readfilebody = $vectfile;
my $out_file = "D:\\Downloads\\ColorsA0.1\\" . "$count" . ".vect";
$count++;
# open top part of each file
open(DATA1, "<", $readfiletop) or die "Can't open '$readfiletop': $!";
# open bottom part of each file
open(DATA3, "<", $readfilebottom) or die "Can't open '$readfilebottom': $!";
# open a file to read
open(DATA2, "<", $vectfile) or die "Can't open '$vectfile': $!";
# open a file to write to
open(DATA4, ">" ,$out_file) or die "Can't open '$out_file': $!";
# Copy data from VectFileTop file to another.
while ( <DATA1> ) {
print DATA4 $_;
}
# Copy the data from VectFileBody to another.
while ( <DATA2> ) {
print DATA4 $_, $_ if 8..12;
}
# Copy the data from VectFileBottom to another.
while ( <DATA3> ) {
print DATA4 $_;
}
}
close( DATA1 );
close( DATA2 );
close( DATA3 );
close( DATA4 );
print("quit\n");
You construct the output file name with $count in it.
But note what you do with this variable:
inside the loop you set it to 0,
the output file name is constructed with 0 in it,
then you increment it, but this has no effect, because the variable
is set to 0 again at the start of the next iteration.
The effect is that:
the loop executes the required number of times,
but the output file name contains 0 as the "number" every time,
so you keep overwriting the same file with new content.
Move the my $count = 0; statement before the loop and everything
should be OK.
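A minimal sketch of that fix, reduced to the file-name logic only. The three input names are made up and stand in for the real glob result:

```perl
use strict;
use warnings;

# Hypothetical input names standing in for the glob result.
my @files = ('a.vect', 'b.vect', 'c.vect');

my @out_names;
my $count = 0;                       # declared once, before the loop
foreach my $vectfile (@files) {
    push @out_names, "$count.vect";  # a distinct name on every pass
    $count++;                        # the increment now survives to the next pass
}

print "$_\n" for @out_names;
```

With the declaration outside the loop, each pass sees the value left by the previous one, so no output name is reused.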
You seem to be clinging to a specific form of code in fear of everything falling apart if you change a single thing. I recommend that you dare to stray a little more from the formula so that the code is more concise and readable
The problem is that you reset your $count to zero before processing each input file, so all the output files have the same name and overwrite one another. The remaining output file contains only the data from the last input file
Here's a refactoring of your code. I can't guarantee that it will run correctly but it looks right and does compile
I've added use autodie to avoid having to check the status of every IO operation
I've used the same lexical file handle $fh for all the input files. Opening another file on a file handle that is already open will close it first, and a lexical file handle will be closed by perl when it goes out of scope at the end of the block
I've used a while loop to iterate over the input file names instead of reading the whole list into an array, which unnecessarily uses an additional variable @files and wastes space
I've used forward slashes instead of backslashes in all the file paths. This is fine in library calls on Windows: it is only a problem if they appear in command line input
I hope you'll agree that this form is more readable. I think you would have stood a much better chance of finding the problem if your code were in this form
use strict;
use warnings;
use autodie;
use feature qw/ say /;
my $indir = 'D:/Downloads';
my $subdir = 'A0.1'; # And M3.1 and P3.1
my $extrasdir = 'C:/Users/Owner/Documents/MoreKnotVis/ScriptsForAdditionalDataSets';
my $outdir = "$indir/Colors$subdir";
my $topfile = "$extrasdir/VectFileHeader.vect";
my $bottomfile = "$extrasdir/VectFileCloser.vect";
my $filenum;
while ( my $vectfile = glob "$indir/$subdir/*.vect" ) {
say qq/Processing "$vectfile"/;
$filenum++;
open my $outfh, '>', "$outdir/$filenum.vect";
my $fh;
open $fh, '<', $topfile;
print { $outfh } $_ while <$fh>;
close $fh; # an explicit close resets $. so that 8..12 counts lines of $vectfile only
open $fh, '<', $vectfile;
while ( <$fh> ) {
print { $outfh } $_, $_ if 8..12;
}
open $fh, '<', $bottomfile;
print { $outfh } $_ while <$fh>;
}
say 'DONE';
I am new to Perl. I have a directory structure. In each directory, I have a log file. I want to grep a pattern from that file and do post-processing. Right now I am grepping the pattern from those files using Unix grep, putting the results into a text file, and reading that text file to do the post-processing, but I want to automate the task of reading each file and grepping the pattern from it. In the code below, mdp_cgdis_1102.txt holds the pattern grepped from the directories. I would really appreciate any help.
#!usr/bin/perl
use strict;
use warnings;
open FILE, 'mdp_cgdis_1102.txt' or die "Cannot open file $!";
my @array = <FILE>;
my @arr;
my @brr;
foreach my $i (@array){
@arr = split (/\//, $i);
@brr = split (/\:/, $i);
print " $arr[0] --- $brr[2]";
}
It is unclear to me which part of the process needs automating. I'll go by "want to automate reading each file and grepping pattern from that file," whereby you presumably already have a list of files. If you actually need to build the file list as well see the added code below.
One way: pull all patterns from each file and store that in a hash (filename => arrayref-with-patterns)
my %file_pattern;
foreach my $file (@filelist) {
open my $fh, '<', $file or die "Can't open $file: $!";
$file_pattern{$file} = [ grep { /$pattern/ } <$fh> ];
close $fh;
}
The [ ] takes a reference to the list returned by grep, ie. constructs an "anonymous array", and that (reference) is assigned as a value to the $file key.
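As a small illustration of the anonymous-array construction, here is a sketch that reads from an in-memory string instead of a real log file. The pattern and the log contents are invented for the example:

```perl
use strict;
use warnings;

my $pattern = qr/ERROR/;                                  # hypothetical pattern
my $log = "ok line\nERROR one\nok again\nERROR two\n";    # stand-in log contents

# Open the string as if it were a file.
open my $fh, '<', \$log or die "Can't open in-memory file: $!";
my $matches = [ grep { /$pattern/ } <$fh> ];   # reference to an anonymous array
close $fh;

print scalar(@$matches), " matching lines\n";
print $matches->[0];                           # first matching line
```

The reference is dereferenced with @$matches for the whole list or $matches->[N] for a single element.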
Now you can process your patterns, per log file
foreach my $filename (sort keys %file_pattern) {
print "Processing log $filename.\n";
my @patterns = @{$file_pattern{$filename}};
# Process the list of patterns in this log file
}
ADDED
In order to build the list of files @filelist used above from a known list of directories, use the core File::Find module, which recursively scans the supplied directories and applies the supplied subroutines
use File::Find;
find( { wanted => \&process_logs, preprocess => \&select_logs }, @dir_list);
Your subroutine process_logs() is applied to each file/directory that passed preprocessing by the second sub, with its name available as $File::Find::name, and in it you can either populate the hash with patterns-per-log as shown above, or run complete processing as needed.
Your subroutine select_logs() contains code to filter the log files out of all the files in each directory that File::Find would normally process, so that process_logs() only gets the log files.
Another way would be to use the other invocation
find(\&process_all, @dir_list);
where now the sub process_all() is applied to all entries (files and directories) found and thus this sub itself needs to ensure that it only processes the log files. See linked documentation.
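A sketch of that second invocation, under the assumption that log files can be recognised by a .log extension. The temporary directory tree and its file names are invented so that the example is self-contained:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical layout: a throwaway tree with one log file per directory.
my $root = tempdir( CLEANUP => 1 );
for my $sub (qw(run1 run2)) {
    mkdir "$root/$sub" or die "Can't create $root/$sub: $!";
    for my $name (qw(out.log notes.txt)) {
        open my $fh, '>', "$root/$sub/$name" or die "Can't write $name: $!";
        print $fh "sample\n";
        close $fh;
    }
}

my @filelist;

# Called for every entry; $_ is the bare name, $File::Find::name the full path.
sub process_all {
    return unless -f $_ && /\.log\z/;    # skip directories and non-log files
    push @filelist, $File::Find::name;
}

find( \&process_all, $root );
@filelist = sort @filelist;
print "$_\n" for @filelist;
```

Here the sub only collects names; it could just as well open each file and process it on the spot.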
The equivalent of
find ... -name '*.txt' -type f -exec grep ... {} +
is
use File::Find::Rule qw( );
my $base_dir_qfn = ...;
my $re = qr/.../;
my @log_qfns =
File::Find::Rule
->name(qr/\.txt\z/)
->file
->in($base_dir_qfn);
my $success = 1;
for my $log_qfn (@log_qfns) {
open(my $fh, '<', $log_qfn)
or do {
$success = 0;
warn("Can't open log file \"$log_qfn\": $!\n");
next;
};
while (<$fh>) {
print if /$re/;
}
}
exit(1) if !$success;
Use File::Find to traverse the directory.
In a loop go through all the logfiles:
Open the file
read it line by line
For each line, do a regular expression match (if ($line =~ /pattern/)) or use if (index($line, $searchterm) >= 0) if you are looking for a certain static string.
If you find a match, print the line.
close the file
I hope that gives you enough pointers to get started. You will learn more if you find out how to do each of these steps in Perl by yourself (I pointed out the hard ones).
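The per-line matching step can be sketched as follows. It works on an in-memory string rather than real log files, and the two helper subs and the sample text are invented for the illustration:

```perl
use strict;
use warnings;

# Return the lines of $text that contain $searchterm as a literal string.
sub lines_with_string {
    my ($text, $searchterm) = @_;
    return grep { index($_, $searchterm) >= 0 } split /^/, $text;
}

# Return the lines of $text that match the regular expression $re.
sub lines_matching {
    my ($text, $re) = @_;
    return grep { $_ =~ $re } split /^/, $text;
}

my $log = "v1.2 started\nv1x2 started\nshutdown\n";    # stand-in log contents

print "regex:  $_" for lines_matching($log, qr/v1.2/);    # '.' matches any character
print "static: $_" for lines_with_string($log, 'v1.2');   # '.' is taken literally
```

The difference matters when the search term contains regex metacharacters: index compares the characters as they are, while an unescaped pattern may match more than intended.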
Below I've provided just a chunk of a huge Perl script I am trying to write. I am getting syntax errors in the else statement, but the console window only says "syntax error" for the Perl script and does not clearly identify the error. I am trying to create a variable file file_no_$i.txt and copy the contents of t_code.txt into it, and then find and replace a string in the variable file with some selected keys of the hash %defines_2
open ( my $pointer, "<", "t_code.txt" ) or die $!;
my $out_pointer;
for (my $i=0 ; $i <=$#match ; $i++) {
for (my $j=0; $j <= $#match ; $j++) {
if ($match[$i]=~$match[$j]) {
next;
}
else {
my $file_name = "file_no_$i.txt";
open $out_pointer, ">" , $file_name or die "Can't open the output file!";
copy("$file_name","t_code.txt") or die "Copy failed: $!";
my @lin = <$out_pointer>;
foreach $_(@lin) {
$_ =~ s/UART90_BASE_ADDRESS/$defines_2{ $_ = grep{/$match[$i]/} (keys %defines_2)};
}
}
}
}
You cannot use / unquoted inside a s/// construct. Instead of backslashes, you can use different delimiters:
s#UART90_BASE_ADDRESS#$defines_2{ $_ = grep{/$match[$i]/} (keys %defines_2)}#;
It fixes the syntax error, but I fear it still won't do what you want. Without data, it's hard to test, though.
What I think you're doing is editing a number of text files whose names look like file_no_1.txt etc. You're doing that by copying the current file to t_code.txt and then reading that file line by line, editing as required, and writing the lines back to the original text file.
The problem with that approach is that the file will be copied and rewritten many times, and it would be better to read the whole file into an array, make all the edits, and then write them back in one operation. That would be fine unless the file is enormous — say, several GB.
Here's some code that implements that approach. You see that $file_name is defined and @lines is filled outside the inner loop. The innermost loop modifies the elements of @lines and, outside that loop again, @lines is written back to the original text file.
I couldn't fathom a couple of things about your code.
I'm not sure if you should be using =~ or if you intended a simple eq. The former does a contains test, and you had a problem in the past where you meant to check that the first string had the second at the end
The grep call
grep{/$match[$i]/} (keys %defines_2)
worries me, as it can potentially return more than one key of the %defines_2 hash, in which case your own code will insert what is pretty much a random selection from the hash elements
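To make that concern concrete, here is a small sketch with invented hash contents in which two keys contain the search string. The key names and the anchored pattern are assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical data: two keys contain the substring being searched for.
my %defines_2 = (
    UART0_BASE => '0x1000',
    UART0_IRQ  => '17',
    SPI0_BASE  => '0x2000',
);
my $needle = 'UART0';

my @hits = grep { /$needle/ } keys %defines_2;
print scalar(@hits), " keys matched\n";

# Sorting makes the choice repeatable; anchoring narrows it to one key.
my ($first) = sort @hits;
my ($exact) = grep { /\A\Q$needle\E_BASE\z/ } keys %defines_2;
print "first in sort order: $first\n";
print "anchored match: $exact\n";
```

Because keys returns the hash keys in no particular order, taking the first element of an unsorted grep result can pick a different key from one run to the next.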
If your code is working then that's fine, but if not then I hope this helps you fix it. If you need more help on this chunk of code then you should include a small sample of the data so that we can better understand what is going on.
for my $i (0 .. $#match) {
my $file_name = "file_no_$i.txt";
my @lines = do {
open my $in_fh, '<', 't_code.txt' or die $!;
<$in_fh>;
};
for my $j (0 .. $#match) {
next if $match[$i] =~ $match[$j];
for ( @lines ) {
my ($match) = grep { /$match[$i]/ } keys %defines_2;
s/UART90_BASE_ADDRESS/$defines_2{$match}/;
}
}
open my $out_fh, '>', $file_name or die qq{Can't open "$file_name" for output: $!};
print $out_fh $_ for @lines;
close $out_fh or die qq{Failed to close output file "$file_name": $!};
}
I have two files
first:
8237764738;00:78:9E:EE:CA:6F;FTTH;MULTI
8237764738;2C:39:96:52:47:82;FTTH;MULTI
0415535921;E8:BE:81:86:F1:6F;FTTH;MULTI
0415535921;2C:39:96:5B:12:C6;EZ;SINGLE
...etc
second:
00:78:9E:EE:CA:6F;2013/10/28 13:37:50
E8:BE:81:86:F1:6F;2013/11/05 13:38:30
00:78:9E:EC:4A:B0;2013/10/28 13:59:16
2C:E4:12:AA:F7:95;2013/10/31 13:57:55
...etc
and I have to take mac_address (second position) from the first file and find it in the second one
and append (if match) to first file the date at end from the second file.
output:
8237764738;00:78:9E:EE:CA:6F;FTTH;MULTI;2013/10/28 13:37:50
0415535921;E8:BE:81:86:F1:6F;FTTH;MULTI;2013/11/05 13:38:30
I wrote a simple script to find the mac_address
but I don't know how to put in the script to add the date.
my %iptv;
my @result;
open IN, "/home/terminals.csv";
while (<IN>) {
chomp;
@wynik = split(/;/,$_);
$iptv{$result[1]} = $result[0];
}
close IN;
open IN, "/home/reboots.csv";
open OUT, ">/home/out.csv";
while (<IN>) {
chomp;
my ($mac, $date) = split(/;/,$_);
if (defined $iptv{$mac})
{
print OUT "$date,$mac \n";
}
}
close IN;
close OUT;
Assuming that the first file lists each MAC number once and that you want an output line for each time the MAC appears in the second file, then:
#!/usr/bin/env perl
use strict;
use warnings;
die "Usage: $0 terminals reboots\n" unless scalar(@ARGV) == 2;
my %iptv;
open my $in1, '<', $ARGV[0] or die "Failed to open file $ARGV[0] for reading";
while (<$in1>)
{
chomp;
my @result = split(/;/, $_); # Fix array used here
$iptv{$result[1]} = $_; # Fix what's stored here
}
close $in1;
open my $in2, '<', $ARGV[1] or die "Failed to open file $ARGV[1] for reading";
while (<$in2>)
{
chomp;
my ($mac, $date) = split(/;/,$_);
print "$iptv{$mac};$date\n" if (defined $iptv{$mac});
}
close $in2;
This uses two file names on the command line and writes to standard output; it is a more general purpose program than your original. It also gets me around the problem that I don't have a /home directory.
For your sample inputs, the output is:
8237764738;00:78:9E:EE:CA:6F;FTTH;MULTI;2013/10/28 13:37:50
0415535921;E8:BE:81:86:F1:6F;FTTH;MULTI;2013/11/05 13:38:30
You were actually fairly close to this, but were making some silly little mistakes.
In your code, you either aren't showing everything or you aren't using:
use strict;
use warnings;
Perl experts use both to make sure they don't make silly mistakes; beginners should do so too. It would have pointed out that @wynik was not declared with my and was assigned to but not used, for example. You could have meant to write @result = split...;. You were not saving the correct data; you were not writing out the information from the $iptv{$mac} that you needed to.
I'm using this code I found online to read a properties file in my Perl script:
open (CONFIG, "myfile.properties");
while (<CONFIG>){
chomp; #no new line
s/#.*//; #no comments
s/^\s+//; #no leading white space
s/\s+$//; #no trailing white space
next unless length;
my ($var, $value) = split (/\s* = \s*/, $_, 2);
$$var = $value;
}
Is it possible to also write to the text file inside this while loop? Let's say the text file looks like this:
#Some comments
a_variale = 5
a_path = /home/user/path
write_to_this_variable = ""
How can I put some text in write_to_this_variable?
It is not really practical to overwrite text files where you have variable length records (lines). It is normal to copy the file, something like this:
my $filename = 'myfile.properties';
open(my $in, '<', $filename) or die "Unable to open '$filename' for read: $!";
my $newfile = "$filename.new";
open(my $out, '>', $newfile) or die "Unable to open '$newfile' for write: $!";
while (<$in>) {
s/(write_to_this_variable =) ""/$1 "some text"/;
print $out $_;
}
close $in;
close $out;
rename $newfile,$filename or die "unable to rename '$newfile' to '$filename': $!";
You might have to sanitise the text you are writing with something like \Q if it contains non-alphanumerics.
This is an example of a program that uses the Config::Std module to read and write a simple config file like yours. As far as I know it is the only module that will preserve any comments in the original file.
There are two points to note:
The first hash key in $props{''}{write_to_this_variable} forms the name of the config file section that will contain the value. If there are no sections, as for your file, then you must use an empty string here
If you need quotes around a value then you must add these explicitly when you are assigning to the hash element, as I do here with '"Some text"'
I think the rest of the program is self-explanatory.
use strict;
use warnings;
use Config::Std { def_sep => ' = ' };
my %props;
read_config 'myfile.properties', %props;
$props{''}{write_to_this_variable} = '"Some text"';
write_config %props;
output
#Some comments
a_variale = 5
a_path = /home/user/path
write_to_this_variable = "Some text"
I'm trying to wrap my head around IPC::Run to be able to do the following. For a list of files:
my @list = ('/my/file1.gz','/my/file2.gz','/my/file3.gz');
I want to execute a program that has built-in decompression, does some editing and filtering to them, and prints to stdout, giving some stats to stderr:
~/myprogram options $file
I want to append the stdout of the execution for all the files in the list to one single $out file, and be able to parse and store a couple of lines in each stderr as variables, while letting the rest be written out into separate fileN.log files for each input file.
I want stdout to all go into a ">>$all_into_one_single_out_file", it's the err that I want to keep in different logs.
After reading the manual, I've gone so far as to the code below, where the commented part I don't know how to do:
for $file in @list {
my @cmd;
push @cmd, "~/myprogram options $file";
IPC::Run::run \@cmd, \undef, ">>$out",
sub {
my $foo .= $_[0];
#check if I want to keep my line, save value to $mylog1 or $mylog2
#let $foo and all the other lines be written into $file.log
};
}
Any ideas?
First things first. my $foo .= $_[0] is not necessary. $foo is a new (empty) value, so appending to it via .= doesn't do anything. What you really want is a simple my ($foo) = @_;.
Next, you want to have output go to one specific file for each command while also (depending on some conditional) putting that same output to a common file.
Perl (among other languages) has a great facility to help in problems like this, and it is called closure. Whichever variables are in scope at the time of a subroutine definition, those variables are available for you to use.
use strict;
use warnings;
use IPC::Run qw(run new_chunker);
my @list = qw( /my/file1 /my/file2 /my/file3 );
open my $shared_fh, '>', '/my/all-stdout-goes-here' or die;
open my $log1_fh, '>', '/my/log1' or die "Cannot open /my/log1: $!\n";
open my $log2_fh, '>', '/my/log2' or die "Cannot open /my/log2: $!\n";
foreach my $file ( @list ) {
my @cmd = ( "~/myprogram", option1, option2, ..., $file );
open my $log_fh, '>', "$file.log"
or die "Cannot open $file.log: $!\n";
run \@cmd, '>', $shared_fh,
'2>', new_chunker, sub {
# $out contains each line of stderr from the command
my ($out) = @_;
if ( $out =~ /something interesting/ ) {
print $log1_fh $out;
}
if ( $out =~ /something else interesting/ ) {
print $log2_fh $out;
}
print $log_fh $out;
return 1;
};
}
Each of the output file handles will get closed when they're no longer referenced by anything -- in this case at the end of this snippet.
I fixed your @cmd, though I don't know what your option1, option2, ... will be.
I also changed the way you are calling run. You can call it with a simple > to tell it the next thing is for output, and the new_chunker (from IPC::Run) will break your output into one-line-at-a-time instead of getting all the output all-at-once.
I also skipped over the fact that you're outputting to .gz files. If you want to write to compressed files, instead of opening as:
open my $fh, '>', $file or die "Cannot open $file: $!\n";
Just open up:
open my $fh, '|-', "gzip -c > $file" or die "Cannot startup gzip: $!\n";
Be careful here, as this is a good place for command injection (e.g. let $file be /dev/null; /sbin/reboot). How to handle this is covered in many, many other places and is beyond the scope of what you're actually asking.
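One way to sidestep the shell entirely, offered as a sketch and not as the only answer, is to write the compressed file with the core IO::Compress::Gzip module instead of piping to gzip. The output file name and the two sample lines are made up for the example:

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw($GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/demo.log.gz";      # hypothetical output name

# No shell is involved, so odd characters in $file cannot run commands.
my $gz = IO::Compress::Gzip->new($file)
    or die "Cannot open $file: $GzipError\n";
$gz->print("line one\n");
$gz->print("line two\n");
$gz->close;

# Read the file back to check the round trip.
my $text;
gunzip $file => \$text or die "gunzip failed: $GunzipError\n";
print $text;
```

The object returned by IO::Compress::Gzip->new can be used much like an ordinary output file handle, so the rest of the logging code needs little change.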
EDIT: re-read problem a bit more, and changed answer to more closely reflect the actual problem.
EDIT2: Updated per your comment. All stdout goes to one file, and the stderr from the command is fed to the inline subroutine. Also fixed a stupid typo (the for syntax was pseudocode, not Perl).