How to print the output to a new file in Perl

How can I print the output of the variable $newFile to a file in a directory? How can I use 'cp' to do that?
After modifications, my code looks like this:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use File::Copy 'cp';
# binmode(STDOUT, ":utf8") ;
# warn Dumper \@repertoire;
my @rep = glob('/home/test/Bureau/Perl/Test/*'); # output to copy in this dir
foreach my $file (@rep)
{
open(IN, $file) or die "Can't read file '$file' [$!]\n";
while (<IN>)
{
my ($firstCol, $secondCol) = split(/","/, $_);
$firstCol =~ s/http:\/\//_/g;
$secondCol =~ s/\(.+\)/ /ig;
my $LCsecondCol = lc($secondCol);
chomp($secondCol);
chomp($LCsecondCol);
my $newFile = "$firstCol:($secondCol|$LCsecondCol);";
$newFile =~ s/=//g;
print "$newFile\n";
}
close(IN);
}

Your program is a long way off even compiling. You should pay attention to these details:
With use strict in place, as it should be, you must declare all of your variables at their point of first use. The variables @files, $file, and $newFile are undeclared, so your program won't compile.
glob in scalar context returns the next file name that matches the pattern, and is meant for use in a while loop. To get all of the files that match the pattern you should assign to an array, and from the commented-out warn statement it looks like your code used to be that way
You should use lexical file handles and the three-parameter form of open. Well done for checking the status of the open and putting $! in your die string
Your $file =~ ... line looks like it should be a substitution, and the parenthesis at the end should be a semicolon
You have loaded File::Copy but then use system to copy your files. You should avoid shelling out wherever possible, and since File::Copy provides a cp function you should use it.
Something closer to a working version of your code would look like this
use strict;
use warnings;
use File::Copy 'cp';
while (my $fileName = glob '/home/test/Bureau/Infobox/*.csv') {
my @files = do {
open my $in, '<', $fileName or die "Can't read file '$fileName' [$!]\n";
print "$fileName\n" ;
<$in>;
};
foreach my $file (@files) {
my $newFile = $file =~ s/(\x{0625}\x{0646}\b.+?)\./[[ ]]/gr;
cp $file, $newFile;
}
}
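To address the original question of getting the transformed lines into a file inside a directory, a minimal sketch follows. It assumes one output file per input file; the helper names transform_line and write_transformed and the '.out' suffix are hypothetical, and the substitutions are copied from the question's code rather than verified against real data.

```perl
use strict;
use warnings;
use File::Basename 'basename';
use File::Spec;

# Hypothetical helper: build the "first:(second|lowercased second);" string
# from one line, using the same substitutions as the question's code.
sub transform_line {
    my ($line) = @_;
    chomp $line;
    my ($first, $second) = split /","/, $line;
    return unless defined $second;      # skip lines without two columns
    $first  =~ s{http://}{_}g;
    $second =~ s/\(.+\)/ /g;
    my $new = "$first:($second|" . lc($second) . ");";
    $new =~ s/=//g;
    return $new;
}

# Hypothetical helper: write the transformed lines of $in_file to a new
# file in $out_dir and return the path of that new file.
sub write_transformed {
    my ($in_file, $out_dir) = @_;
    my $out_file = File::Spec->catfile($out_dir, basename($in_file) . '.out');
    open my $in,  '<', $in_file  or die "Can't read '$in_file': $!\n";
    open my $out, '>', $out_file or die "Can't write '$out_file': $!\n";
    while (my $line = <$in>) {
        my $new = transform_line($line);
        print {$out} "$new\n" if defined $new;
    }
    close $in;
    close $out or die "Can't close '$out_file': $!\n";
    return $out_file;
}

# Usage (paths are placeholders):
# write_transformed($_, '/path/to/output') for glob '/home/test/Bureau/Perl/Test/*';
```

Here glob is called in list context by the for loop, so it returns every matching name at once. Printing straight to a handle opened in the target directory avoids the need for cp altogether; cp from File::Copy would only be needed to copy an already existing file.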

Related

Recovering a specific line in multiple .txt in a directory using Perl

I have the output of a program that ran some searches: more than 2000 .txt files. I need just one specific line from each file. This is what I have been trying with Perl:
opendir(DIR, $dirname) or die "Could not open $dirname\n";
while ($filename = readdir(DIR)) {
print "$filename\n";
open ($filename, '<', $filename)or die("Could not open file.");
my $line;
while( <$filename> ) {
if( $. == $27 ) {
print "$line\n";
last;
}
}
}
closedir(DIR);
But there is a problem with the $filename in line 5 and I don't know an alternative to it so I don't have to manually name each file.
Several issues with that code:
Using an old-school bareword identifier for the directory handle instead of an autovivified variable like you are for the file handle.
Using the same variable for the filename and file handle is pretty strange.
You don't check to see if the file is a directory or something else other than a plain file before trying to open it.
$27?
You never assign anything to that $line variable before printing it.
Unless $dirname is your program's current working directory, you're running into an issue mentioned in the readdir documentation:
If you're planning to filetest the return values out of a readdir, you'd better prepend the directory in question. Otherwise, because we didn't chdir there, it would have been testing the wrong file.
(Substitute open for filetest)
Always use strict; and use warnings;.
Personally, if you just want to print the 27th line of a large number of files, I'd turn to awk and find (Using its -exec test to avoid potential errors about the command line maximum length being hit):
find directory/ -maxdepth 1 -type f -exec awk 'FNR == 27 { print FILENAME; print }' \{\} \+
If you're on a Windows system without standard unix tools like those installed, or it's part of a bigger program, a fixed up perl way:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use feature qw/say/;
use File::Spec;
my $directory = shift;
opendir(my $dh, $directory);
while (my $filename = readdir $dh) {
my $fullname = File::Spec->catfile($directory, $filename); # Construct a full path to the file
next unless -f $fullname; # Only look at regular files
open my $fh, "<", $fullname;
while (my $line = <$fh>) {
if ($. == 27) {
say $fullname;
print $line;
last;
}
}
close $fh;
}
closedir $dh;
You might also consider using glob to get the filenames instead of opendir/readdir/closedir.
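A glob-based variant might look like the sketch below. The helper names nth_line and nth_lines_in_dir are hypothetical, and the code assumes the files end in .txt and that the directory name contains no spaces or glob metacharacters.

```perl
use strict;
use warnings;

# Return line number $n (1-based) of $file, or undef if the file is shorter.
sub nth_line {
    my ($file, $n) = @_;
    open my $fh, '<', $file or die "Can't open '$file': $!\n";
    while (my $line = <$fh>) {
        if ($. == $n) {
            close $fh;
            return $line;
        }
    }
    close $fh;
    return;
}

# glob returns names that already include the directory part, so there is
# no need to prepend it as with readdir.
sub nth_lines_in_dir {
    my ($dir, $n) = @_;
    my %found;
    for my $file (grep { -f } glob "$dir/*.txt") {
        my $line = nth_line($file, $n);
        $found{$file} = $line if defined $line;
    }
    return \%found;
}

# Usage (directory taken from the command line):
# my $found = nth_lines_in_dir(shift, 27);
# print "$_\n$found->{$_}" for sort keys %$found;
```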
And if you have Path::Tiny available, a simpler version is:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use feature qw/say/;
use Path::Tiny;
my $directory = shift;
my $dir = path $directory;
for my $file ($dir->children) {
next unless -f $file;
my @lines = $file->lines({count => 27});
if (@lines == 27) {
say $file;
print $lines[-1];
}
}

How to open a file that has a special character in it such as $?

Seems fairly simple but with the "$" in the name causes the name to split. I tried escaping the character out but when I try to open the file I get GLOB().
my $path = 'C:\dir\name$.txt';
open my $file, '<', $path || die
print "file = $file\n";
It should open the file so I can traverse the entries.
It has nothing to do with the "$". Just follow standard file handling procedure.
use strict;
use warnings;
my $path = 'C:\dir\name$.txt';
open my $file_handle, '<', $path or die "Can't open $path: $!";
# read and print the file line by line
while (my $line = <$file_handle>) {
# the <> in scalar context gets one line from the file
print $line;
}
# reset the handle
seek $file_handle, 0, 0;
# read the whole file at once, print it
{
# enclose in a block to localize the $/
# $/ is the line separator, so when it's set to undef,
# it reads the whole file
local $/ = undef;
my $file_content = <$file_handle>;
print $file_content;
}
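The two points of confusion in the question can be shown in a short sketch: how the $ behaves in single and double quotes, and why printing the handle shows GLOB(...). The in-memory file below is a stand-in for the real file, used only so the example runs anywhere.

```perl
use strict;
use warnings;

# Single quotes: nothing is interpolated, so the $ is a literal character.
my $single = 'C:\dir\name$.txt';
# Double quotes: the $ and the backslashes must be escaped; an unescaped
# "$." would interpolate Perl's input line number variable.
my $double = "C:\\dir\\name\$.txt";
print "identical paths\n" if $single eq $double;

# A file handle is a reference, not the file's content.
# Assumption: an in-memory file replaces the real path here.
my $data = "first line\nsecond line\n";
open my $fh, '<', \$data or die "Can't open in-memory file: $!";
print "handle: $fh\n";       # prints something like GLOB(0x...)
my @entries = <$fh>;         # reading from the handle yields the lines
close $fh;
print "read ", scalar(@entries), " lines\n";
```

So seeing GLOB(...) means the open succeeded; the handle then has to be read with <$fh> to get at the entries.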
Consider using the CPAN modules File::Slurper or Path::Tiny which will handle the exact details of using open and readline, checking for errors, and encoding if appropriate (most text files are encoded to UTF-8).
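use strict;
use warnings;
my $path = 'C:\dir\name$.txt';
# Alternative 1: File::Slurper
use File::Slurper 'read_text';
my $file_content = read_text $path;
# Alternative 2: Path::Tiny (use this instead of the above, not together with it)
use Path::Tiny 'path';
my $file_content = path($path)->slurp_utf8;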
If it's a data file, use read_binary or slurp_raw.

Tie file not working for loops

I have a script which finds all the .pm files in my directory, looks for a certain pattern and changes it to the desired value. I tried Tie::File, but it is not looking into the content of the files.
use File::Find;
use Data::Dumper qw(Dumper);
use Tie::File;
my @content;
find( \&wanted, '/home/idiotonperl/project/');
sub wanted {
push @content, $File::Find::name;
return;
}
my @content1 = grep{$_ =~ /.*.pm/} @content;
@content = @content1;
for my $absolute_path (@content) {
my @array='';
print $absolute_path;
tie @array, 'Tie::File', $absolute_path or die qq{Not working};
print Dumper @array;
foreach my $line(@array) {
$line=~s/PERL/perl/g;
}
untie @array;
}
the output is
Not working at tiereplacer.pl line 22.
/home/idiotonperl/main/content.pm
This is not working as intended (it should look into the content of every .pm file). If I try the same operation on a single test file under my home directory, the content is replaced:
@content = 'home/idiotonperl/option.pm'
In that case it works as intended.
I would not recommend using tie for that. The simple code below should do what was asked.
use warnings;
use strict;
use File::Copy qw(move);
use File::Glob ':bsd_glob';
my $dir = '/home/...';
my @pm_files = grep { -f } glob "$dir/*.pm";
foreach my $file (@pm_files)
{
my $outfile = "$file.new"; # keeps the directory part valid; but better use File::Temp
open my $fh, '<', $file or die "Can't open $file: $!";
open my $fh_out, '>', $outfile or die "Can't open $outfile: $!";
while (my $line = <$fh>)
{
$line =~ s/PERL/perl/g;
print $fh_out $line; # write out the line, changed or not
}
close $fh;
close $fh_out;
# Uncomment after testing, to actually overwrite the original file
#move $outfile, $file or die "Can't move $outfile to $file: $!";
}
The glob from File::Glob allows you to specify filenames similarly as in the shell. See docs for accepted metacharacters. The :bsd_glob is better for treatment of spaces in filenames. †
If you need to process files recursively then you indeed want a module. See File::Find::Rule.
The rest of the code does what we must do when changing file content: copy the file. The loop reads each line, changes the ones that match, and writes each line to another file. If the match fails then s/// makes no changes to $line, so we just copy those that are unchanged.
In the end we move that file to overwrite the original using File::Copy.
The new file is temporary and I suggest creating it using File::Temp.
† The glob pattern "$dir/..." allows for an injection bug for directories with particular names. While this is very unusual, it is safer to use the escape sequence
my @pm_files = grep { -f } glob "\Q$dir\E/*.pm";
In this case File::Glob isn't needed since \Q escapes spaces as well.
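The File::Temp suggestion could be sketched as follows. The helper name replace_in_file is hypothetical; the sketch assumes it is acceptable for the rewritten file to take on the temporary file's permissions (File::Temp creates files readable only by the owner).

```perl
use strict;
use warnings;
use File::Basename 'dirname';
use File::Copy 'move';
use File::Temp ();

# Rewrite $file with PERL replaced by perl, going through a temporary file.
sub replace_in_file {
    my ($file) = @_;
    open my $in, '<', $file or die "Can't open $file: $!";
    # Create the temporary file next to the original so that the final
    # move stays on the same filesystem.
    my $tmp = File::Temp->new(DIR => dirname($file), UNLINK => 0);
    while (my $line = <$in>) {
        $line =~ s/PERL/perl/g;
        print {$tmp} $line;    # write out the line, changed or not
    }
    close $in;
    close $tmp or die "Can't close " . $tmp->filename . ": $!";
    move($tmp->filename, $file) or die "Can't move to $file: $!";
    return;
}

# Usage (path is a placeholder):
# replace_in_file($_) for grep { -f } glob '/some/dir/*.pm';
```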
Solution using my favorite module: Path::Tiny. Unfortunately, it isn't a core module.
use strict;
use warnings;
use Path::Tiny;
my $iter = path('/some/path')->iterator({recurse => 1});
while( my $p = $iter->() ) {
next unless $p->is_file && $p =~ /\.pm\z/i;
$p->edit_lines(sub {
s/PERL/perl/g;
#add more line-editing
});
#also check the path(...)->edit(...) as an alternative
}
Working fine for me:
#!/usr/bin/env perl
use common::sense;
use File::Find;
use Tie::File;
my @content;
find(\&wanted, '/home/mishkin/test/t/');
sub wanted {
push @content, $File::Find::name;
return;
}
@content = grep{$_ =~ /.*\.pm$/} @content;
for my $absolute_path (@content) {
my @array='';
say $absolute_path;
tie @array, 'Tie::File', $absolute_path or die "Not working: $!";
for my $line (@array) {
$line =~ s/PERL/perl/g;
}
untie @array;
}
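One plausible reason for the difference between the two Tie::File versions is the filter: the pattern /.*.pm/ in the question is unanchored and its dot is unescaped, so it can also select directories or other names that merely contain "pm", and tie fails on those. The sketch below, with the hypothetical helper name edit_pm_files, keeps only regular files whose names end in .pm before tying them.

```perl
use strict;
use warnings;
use File::Find;
use Tie::File;

# Replace PERL with perl in every regular *.pm file below $root.
# Returns the list of files that were edited.
sub edit_pm_files {
    my ($root) = @_;
    my @pm_files;
    # Inside the callback, $_ is the entry's base name and the current
    # directory is the one containing it, so -f $_ tests the right file.
    find(sub {
        push @pm_files, $File::Find::name if -f $_ && /\.pm\z/;
    }, $root);
    for my $path (@pm_files) {
        tie my @lines, 'Tie::File', $path or die "Can't tie $path: $!";
        s/PERL/perl/g for @lines;    # each assignment is written to the file
        untie @lines;
    }
    return @pm_files;
}

# Usage (path is a placeholder; use an absolute path):
# edit_pm_files('/home/idiotonperl/project');
```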

In Perl, how can I filter all log files in a directory, and extract interesting lines?

I'm trying to select only the .log files in my directory and then search in those files for the word "unbound" and print the entire line into a new output file with the same name as the log file (number###.log) but with a .txt extension. This is what I have so far:
#!/usr/bin/perl
use strict;
use warnings;
my $path = $ARGV[0];
my $outpath = $ARGV[1];
my @files;
my $files;
opendir(DIR,$path) or die "$!";
@files = grep { /\.log$/} readdir(DIR);
my @out;
my $out;
opendir(OUT,$outpath) or die "$!";
my $line;
foreach $files (@files) {
open (FILE, "$files");
my @line = <FILE>;
my $regex = Unbound;
open (OUT, ">>$out");
print grep {$line =~ /$regex/ } <>;
}
close OUT;
close FILE;
closedir(DIR);
closedir (OUT);
I'm a beginner, and I don't really know how to create a new text file with the acquired output.
A few things I'd suggest to improve this code:
declare your loop iterators within the loop. foreach my $file ( @files ) {
use 3 arg open: open ( my $input_fh, "<", $filename );
use glob rather than opendir then grep. foreach my $file ( <$path/*.log> ) {
grep is good for extracting things into arrays. Your grep reads the whole file to print it, which isn't necessary. Doesn't matter much if the file is short though.
perltidy is great for reformatting code.
you're calling opendir on 'OUT' with a directory path (I think?) which isn't going to work for output.
$outpath is a directory, not a file. You need to build a separate output filename for each input file, because opendir only reads directory entries and isn't valid for writing output.
because you're using opendir that's actually giving you filenames - not full paths. So you might be in the wrong place to actually open the files. Prepending the path name, doing a chdir are possible solutions. But that's one of the reasons I like glob because it returns a path as well.
So with that in mind - how about:
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
#Extract paths
my $input_path = $ARGV[0];
my $output_path = $ARGV[1];
#Error if paths are invalid.
unless (defined $input_path
and -d $input_path
and defined $output_path
and -d $output_path )
{
die "Usage: $0 <input_path> <output_path>\n";
}
foreach my $filename (<$input_path/*.log>) {
# extract the 'name' bit of the filename.
# be slightly careful with this - it's based
# on an assumption which isn't always true.
# File::Spec is a more powerful way of accomplishing this.
# but should grab 'number####' from /path/to/file/number####.log
my $output_file = basename ( $filename, '.log' );
#open input and output filehandles.
open( my $input_fh, "<", $filename ) or die $!;
open( my $output_fh, ">", "$output_path/$output_file.txt" ) or die $!;
print "Processing $filename -> $output_path/$output_file.txt\n";
#iterate input, extracting into $line
while ( my $line = <$input_fh> ) {
#check if $line matches your RE.
if ( $line =~ m/Unbound/ ) {
#write it to output.
print {$output_fh} $line;
}
}
#tidy up our filehandles. Although technically, they'll
#close automatically because they leave scope
close($output_fh);
close($input_fh);
}
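The comment about basename above can be illustrated with a short sketch. It uses fileparse from the same core module, which returns the name, directory and suffix in one call; the example path and the output directory /tmp/out are placeholders.

```perl
use strict;
use warnings;
use File::Basename qw(basename fileparse);
use File::Spec;

my $log = '/path/to/file/number1234.log';   # placeholder path

# basename with a suffix argument strips that suffix if it is present.
my $stem = basename($log, '.log');           # number1234

# fileparse splits the path into name, directory and matched suffix.
my ($name, $dir, $suffix) = fileparse($log, qr/\.[^.]*/);

# Build the output path portably instead of concatenating with "/".
my $target = File::Spec->catfile('/tmp/out', "$name.txt");
print "$stem $name $dir $suffix $target\n";
```

The assumption mentioned in the code comments is that every input name really ends in '.log'; with the pattern passed to fileparse, whatever extension is present gets stripped instead.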
Here is a script that takes advantage of Path::Tiny. Now, at this stage of your learning process, you are probably better off understanding @Sobrique's solution, but using modules such as Path::Tiny or Path::Class will make it easier to write these one-off scripts more quickly and correctly.
Also, I didn't really test this script, so watch out for bugs.
#!/usr/bin/env perl
use strict;
use warnings;
use Path::Tiny;
run(\@ARGV);
sub run {
my $argv = shift;
unless (@$argv == 2) {
die "Need source and destination paths\n";
}
my $it = path($argv->[0])->realpath->iterator({
recurse => 0,
follow_symlinks => 0,
});
my $outdir = path($argv->[1])->realpath;
while (my $path = $it->()) {
next unless -f $path;
next unless $path =~ /[.]log\z/;
my $logfh = $path->openr;
my $outfile = $outdir->child($path->basename('.log') . '.txt');
my $outfh;
while (my $line = <$logfh>) {
next unless $line =~ /Unbound/;
unless ($outfh) {
$outfh = $outfile->openw;
}
print $outfh $line;
}
# $outfh stays undefined when no line matched, so only close it if it was opened
if ($outfh) {
close $outfh
or die "Cannot close output '$outfile': $!";
}
}
}
Notes
realpath will croak if the path provided does not exist.
Similarly for openr and openw.
I am reading input files line-by-line to keep the memory footprint of the program independent of the sizes of input files.
I do not open the output file until I know I have a match to print to.
When matching a file extension using a regular expression pattern, keep in mind that \n is a valid character in Unix file names, and the $ anchor also matches just before a trailing newline. That is why the pattern above uses \z.
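The difference between the two anchors can be checked with a small sketch; the file name below is a made-up example that ends in a newline.

```perl
use strict;
use warnings;

my $odd_name = "report.log\n";   # hypothetical file name ending in a newline

# $ matches at the very end or just before a final newline;
# \z matches only at the very end of the string.
my $dollar = $odd_name =~ /[.]log$/  ? 1 : 0;
my $z      = $odd_name =~ /[.]log\z/ ? 1 : 0;

print "\$ anchor matched: $dollar\n";    # prints 1
print "\\z anchor matched: $z\n";        # prints 0
```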

Read directory and save contents of files to new file in Perl

I want to save the contents of some files into a new file, and I do the following:
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
my ($dir) = @ARGV;
my @files = glob "details/*";
my $filename = 'target.txt';
for my $file (@files) {
my $tree = HTML::TreeBuilder::XPath->new_from_file($file);
my @opacity = $tree->findnodes_as_strings('//div[@class="opacity description"]');
open my $fh, '>>', $filename;
print $fh for @opacity;
}
Unfortunately it does not work, and I don't understand why.
Check the return value of open:
open my $fh, ">>", $filename or die "Can't open $filename: $!";
This can provide invaluable insights when “something isn't working”.
Your syntax for print is ambiguous. Use print or say like
print FILEHANDLE LIST
print {EXPR} LIST # EXPR has to produce a filehandle object
print LIST # prints to the `select`ed filehandle, usually STDOUT
print # prints $_ by default
So you want to explicitly specify what you are printing, and probably also add a newline after each element in @opacity. So either
print {$fh} "$_\n" for @opacity;
or use feature 'say' (perl 5.10 and better):
say {$fh} $_ for @opacity;