Perl search file recursive for string and replace - perl

In a terminal, how can I use Perl to search all PHP files, recursively from the current working directory, for a single-line or multi-line pattern like:
<script>var a=''; * hamoorabi.com * </script>
Read it as: find everything between <script>var a=''; and </script>, but only if it contains hamoorabi.com, and replace it with an empty string (remove it).
As it's JavaScript code, there can be a bunch of unescaped characters inside the search string.

From a Unix or Cygwin prompt:
$ find . -name '*.php' -print0 | xargs -0 ./xx1.pl
where the Perl script xx1.pl is:
#!/usr/bin/perl
use strict;
use warnings;
undef $/;    # slurp mode: read each file in one go
for my $file (@ARGV) {
    open(my $in, '<', $file) or die "Can't read $file: $!";
    my $content = <$in>;
    close($in);
    my $beginning = '<script>var a=\'\'';
    my $end = '</script>';
    my $containing = 'hamoorabi.com';
    #$content =~ s/(<script>.*?)(hamoorabi.com)(.*?<\/script>)/$1$3/sg;
    #much better regex provided by ysth
    $content =~ s/\Q$beginning\E(?:(?!\Q$end\E).)*\Q$containing\E.*?\Q$end\E//gs;
    open(my $out, '>', $file) or die "Can't write $file: $!";
    print {$out} $content;
    close($out) or die "Can't close $file: $!";
}
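To see what that substitution removes, here is a minimal sketch run against a made-up string instead of real files. The sample markup and the variable $cleaned are assumptions for illustration only; the regex is the one from the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: two script blocks, only the second mentions the domain.
my $content = join "\n",
    '<p>before</p>',
    q{<script>var a=''; keep();</script>},
    q{<script>var a='';},
    q{  load("http://hamoorabi.com/x.js");},
    q{</script>},
    '<p>after</p>',
    '';

my $beginning  = q{<script>var a=''};
my $end        = '</script>';
my $containing = 'hamoorabi.com';

# (?:(?!\Q$end\E).)* matches a run of characters that never contains $end,
# so a match cannot start in one <script> block and finish in a later one.
(my $cleaned = $content)
    =~ s/\Q$beginning\E(?:(?!\Q$end\E).)*\Q$containing\E.*?\Q$end\E//gs;

print $cleaned;
```

The first block survives because it closes before the domain appears; only the second block is removed.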

Use File::Find to walk a directory tree looking for files.
use strict;
use warnings;
use File::Find;
sub wanted {
    # Ignore anything that isn't a file
    return unless -f;
    # Ignore anything without a .php extension
    return unless /\.php$/;
    # Your filename is in $_. Your current directory is the
    # one which contains the current file.
    # Do what you need to do.
    open my $fh, '<', $_ or die $!;
    ...;
}
# Start the walk from the current working directory
find(\&wanted, '.');
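Putting the two ideas together, the following is a rough sketch of a File::Find callback that slurps each PHP file and rewrites it only when something was removed. The temporary directory, the file page.php, and the sub name clean_php are hypothetical and exist only so the example can run on its own; the pattern is hard-coded here rather than built from variables.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical demo tree: a temporary directory holding one affected PHP file.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "Can't create $root/sub: $!";
my $php = "$root/sub/page.php";
open my $setup, '>', $php or die "Can't create $php: $!";
print {$setup} "<?php ?>\n<script>var a='';\nx('hamoorabi.com');\n</script>\nok\n";
close $setup or die "Can't close $php: $!";

sub clean_php {
    return unless -f;              # $_ is the bare file name here
    return unless /\.php\z/;
    my $path = $File::Find::name;  # full path, used for messages only

    open my $in, '<', $_ or die "Can't read $path: $!";
    my $content = do { local $/; <$in> };   # slurp the whole file
    close $in;

    # Leave the file alone unless the substitution removed something.
    return unless $content =~
        s{<script>var a='';(?:(?!</script>).)*hamoorabi\.com.*?</script>}{}gs;

    open my $out, '>', $_ or die "Can't write $path: $!";
    print {$out} $content;
    close $out or die "Can't close $path: $!";
}

find(\&clean_php, $root);
```

Try it on a copy of the tree first, since the files are overwritten in place.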

I hope this also helps.
use strict;
use warnings;
String replacement using a regex:
my $str = "<script>var a=''\; * hamoorabi.com * </script>";
$str=~s#<script[^>]*>((?:(?!<\/script>).)*)<\/script># my $script=$&;
$script=~s/^(.*)hamoorabi\.com(.*)$/$1$2/g;
($script);
#esg;
print $str;
Replacing the content in the same file using Tie::File:
my @array;
use Tie::File;
tie @array, 'Tie::File', "Input.php" or die "Can't tie Input.php: $!";
my $len = join "\n", @array;
@array = split /\n/, $len;
untie @array;

Related

how to assign data into hash from an input file

I am new to perl.
My input file contains:
james1
84012345
aaron5
2332111 42332
2345112 18238
wayne[2]
3505554
Question: I am not sure what the correct way is to read the input and set the name as the key and the numbers as the values. For example, "james" is the key and "84012345" is the value.
This is my code:
#!/usr/bin/perl -w
use strict;
use warnings;
use Data::Dumper;
my $input= $ARGV[0];
my %hash;
open my $data , '<', $input or die " cannot open file : $_\n";
my @names = split ' ', $data;
my @values = split ' ', $data;
@hash{@names} = @values;
print Dumper \%hash;
I'mma go over your code real quick:
#!/usr/bin/perl -w
-w is not recommended. You should use warnings; instead (which you're already doing, so just remove -w).
use strict;
use warnings;
Very good.
use Data::Dumper;
my $input= $ARGV[0];
OK.
my %hash;
Don't declare variables before you need them. Declare them in the smallest scope possible, usually right before their first use.
open my $data , '<', $input or die " cannot open file : $_\n";
You have a spurious space at the beginning of your error message and $_ is unset at this point. You should include $input (the name of the file that failed to open) and $! (the error reason) instead.
my @names = split ' ', $data;
my @values = split ' ', $data;
Well, this doesn't make sense. $data is a filehandle, not a string. Even if it were a string, this code would assign the same list to both @names and @values.
@hash{@names} = @values;
print Dumper \%hash;
My version (untested):
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
@ARGV == 1
    or die "Usage: $0 FILE\n";
my $file = $ARGV[0];
my %hash;
{
    open my $fh, '<', $file or die "$0: can't open $file: $!\n";
    local $/ = '';
    while (my $paragraph = readline $fh) {
        my @words = split ' ', $paragraph;
        my $key = shift @words;
        $hash{$key} = \@words;
    }
}
print Dumper \%hash;
The idea is to set $/ (the input record separator) to "" for the duration of the input loop, which makes readline return whole paragraphs, not lines.
The first (whitespace separated) word of each paragraph is taken to be the key; the remaining words are the values.
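As a quick illustration of paragraph mode, here is a sketch that reads the question's sample data from an in-memory filehandle. It assumes the records in the real file are separated by blank lines (the listing above does not show them), and the name %numbers_for is made up for this example.

```perl
use strict;
use warnings;

# Sample data from the question, assuming a blank line between records.
my $input = "james1\n84012345\n\naaron5\n2332111 42332\n2345112 18238\n\nwayne[2]\n3505554\n";
open my $fh, '<', \$input or die "Can't open in-memory handle: $!";

my %numbers_for;
{
    local $/ = '';    # paragraph mode: one record per blank-line-separated block
    while (my $paragraph = readline $fh) {
        # First word is the key, everything after it is a value.
        my ($name, @numbers) = split ' ', $paragraph;
        $numbers_for{$name} = \@numbers;
    }
}
close $fh;

for my $name (sort keys %numbers_for) {
    print "$name: @{ $numbers_for{$name} }\n";
}
```

If the real file has no blank lines between records, paragraph mode returns the whole file as a single record and a different way of detecting names is needed.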
You have opened a file with open() and attached the file handle to $data. The regular way of reading data from a file is to loop over each line, like so:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my $input = $ARGV[0];
my %hash;
open my $data, '<', $input or die "Cannot open file $input: $!\n";
while (my $line = <$data>) {
    chomp $line;              # Remove the trailing newline (\n)
    if (length $line) {       # Skip empty lines
        my ($key, $value) = split ' ', $line;
        $hash{$key} = $value;
    }
}
print Dumper \%hash;
OK, +1 for using strict and warnings.
First, take a look at the $/ variable, which controls how a file is broken into records when it is read in.
$data is a file handle; you need to extract the data from the file. If the file is not too big, you can load it all into an array; if it is large, you can loop over it one record at a time. See the <> operator in perlop.
Looking at your code, it appears that you want to end up with the following data structure from your input file:
%hash = (
    james1 => [
        84012345,
    ],
    aaron5 => [
        2332111,
        42332,
        2345112,
        18238,
    ],
    'wayne[2]' => [
        3505554,
    ],
);
See perldsc on how to do that.
All the documentation can be read using the perldoc command which comes with Perl. Running perldoc on its own will give you some tips on how to use it and running perldoc perldoc will give you possibly far more info than you need at the moment.
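For orientation, here is a small sketch of how a hash of arrays like the one above is built and accessed. The extra value pushed onto james1 is invented for the example.

```perl
use strict;
use warnings;

my %hash = (
    james1     => [84012345],
    aaron5     => [2332111, 42332, 2345112, 18238],
    'wayne[2]' => [3505554],
);

# Each value is an array reference; dereference it with @{ ... }.
push @{ $hash{james1} }, 99999;      # hypothetical extra number
my $first = $hash{aaron5}[0];        # first number stored for aaron5
my $count = @{ $hash{aaron5} };      # array in scalar context gives its length

print "aaron5 has $count numbers, the first is $first\n";
```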

Tie file not working for loops

I have a script which collects all the .pm files in my directory, looks for a certain pattern, and changes it to the desired value. I tried Tie::File, but it is not looking at the content of the files.
use File::Find;
use Data::Dumper qw(Dumper);
use Tie::File;
my @content;
find( \&wanted, '/home/idiotonperl/project/');
sub wanted {
    push @content, $File::Find::name;
    return;
}
my @content1 = grep{$_ =~ /.*.pm/} @content;
@content = @content1;
for my $absolute_path (@content) {
    my @array='';
    print $absolute_path;
    tie @array, 'Tie::File', $absolute_path or die qq{Not working};
    print Dumper @array;
    foreach my $line(@array) {
        $line=~s/PERL/perl/g;
    }
    untie @array;
}
the output is
Not working at tiereplacer.pl line 22.
/home/idiotonperl/main/content.pm
This is not working as intended (looking into the content of every .pm file). If I try the same operation on a single test file under my home directory, the content is replaced:
@content = 'home/idiotonperl/option.pm'
and it works as intended.
I would not recommend using tie for that. The simple code below should do what you ask.
use warnings;
use strict;
use File::Copy qw(move);
use File::Glob ':bsd_glob';
my $dir = '/home/...';
my @pm_files = grep { -f } glob "$dir/*.pm";
foreach my $file (@pm_files)
{
    my $outfile = "$file.new"; # but better use File::Temp
    open my $fh, '<', $file or die "Can't open $file: $!";
    open my $fh_out, '>', $outfile or die "Can't open $outfile: $!";
    while (my $line = <$fh>)
    {
        $line =~ s/PERL/perl/g;
        print $fh_out $line; # write out the line, changed or not
    }
    close $fh;
    close $fh_out;
    # Uncomment after testing, to actually overwrite the original file
    #move $outfile, $file or die "Can't move $outfile to $file: $!";
}
The glob from File::Glob allows you to specify filenames similarly as in the shell. See docs for accepted metacharacters. The :bsd_glob is better for treatment of spaces in filenames. †
If you need to process files recursively then you do indeed want a module. See File::Find::Rule.
The rest of the code does what we must do when changing file content: copy the file. The loop reads each line, changes the ones that match, and writes each line to another file. If the match fails then s/// makes no changes to $line, so unchanged lines are simply copied.
In the end we move that file to overwrite the original using File::Copy.
The new file is temporary and I suggest creating it using File::Temp.
† The glob pattern "$dir/..." allows for an injection bug for directories with particular names. While this is very unusual it is safer to use the escape sequence
my @pm_files = grep { -f } glob "\Q$dir\E/*.pm";
In this case File::Glob isn't needed since \Q escapes spaces as well.
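The File::Temp suggestion could look roughly like the sketch below: write the changed lines to a temporary file in the same directory, then rename it over the original. The sub name replace_in_file and the demo file Demo.pm are hypothetical. Note also that tempfile creates the file with restrictive permissions, so the original file mode is not preserved here.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile tempdir);

# Rewrite $file via a temporary file in the same directory, so that the
# final rename stays on one filesystem.
sub replace_in_file {
    my ($file) = @_;
    open my $in, '<', $file or die "Can't open $file: $!";
    my ($out, $tmpname) = tempfile(DIR => dirname($file));
    while (my $line = <$in>) {
        $line =~ s/PERL/perl/g;
        print {$out} $line;          # write every line, changed or not
    }
    close $in;
    close $out or die "Can't close $tmpname: $!";
    rename $tmpname, $file or die "Can't rename $tmpname to $file: $!";
    return;
}

# Hypothetical demo file in a throwaway directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/Demo.pm";
open my $fh, '>', $file or die "Can't create $file: $!";
print {$fh} "use PERL;\nno change\nPERL PERL\n";
close $fh or die "Can't close $file: $!";

replace_in_file($file);
```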
Solution using my favorite module: Path::Tiny. Unfortunately, it isn't a core module.
use strict;
use warnings;
use Path::Tiny;
my $iter = path('/some/path')->iterator({recurse => 1});
while( my $p = $iter->() ) {
next unless $p->is_file && $p =~ /\.pm\z/i;
$p->edit_lines(sub {
s/PERL/perl/;
#add more line-editing
});
#also check the path(...)->edit(...) as an alternative
}
Working fine for me:
#!/usr/bin/env perl
use common::sense;
use File::Find;
use Tie::File;
my @content;
find(\&wanted, '/home/mishkin/test/t/');
sub wanted {
    push @content, $File::Find::name;
    return;
}
@content = grep{$_ =~ /.*\.pm$/} @content;
for my $absolute_path (@content) {
    my @array='';
    say $absolute_path;
    tie @array, 'Tie::File', $absolute_path or die "Not working: $!";
    for my $line (@array) {
        $line =~ s/PERL/perl/g;
    }
    untie @array;
}

I have a file that I want to split using pipe as delimiter. How can I read the file using Perl?

Here is a shell script reading the file.
#!/bin/sh
procDate=$1
echo "Date $procDate"
file=`cat filename_$procDate.txt`
echo "$file"
I want to convert it to Perl and use the split operator with pipe | as delimiter.
It's far from clear from your question what you want to do with these fields once you have split them.
Your own shell script uses cat to copy the entire contents of your file into $file, but that's unlikely to be what you need to do.
A very generalised Perl program would look like this:
use strict;
use warnings 'all';
my ($procDate) = @ARGV;
print "Date $procDate\n";
open my $fh, '<', "filename_$procDate.txt" or die $!;
while ( <$fh> ) {
    chomp;
    my @fields = split /\|/;
    # do something with @fields, for instance
    print "@fields\n";
}
That code splits each line on pipe | characters, puts the list of substrings in @fields, and then prints it separated by spaces. But I can't guess what more you might want to do.
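One detail worth knowing when splitting delimited records: the pipe must be escaped because | means alternation in a regex, and split drops trailing empty fields unless it is given a negative limit. A small sketch with an invented record:

```perl
use strict;
use warnings;

my $line = "2024-01-15|widget||42|\n";   # hypothetical record with empty fields
chomp $line;

my @default = split /\|/, $line;         # trailing empty field is dropped
my @all     = split /\|/, $line, -1;     # negative limit keeps it

print scalar(@default), " fields by default\n";
print scalar(@all), " fields with limit -1\n";
```

The empty field in the middle is kept in both cases; only empty fields at the end are affected by the limit.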
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my ($procDate) = @ARGV;
open(FILE, "<filename_$procDate.txt") or die "Couldn't open file filename_$procDate.txt, $!";
while ( my $line = <FILE> ) {
    chomp $line;
    print "Line content is $line\n";
    my @line_content = split(/\|/, $line);
    print Dumper (\@line_content);
}
close (FILE);

Perl search for first occurrence of pattern in directory

I have a directory with a list of image header files of the format
image1.hd
image2.hd
image3.hd
image4.hd
I want to search for the regular expression Image type:=4 in the directory and find the file number which has the first occurrence of this pattern. I can do this with a couple of pipes easily in bash:
grep -l 'Image type:=4' image*.hd | sed ' s/.*image\(.*\).hd/\1/' | head -n1
which returns 1 in this case.
This pattern match will be used in a perl script. I know I could use
my $number = `grep -l 'Image type:=4' image*.hd | sed ' s/.*image\(.*\).hd/\1/' | head -n1`
but is it preferable to use pure perl in such cases? Here is the best I could come up with using perl. It is very cumbersome:
my $tmp;
#want to find the planar study in current study
foreach (glob "$DIR/image*.hd"){
$tmp = $_;
open FILE, "<", "$_" or die $!;
while (<FILE>)
{
if (/Image type:=4/){
$tmp =~ s/.*image(\d+).hd/$1/;
}
}
close FILE;
last;
}
print "$tmp\n";
This also returns the desired output of 1. Is there a more efficient way of doing this?
This is simple with the help of a couple of utility modules:
use strict;
use warnings;
use File::Slurp 'read_file';
use List::MoreUtils 'firstval';
print firstval { read_file($_) =~ /Image type:=4/ } glob "$DIR/image*.hd";
But if you are restricted to core Perl, then this will do what you want:
use strict;
use warnings;
my $firstfile;
while (my $file = glob "$DIR/image*.hd") {
open my $fh, '<', $file or die $!;
local $/;
if ( <$fh> =~ /Image type:=4/) {
$firstfile = $file;
last;
}
}
print $firstfile // 'undef';
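Since the question asks for the file number rather than the file name, here is a core-Perl sketch that stops at the first match and extracts the number. The temporary directory, the three generated header files, and the sub name first_matching_number are assumptions made so the example is self-contained.

```perl
use strict;
use warnings;
use File::Glob ':bsd_glob';
use File::Temp qw(tempdir);

# Hypothetical demo data: image1.hd is type 2, image2.hd and image3.hd are type 4.
my $DIR = tempdir(CLEANUP => 1);
my %type_for = (1 => 2, 2 => 4, 3 => 4);
for my $n (sort keys %type_for) {
    open my $out, '>', "$DIR/image$n.hd" or die "Can't create image$n.hd: $!";
    print {$out} "Header\nImage type:=$type_for{$n}\n";
    close $out or die "Can't close image$n.hd: $!";
}

# Return the number of the first file (in glob order) that contains the pattern.
sub first_matching_number {
    my ($dir) = @_;
    for my $file (glob "$dir/image*.hd") {
        open my $fh, '<', $file or die "Can't open $file: $!";
        while (my $line = <$fh>) {
            next unless $line =~ /Image type:=4/;
            return $1 if $file =~ /image(\d+)\.hd\z/;
        }
    }
    return;    # no file matched
}

my $number = first_matching_number($DIR);
print defined $number ? "$number\n" : "no match\n";
```

Files are visited in glob's sorted order, which is alphabetical, so image10.hd would come before image2.hd; sort numerically first if that matters.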

How to pass and read a File handle to a Perl subroutine?

I want a Perl module that reads from the special file handle, <STDIN>, and passes this to a subroutine. You will understand what I mean when you see my code.
Here is how it was before:
#!/usr/bin/perl
use strict; use warnings;
use lib '/usr/local/custom_pm';
package Read_FH;
sub read_file {
my ($filein) = @_;
open FILEIN, $filein or die "could not open $filein for read\n";
# reads each line of the file text one by one
while(<FILEIN>){
# do something
}
close FILEIN;
Right now the subroutine takes a file name (stored in $filein) as an argument, opens the file with a file handle, and reads each line of the file one by one using the file handle.
Instead, I want to get the file name from <STDIN>, store it in a variable, then pass this variable into a subroutine as an argument.
From the main program:
$file = <STDIN>;
$variable = read_file($file);
The subroutine for the module is below:
#!/usr/bin/perl
use strict; use warnings;
use lib '/usr/local/custom_pm';
package Read_FH;
# subroutine that parses the file
sub read_file {
my ($file)= @_;
# !!! Should I open $file here with a file handle? !!!!
# read each line of the file
while($file){
# do something
}
Does anyone know how I can do this? I appreciate any suggestions.
It is a good idea in general to use lexical filehandles, that is, a lexical variable holding the file handle instead of a bareword.
You can pass it around like any other variable. If you use read_file from File::Slurp you do not need a separate file handle; it slurps the content into a variable.
As it is also good practice to close opened file handles as soon as possible, this should be the preferred way if you really only need the complete file content.
With File::Slurp:
use strict;
use warnings;
use autodie;
use File::Slurp;
sub my_slurp {
    my ($fname) = @_;
    my $content = read_file($fname);
    print $content; # or do something else with $content
    return 1;
}
my $filename = <STDIN>;
chomp $filename; # remove the trailing newline before using it as a file name
my_slurp($filename);
exit 0;
Without extra modules:
use strict;
use warnings;
use autodie;
sub my_handle {
    my ($handle) = @_;
    my $content = '';
    ## slurp mode
    {
        local $/;
        $content = <$handle>;
    }
    ## or line wise
    #while (my $line = <$handle>){
    #    $content .= $line;
    #}
    print $content; # or do something else with $content
    return 1;
}
my $filename = <STDIN>;
chomp $filename; # remove the trailing newline
open my $fh, '<', $filename;
my_handle($fh); # pass the handle around
close $fh;
exit 0;
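Because a lexical filehandle is an ordinary scalar, the same subroutine can read from a file, an in-memory string, or standard input. The sketch below uses an in-memory handle so it runs without any input file; the sub name count_lines is made up for this example.

```perl
use strict;
use warnings;

# Count the lines available on any readable handle.
sub count_lines {
    my ($handle) = @_;
    my $count = 0;
    while (my $line = <$handle>) {
        $count++;
    }
    return $count;
}

my $text = "first\nsecond\nthird\n";
open my $fh, '<', \$text or die "Can't open in-memory handle: $!";
my $lines = count_lines($fh);
close $fh;

print "$lines lines\n";

# count_lines(\*STDIN) would read standard input in the same way.
```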
I agree with @mugen kenichi; his solution is a better way to do it than building your own. It's often a good idea to use code the community has tested. Anyway, here are the changes you can make to your own program to make it do what you want.
#!/usr/bin/perl
use strict; use warnings;
package Read_FH;
sub read_file {
my $filein = <STDIN>;
chomp $filein; # Remove the newline at the end
open my $fh, '<', $filein or die "could not open $filein for read\n";
# reads each line of the file text one by one
my $content = '';
while (<$fh>) {
# do something
$content .= $_;
}
close $fh;
return $content;
}
# This part only for illustration
package main;
print Read_FH::read_file();
If I run it, it looks like this:
simbabque@geektour:~/scratch$ cat test
this is a
testfile
with blank lines.
simbabque@geektour:~/scratch$ perl test.pl
test
this is a
testfile
with blank lines.