Dynamic Perl find and replace using grep inside backticks - perl

I am trying to do a dynamic search and replace with Perl on the command line with part of the replacement text being the output of a grep command within backticks. Is this possible to do on the command line, or will I need to write a script to do this?
Here is the command that I thought would do the trick. I thought that Perl would treat the backticks as a command substitution, but instead it just treats the backticks and the content within them as a string:
perl -p -i -e 's/example.xml/http:\/\/exampleURL.net\/`grep -ril "example_needle" *`\/example\/path/g' `grep -ril "example_needle" *`
UPDATE:
Thanks for the helpful answers. Yes, there was a typo in my original one-liner: the target file of grep is supposed to be *.
I wrote a small script based on Schwern's example, but I am getting confusing results. Here is the script I wrote:
#!/usr/bin/env perl -p -i
my $URL_First = "http://examplesite.net/some/path/";
my $URL_Last = "/example/example.xml";
my @files = `grep -ril $URL_Last .`;
chomp @files;
foreach my $val (@files) {
@dir_names = split('/',$val);
if(@dir_names[1] ne $0) {
my $url = $URL_First . @dir_names[1] . $URL_Last;
open INPUT, "+<$val" or die $!;
seek INPUT,0,0;
while(<INPUT>) {
$_ =~ s{\Q$URL_Last}{$url}g;
print INPUT $_;
}
close INPUT;
}
}
Basically what I am trying to do is:
Find files that contain $URL_Last.
Replace $URL_Last with $URL_First plus the name of the directory that the matched file is in, plus $URL_Last.
Write the above change to the input file without modifying anything else in the input file.
After running my script, it completely garbled the HTML code in the input file and it cut off the first few characters of each line in the file. This is strange, because I know for sure that $URL_Last only occurs once in each file, so it should only be matched once and replaced once. Is this being caused by a misuse of the seek function?

You should use another delimiter for s/// so that you don't need to escape slashes in the URL:
perl -p -i -e '
s#example.xml#http://exampleURL.net/`grep -ril "example_needle"`/example/path#g'
`grep -ril "example_needle" *`
Your grep command inside the regex will not be executed, as it is just a string, and backticks are not meta characters. Text inside a substitution will act as though it was inside a double quoted string. You'd need the /e flag to execute the shell command:
perl -p -i -e '
s#example.xml#
qq(http://exampleURL.net/) . `grep -ril "example_needle"` . qq(/example/path)
#ge'
`grep -ril "example_needle" *`
However, what exactly are you expecting that grep command to do? It lacks a target file. -l will print file names for matching files, and grep without a target file will use stdin, which I suspect will not work.
If it is a typo, and you meant to use the same grep as for your argument list, why not use @ARGV?
perl -p -i -e '
s#example.xml#http://exampleURL.net/@ARGV/example/path#g'
`grep -ril "example_needle" *`
This may or may not do what you expect. An array interpolated into the replacement is joined with spaces, and under -p @ARGV holds only the files that have not been opened yet. The name of the file currently being processed is in $ARGV.

It seems like what you're trying to do is...
Find a file in a tree which contains a given string.
Use that file to build a URL.
Replace something in a string with that URL.
You have three parts, and you could jam them together into one regex, but it's much easier to do it in three steps. You won't hate yourself in a week when you need to add to it.
The first step is to get the filenames.
# grep -r needs a directory to search, even if it's just the current one
my @files = `grep -ril $search .`;
# strip the newlines off the filenames
chomp @files;
Then you need to decide what to do if you get more than one file from grep. I'll leave that choice up to you, I'm just going to take the first one.
my $file = $files[0];
Then build the URL. Easy enough...
# Put it in a variable so it can be configured
my $Site_URL = "http://www.example.com/";
my $url = $Site_URL . $file;
To do anything more complicated, you'd use URI.
Now the search and replace is trivial.
# The \Q means meta-characters like . are ignored. Better than
# remembering to escape them all.
$whatever =~ s{\Qexample.xml}{$url}g;
You want to edit files using -p and -i. Fortunately we can emulate that functionality.
#!/usr/bin/env perl
use strict;
use warnings; # never do without these
my $Site_URL = "http://www.example.com/";
my $Search = "example-search";
my $To_Replace = "example.xml";
# Set $^I to edit files. With no argument, just show the output
# script.pl .bak # saves backup with ".bak" extension
$^I = shift;
my @files = `grep -ril $Search .`;
chomp @files;
my $file = $files[0];
my $url = $Site_URL . $file;
@ARGV = ($files[0]); # set the file up for editing
while (<>) {
s{\Q$To_Replace}{$url}g;
print; # goes into the file when $^I is set, otherwise to STDOUT
}

Everyone's answers were very helpful to my writing a script that wound up working for me. I actually found a bash script solution yesterday, but wanted to post a Perl answer in case anyone else finds this question through Google.
The script that @TLP posted at http://codepad.org/BFpIwVtz is an alternative way of doing this.
Here is what I ended up writing:
#!/usr/bin/perl
use Tie::File;
my $URL_First = 'http://example.com/foo/bar/';
my $Search = 'path/example.xml';
my $URL_Last = '/path/example.xml';
# This grep returns a list of files containing "path/example.xml"
my @files = `grep -ril $Search .`;
chomp @files;
foreach my $File_To_Edit (@files) {
# The output of $File_To_Edit looks like this: "./some_path/index.html"
# I only need the "some_path" part, so I'm going to split up the output and only use @output[1] ("some_path")
@output = split('/',$File_To_Edit);
# "some_path" is the parent directory of "index.html", so I'll call this "$Parent_Dir"
my $Parent_Dir = @output[1];
# Make sure that we don't edit the contents of this script by checking that $Parent_Dir doesn't equal our script's file name.
if($Parent_Dir ne $0) {
# The $File_To_Edit is "./some_path/index.html"
tie @lines, 'Tie::File', $File_To_Edit or die "Can't read file: $!\n";
foreach(@lines) {
# Finally replace "path/example.xml" with "http://example.com/foo/bar/some_path/path/example.xml" in the $File_To_Edit
s{$Search}{$URL_First$Parent_Dir$URL_Last}g;
}
untie @lines;
}
}

Related

Executing grep via Perl

I am new to Perl. I am trying to execute grep command with perl.
I have to read input from a file and based on the input, the grep has to be executed.
My code is as follows:
#!/usr/bin/perl
use warnings;
use strict;
#Reading input files line by line
open FILE, "input.txt" or die $!;
my $lineno = 1;
while (<FILE>) {
print " $_";
#This is what expected.
#our $result=`grep -r Unable Satheesh > out.txt`;
our $result=`grep -r $_ Satheesh > out.txt`;
print $result
}
print "************************************************************\n";
But when I run the script, it looks like an infinite loop: the script keeps waiting and nothing is written to the out.txt file.
The reason it's hanging is because you forgot to use chomp after reading from FILE. So there's a newline at the end of $_, and it's executing two shell commands:
grep -r $_
Satheesh > out.txt
Since there's no filename argument to grep, it's reading from standard input, i.e. the terminal. If you type Ctrl-D when it hangs, you'll then get an error message telling you that there's no Satheesh command.
Also, since you're redirecting the output of grep to out.txt, nothing gets put in $result. If you want to capture the output in a variable and also put it into the file, you can use the tee command.
Here's the fix:
while (<FILE>) {
print " $_";
chomp;
#This is what expected.
#our $result=`grep -r Unable Satheesh > out.txt`;
our $result=`grep -r $_ Satheesh | tee out.txt`;
print $result
}
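To see how the stray newline splits the command, here is a small sketch in plain shell. It assumes the line read from input.txt was "Unable" plus a newline, and it uses printf as a stand-in for grep so that the arguments the shell actually passes become visible.

```shell
# A line read from input.txt still ends in a newline (hypothetical value).
raw='Unable
'
# Drop the last character, which is what chomp does to the newline.
clean=${raw%?}

# printf stands in for grep and prints each argument it receives in brackets.
split_status=0
split_output=$(sh -c "printf '[%s]' grep -r $raw Satheesh" 2>/dev/null) || split_status=$?

joined_status=0
joined_output=$(sh -c "printf '[%s]' grep -r $clean Satheesh" 2>/dev/null) || joined_status=$?

printf '%s (exit %s)\n' "$split_output" "$split_status" "$joined_output" "$joined_status"
```

With the newline in place the first command never sees Satheesh, and the shell then tries to run Satheesh as a command of its own and fails. After the newline is removed, everything arrives as one argument list.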

Perl script in bash's HereDoc

Is it possible to write a Perl script inside a bash script as a heredoc?
This does not work (example only):
#!/bin/bash
perl <<EOF
while(<>) {
chomp;
print "xxx: $_\n";
}
EOF
Is there a nice way to embed a Perl script in a bash script? I want to run the Perl script from a bash script without putting it in an external file.
The problem here is that the script is being passed to perl on stdin, so trying to process stdin from the script doesn't work.
1. String literal
perl -e '
while(<>) {
chomp;
print "xxx: $_\n";
}
'
Using a string literal is the most direct way to write this, though it's not ideal if the Perl script contains single quotes itself.
2. Use perl -e
#!/bin/bash
script=$(cat <<'EOF'
while(<>) {
chomp;
print "xxx: $_\n";
}
EOF
)
perl -e "$script"
If you pass the script to perl using perl -e then you won't have the stdin problem and you can use any characters you like in the script. It's a bit roundabout to do this, though. Heredocs yield input on stdin and we need strings. What to do? Oh, I know! This calls for $(cat <<HEREDOC).
Make sure to use <<'EOF' rather than just <<EOF to keep bash from doing variable interpolation inside the heredoc.
You could also write this without the $script variable, although it's getting awfully hairy now!
perl -e "$(cat <<'EOF'
while(<>) {
chomp;
print "xxx: $_\n";
}
EOF
)"
3. Process substitution
perl <(cat <<'EOF'
while(<>) {
chomp;
print "xxx: $_\n";
}
EOF
)
Along the lines of #2, you can use a bash feature called process substitution which lets you write <(cmd) in place of a file name. If you use this you don't need the -e since you're now passing perl a file name rather than a string.
You know I never thought of this.
The answer is "YES!" it does work. As others have mentioned, <STDIN> can't be used, but this worked fine:
$ perl <<'EOF'
print "This is a test\n";
for $i ( (1..3) ) {
print "The count is $i\n";
}
print "End of my program\n";
EOF
This is a test
The count is 1
The count is 2
The count is 3
End of my program
In Kornshell and in BASH, if you surround your end of here document string with single quotes, the here document isn't interpolated by the shell.
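The effect of quoting the heredoc tag is easy to check without Perl at all. In this sketch the variable name and its value are made up for illustration, and cat simply echoes back what the heredoc would hand to perl.

```shell
name="from the shell"   # hypothetical shell variable

# Unquoted tag: the shell expands $name before the text is passed on.
unquoted=$(cat <<EOF
print "$name\n";
EOF
)

# Quoted tag: the text is passed through untouched, so Perl would see $name.
quoted=$(cat <<'EOF'
print "$name\n";
EOF
)

printf '%s\n' "$unquoted" "$quoted"
```

In both cases the \n survives as written, because the shell only treats a backslash specially in an unquoted heredoc when it precedes $, a backtick, another backslash, or a newline.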
Just a small correction to @John Kugelman's answer. You can eliminate the useless cat and use:
read -r -d '' perlscript <<'EOF'
while(<>) {
chomp;
print "xxx: $_\n";
}
EOF
perl -e "$perlscript"
Here's another way to use a Perl HEREDOC script within bash, and take full advantage of it.
#!/bin/sh
# If you are not passing bash vars, single-quote the HEREDOC tag
perl -le "$(cat <<'MYPL'
# Best to build your out vars rather than writing directly
# to the pipe until the end.
my ($STDERRdata, $STDOUTdata) = ("", "");
while (my $i = <STDIN>) { chomp $i;
$STDOUTdata .= "To stdout\n";
$STDERRdata .= "Write from within the heredoc\n";
}
print $STDOUTdata; # Doing the pipe write at the end
warn $STDERRdata; # will save you a lot of frustration.
MYPL
)" [optional args] <myInputFile 1>prints.txt 2>warns.txt
or
#!/bin/sh
WRITEWHAT="bash vars"
# If you want to include your bash vars,
# escape the $'s that are not bash vars and leave the HEREDOC tag unquoted
perl -le "$(cat <<MYPL
my (\$STDERRdata, \$STDOUTdata) = ("", "");
while (my \$i = <STDIN>) { chomp \$i;
\$STDOUTdata .= "To stdout\n";
\$STDERRdata .= "Write $WRITEWHAT from within the heredoc\n";
}
print \$STDOUTdata; # Doing the pipe write at the end
warn \$STDERRdata; # will save you a lot of frustration.
MYPL
)" [optional args] <myInputFile 1>prints.txt 2>warns.txt

File comparison with multiple columns

I am doing a directory cleanup to check for files that are not being used in our testing environment. I have a list of all the file names which are sorted alphabetically in a text file and another file I want to compare against.
Here is how the first file is setup:
test1.pl
test2.pl
test3.pl
It is a simple, one script name per line text file of all the scripts in the directory I want to clean up based on the other file below.
The file I want to compare against is a tab-delimited file which lists the scripts that each server runs as tests, and there are obviously many duplicates. I want to strip out the testing script names from this file, write them to another file, and run it through sort and uniq so that I can diff it against the file above to see which testing scripts are not being used.
The file is setup as such:
server: : test1.pl test2.pl test3.pl test4.sh test5.sh
Some lines have fewer entries and some have more. My first impulse was to write a Perl script to split each line and push the values onto a list if they are not already there, but that seems wholly inefficient. I am not too experienced in awk, but I figured there is more than one way to do it. Any other ideas for comparing these files?
A Perl solution that makes a %needed hash of the files being used by the servers and then checks against the file containing all the file names.
#!/usr/bin/perl
use strict;
use warnings;
use Inline::Files;
my %needed;
while (<SERVTEST>) {
chomp;
my (undef, @files) = split /\t/;
@needed{ @files } = (1) x @files;
}
while (<TESTFILES>) {
chomp;
if (not $needed{$_}) {
print "Not needed: $_\n";
}
}
__TESTFILES__
test1.pl
test2.pl
test3.pl
test4.pl
test5.pl
__SERVTEST__
server1:: test1.pl test3.pl
server2:: test2.pl test3.pl
__END__
*** prints
C:\Old_Data\perlp>perl t7.pl
Not needed: test4.pl
Not needed: test5.pl
This uses awk to print the file names from the second file one per line, then diffs the sorted, de-duplicated output against the first file.
diff file1 <(awk '{ for (i=3; i<=NF; i++) print $i }' file2 | sort -u)
Quick and dirty script to do the job. If it sounds good, use open to read the files with proper error checking.
use strict;
use warnings;
my @server_lines = `cat server_file`;chomp(@server_lines);
my @test_file_lines = `cat test_file_lines`;chomp(@test_file_lines);
foreach my $server_line (@server_lines){
$server_line =~ s!server: : !!is;
my @files_to_check = split(/\s+/is, $server_line);
foreach my $file_to_check (@files_to_check){
my @found = grep { /$file_to_check/ } @test_file_lines;
if (scalar(#found)==0){
print "$file_to_check is not found in $server_line\n";
}
}
}
If I understand your need correctly you have a file with a list of tests (testfiles.txt):
test1.pl
test2.pl
test3.pl
test4.pl
test5.pl
And a file with a list of servers, with files they all test (serverlist.txt):
server1: : test1.pl test3.pl
server2: : test2.pl test3.pl
(Where I have assumed all spaces as tabs).
If you convert the second file into a list of tested files, you can then compare this using diff to your original file.
cut -d: -f3 serverlist.txt | sed -e 's/^\t//g' | tr '\t' '\n' | sort -u > tested_files.txt
The cut removes the server name and ':', the sed removes the leading tab left behind, tr then converts the remaining tabs into newlines, then we do a unique sort to sort and remove duplicates. This is output to tested_files.txt.
Then all you do is diff testfiles.txt tested_files.txt.
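Here is a self-contained run of that idea on the sample data, as a rough sketch. It assumes tab-separated fields as above, writes everything to a temporary directory, and swaps two details: empty lines are dropped with grep -v instead of the sed step (so it does not depend on sed understanding \t), and comm -23 is used instead of diff to list only the scripts that no server runs.

```shell
workdir=$(mktemp -d)

# Sample inputs: one script per line, and tab-separated server lines.
printf '%s\n' test1.pl test2.pl test3.pl test4.pl test5.pl > "$workdir/testfiles.txt"
printf 'server1:\t:\ttest1.pl\ttest3.pl\nserver2:\t:\ttest2.pl\ttest3.pl\n' > "$workdir/serverlist.txt"

# Field 3 (everything after the second colon) holds the script names.
cut -d: -f3 "$workdir/serverlist.txt" |
  tr '\t' '\n' |      # one name per line
  grep -v '^$' |      # drop the empty line left by the leading tab
  sort -u > "$workdir/tested_files.txt"

# comm -23 prints lines that appear only in the first (sorted) file.
unused=$(comm -23 "$workdir/testfiles.txt" "$workdir/tested_files.txt")
printf '%s\n' "$unused"

rm -rf "$workdir"
```

Both inputs to comm must be sorted the same way, which holds here because the script list is already alphabetical and the second file comes out of sort -u.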
It's hard to tell since you didn't post the expected output but is this what you're looking for?
$ cat file1
test1.pl
test2.pl
test3.pl
$
$ cat file2
server: : test1.pl test2.pl test3.pl test4.sh test5.sh
$
$ gawk -v RS='[[:space:]]+' 'NR==FNR{f[$0]++;next} FNR>2 && !f[$0]' file1 file2
test4.sh
test5.sh

Perl - One liner file edit: "perl -n -i.bak -e "print unless /^$id$,/" $filetoopen;" Not working

I cannot get this to work.
#!/usr/bin/perl -w
use strict;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);
my $id='123456';
my $filetoopen = '/home/user/public/somefile.txt';
file contains:
123456
234564
364899
437373
So...
A bunch of other subs and code
if(-s $filetoopen){
perl -n -i.bak -e "print unless /^$id$,/" $filetoopen;
}
I need to remove the line that matches $id from file $filetoopen
But, I don't want script to "crash" if $id is not in $filetoopen either.
This is in a .pl scripts sub, not being run from command line.
I think I am close but, after reading for hours here, I had to resort to posting the question.
Will this even work in a script?
I tried Tie::File with success, but I need to know how to do this without Tie::File.
When I tried I got the error:
syntax error at mylearningcurve.pl line 456, near "bak -e "
Thanks for teaching this old dog...
First of all (this is not the cause of your problem) $, (aka $OUTPUT_FIELD_SEPARATOR) defaults to undef, I'm not sure why you are using it in the regex. I have a feeling the comma was a typo.
It's unclear if you are calling this from a shell script or from Perl?
If from Perl, you should not call a nested Perl interpreter at all.
If the file is small, slurp it in and print:
use File::Slurp;
my @lines = read_file($filename);
write_file($filename, grep { ! /^$id$/ } @lines);
If the file is large, read line by line as a filter.
use File::Copy;
move($filename, "$filename.old") or die "Can not rename: $!\n";
open(my $fh_old, "<", "$filename.old") or die "Can not open $filename.old: $!\n";
open(my $fh, ">", $filename) or die "Can not open $filename: $!\n";
while (my $line = <$fh_old>) {
next if $line =~ /^$id$/;
print $fh $line;
}
close($fh_old);
close($fh);
If from a shell script, this worked for me:
$ cat x1
123456
234564
364899
437373
$ perl -n -i.bak -e "print unless /^$id$/" x1
$ cat x1
234564
364899
437373
if(-s $filetoopen){
perl -n -i.bak -e "print unless /^$id$,/" $filetoopen;
}
I'm not at all sure what you expect this to do. You can't just put a command line program in the middle of Perl code. You need to use system to call an external program. And Perl is just an external program like any other.
if(-s $filetoopen){
system('perl', '-n', '-i.bak', '-e', "print unless /^$id\$/", $filetoopen);
}
The functionality of the -i command line argument can be accessed via $^I.
local @ARGV = $filetoopen;
local $^I = '.bak';
local $_;
while (<>) {
print if !/^$id$/;
}

How can I grep for a value from a shell variable?

I've been trying to grep an exact shell 'variable' using word boundaries,
grep "\<$variable\>" file.txt
but haven't managed to; I've tried everything else but haven't succeeded.
Actually I'm invoking grep from a Perl script:
$attrval=`/usr/bin/grep "\<$_[0]\>" $upgradetmpdir/fullConfiguration.txt`
$_[0] and $upgradetmpdir/fullConfiguration.txt contains some matching "text".
But $attrval is empty after the operation.
@OP, you should do that 'grepping' in Perl. Don't call system commands unless there is no other choice.
$mysearch="pattern";
while (<>){
chomp;
@s = split /\s+/;
foreach my $line (@s){
if ($line eq $mysearch){
print "found: $line\n";
}
}
}
I'm not seeing the problem here:
file.txt:
hello
hi
anotherline
Now,
mala@human ~ $ export GREPVAR="hi"
mala@human ~ $ echo $GREPVAR
hi
mala@human ~ $ grep "\<$GREPVAR\>" file.txt
hi
What exactly isn't working for you?
Not every grep supports the ex(1) / vi(1) word boundary syntax.
I think I would just do:
grep -w "$variable" ...
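A quick sketch of what -w changes, using a made-up four-line file in a temporary directory. The -c flag just counts matching lines so the two results are easy to compare.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'hello' 'hi' 'hi there' 'this' > "$workdir/file.txt"

variable="hi"   # hypothetical search term

# Plain match: also counts "this", which merely contains "hi".
substring_hits=$(grep -c "$variable" "$workdir/file.txt")

# -w only counts lines where "hi" appears as a whole word.
word_hits=$(grep -c -w "$variable" "$workdir/file.txt")

printf 'substring: %s, whole word: %s\n' "$substring_hits" "$word_hits"
rm -rf "$workdir"
```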
Using single quotes works for me in tcsh:
grep '<$variable>' file.txt
I am assuming your input file contains the literal string: <$variable>
If variable=foo are you trying to grep for "foo"? If so, it works for me. If you're trying to grep for the variable named "$variable", then change the quotes to single quotes.
On a recent Linux it works as expected. You could try egrep instead.
Say you have
$ cat file.txt
This line has $variable
DO NOT PRINT ME! $variableNope
$variable also
Then with the following program
#! /usr/bin/perl -l
use warnings;
use strict;
system("grep", "-P", '\$variable\b', "file.txt") == 0
or warn "$0: grep exited " . ($? >> 8);
you'd get output of
This line has $variable
$variable also
It uses the -P switch to GNU grep that matches Perl regular expressions. The feature is still experimental, so proceed with care.
Also note the use of system LIST that bypasses shell quoting, allowing the program to specify arguments with Perl's quoting rules rather than the shell's.
You could use the -w (or --word-regexp) switch, as in
system("grep", "-w", '\$variable', "file.txt") == 0
or warn "$0: grep exited " . ($? >> 8);
to get the same result.
It won't work with single quotes. You should use double quotes.
For example:
this won't work
--------------
for i in 1
do
grep '$i' file
done
this will work, because the shell expands $i inside double quotes
--------------
for i in 1
do
grep "$i" file
done
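The same contrast can be made visible with a two-line sample file (made up here). I added -F so that grep treats the pattern as a fixed string; that way the single-quoted case clearly searches for the literal characters $i rather than for the value of the variable.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'line 1' 'price is $i' > "$workdir/file"

i=1

# Single quotes: the shell passes the two characters $i to grep unchanged.
single=$(grep -F '$i' "$workdir/file")

# Double quotes: the shell expands $i to 1 before grep runs.
double=$(grep -F "$i" "$workdir/file")

printf '%s\n' "single: $single" "double: $double"
rm -rf "$workdir"
```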