Grep to match all lines of patternfile (perl -e ok too) - perl

I'm looking for a simple/elegant way to grep a file such that every returned line must match every line of a pattern file.
With input file
acb
bc
ca
bac
And pattern file
a
b
c
The command should return
acb
bac
I tried to do this with grep -f but that returns if it matches a single pattern in the file (and not all). I also tried something with a recursive call to perl -ne (foreach line of the pattern file, call perl -ne on the search file and try to grep in place) but I couldn't get the syntax parser to accept a call to perl from perl, so not sure if that's possible.
I thought there's probably a more elegant way to do this, so I thought I'd check. Thanks!
===UPDATE===
Thanks for your answers so far, sorry if I wasn't clear but I was hoping for just a one-line result (creating a script for this seems too heavy, just wanted something quick). I've been thinking about it some more and I came up with this so far:
perl -n -e 'chomp($_); print " | grep $_ "' pattern | xargs echo "cat input"
which prints
cat input | grep a | grep b | grep c
This string is what I want to execute, I just need to somehow execute it now. I tried an additional pipe to eval
perl -n -e 'chomp($_); print " | grep $_ "' pattern | xargs echo "cat input" | eval
Though that gives the message:
xargs: echo: terminated by signal 13
I'm not sure what that means?

One way using perl:
Content of input:
acb
bc
ca
bac
Content of pattern:
a
b
c
Content of script.pl:
use warnings;
use strict;
## Check arguments.
die qq[Usage: perl $0 <input-file> <pattern-file>\n] unless @ARGV == 2;
## Open files.
open my $pattern_fh, qq[<], pop @ARGV or die qq[ERROR: Cannot open pattern file: $!\n];
open my $input_fh, qq[<], pop @ARGV or die qq[ERROR: Cannot open input file: $!\n];
## Variable to save the regular expression.
my $str;
## Read patterns to match, and create a regex, with each string in a positive
## look-ahead.
while ( <$pattern_fh> ) {
chomp;
$str .= qq[(?=.*$_)];
}
my $regex = qr/$str/;
## Read each line of data and test if the regex matches.
while ( <$input_fh> ) {
chomp;
printf qq[%s\n], $_ if m/$regex/o;
}
Run it like:
perl script.pl input pattern
With following output:
acb
bac

Using Perl, I suggest you read all the patterns into an array and compile them. Then you can read through your input file using grep to make sure all of the regexes match.
The code looks like this
use strict;
use warnings;
open my $ptn, '<', 'pattern.txt' or die $!;
my @patterns = map { chomp(my $re = $_); qr/$re/; } grep /\S/, <$ptn>;
open my $in, '<', 'input.txt' or die $!;
while (my $line = <$in>) {
print $line unless grep { $line !~ $_ } @patterns;
}
output
acb
bac
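For the quick one-liner the question asks for, the same idea (load every pattern, print a line only if no pattern fails) can be sketched in awk. This is only a sketch under assumptions: the match_all name and the sample files are made up for the demonstration, each pattern line is treated as an awk extended regular expression, and the pattern file must not be empty.

```shell
#!/bin/sh
# match_all PATTERN_FILE INPUT_FILE   (hypothetical helper name)
# Prints each line of INPUT_FILE that matches every non-empty line of
# PATTERN_FILE. Assumes PATTERN_FILE contains at least one line.
match_all() {
    awk '
        NR == FNR { if ($0 != "") pats[$0]; next }   # first file: collect patterns
        {
            for (p in pats)
                if ($0 !~ p) next                    # one pattern failed: skip this line
            print                                    # every pattern matched
        }
    ' "$1" "$2"
}

# Demonstration with the sample data from the question.
dir=$(mktemp -d)
printf '%s\n' acb bc ca bac > "$dir/input"
printf '%s\n' a b c > "$dir/pattern"
match_all "$dir/pattern" "$dir/input"
rm -rf "$dir"
```

The body of the function is short enough to paste directly on the command line as awk '...' pattern input.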

Another way is to read all the input lines and then start filtering by each pattern:
#!/usr/bin/perl
use strict;
use warnings;
open my $in, '<', 'input.txt' or die $!;
my @matches = <$in>;
close $in;
open my $ptn, '<', 'pattern.txt' or die $!;
for my $pattern (<$ptn>) {
chomp($pattern);
@matches = grep(/$pattern/, @matches);
}
close $ptn;
print @matches;
output
acb
bac

Not grep and not a one liner...
MFILE=file.txt
PFILE=patterns
i=0
while read line; do
let i++
pattern=$(head -$i $PFILE | tail -1)
if [[ $line =~ $pattern ]]; then
echo $line
fi
# (or use sed instead of bash regex:
# echo $line | sed -n "/$pattern/p")
done < $MFILE

A bash(Linux) based solution
#!/bin/sh
INPUTFILE=input.txt #Your input file
PATTERNFILE=patterns.txt # file with patterns
# replace new line with '|' using awk
PATTERN=`awk 'NR==1{x=$0;next}NF{x=x"|"$0}END{print x}' "$PATTERNFILE"`
PATTERNCOUNT=`wc -l <"$PATTERNFILE"`
# build regex of style :(a|b|c){3,}
PATTERN="($PATTERN){$PATTERNCOUNT,}"
egrep "${PATTERN}" "${INPUTFILE}"

Here's a grep-only solution:
#!/bin/sh
foo ()
{
FIRST=1
cat pattern.txt | while read line; do
if [ $FIRST -eq 1 ]; then
FIRST=0
echo -n "grep \"$line\""
else
echo -n "$STRING | grep \"$line\""
fi
done
}
STRING=`foo`
eval "cat input.txt | $STRING"
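On the update in the question: eval is a shell builtin that takes the command text as its arguments and does not read standard input, so nothing consumes what xargs echo writes. Signal 13 is SIGPIPE, which echo gets when it writes to a pipe that nobody is reading. A minimal sketch of handing the generated pipeline to eval as an argument follows; the grep_all name and the sample files are hypothetical, and the sketch assumes no pattern contains a single quote.

```shell
#!/bin/sh
# grep_all PATTERN_FILE INPUT_FILE   (hypothetical helper name)
# Builds "grep -e 'p1' | grep -e 'p2' | ... | cat" from the pattern file
# and runs it on INPUT_FILE. Assumes no pattern contains a single quote.
grep_all() {
    # One "grep -e 'PATTERN' |" per pattern line, joined onto one line.
    pipeline=$(sed "s/.*/grep -e '&' |/" "$1" | tr '\n' ' ')
    # eval receives the command text as an argument, not on standard input.
    # The trailing cat closes the last pipe in the generated text.
    eval "$pipeline cat" < "$2"
}

# Demonstration with the sample data from the question.
dir=$(mktemp -d)
printf '%s\n' acb bc ca bac > "$dir/input"
printf '%s\n' a b c > "$dir/pattern"
grep_all "$dir/pattern" "$dir/input"
rm -rf "$dir"
```

Since eval runs whatever text it is given, this is only reasonable when you control the pattern file.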

Related

Getting error while replacing word using perl

I am writing a script for replacing 2 words from a text file. The script is
count=1
for f in *.pdf
do
filename="$(basename $f)"
filename="${filename%.*}"
filename="${filename//_/ }"
echo $filename
echo $f
perl -pe 's/intime_mean_pu.pdf/'$f'/' fig.tex > fig_$count.tex
perl -pi 's/TitleFrame/'$filename'/' fig_$count.tex
sed -i '/Pointer-rk/r fig_'$count'.tex' $1.tex
count=$((count+1))
done
But the replacing of words using the second perl command is giving error:
Can't open perl script "s/TitleFrame/Masses1/": No such file or directory
Please suggest what I am doing wrong.
You could change your script to something like this:
#!/bin/bash
count=1
for f in *.pdf; do
filename=$(basename "$f" .pdf)
filename=${filename//_/ }
perl -spe 's/intime_mean_pu.pdf/$a/;
s/TitleFrame/$b/' < fig.tex -- -a="$f" -b="$filename" > "fig_$count.tex"
sed -i "/Pointer-rk/r fig_$count.tex" "$1.tex"
((++count))
done
As well as some other minor changes to your script, I have made use of the -s switch to Perl, which means that you can pass arguments to the one-liner. The bash variables have been double quoted to avoid problems with spaces in filenames, etc.
Alternatively, you could do the whole thing in Perl:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use File::Basename;
my $file_arg = shift;
my $count = 1;
for my $f (glob "*.pdf") {
my $name = fileparse($f, qq(.pdf));
open my $in, "<", $file_arg;
open my $out, ">", 'tmp';
open my $fig, "<", 'fig.tex';
# copy up to match
while (<$in>) {
print $out $_;
last if /Pointer-rk/;
}
# insert contents of figure (with substitutions)
while (<$fig>) {
s/intime_mean_pu.pdf/$f/;
s/TitleFrame/$name/;
print $out $_;
}
# copy rest of file
print $out $_ while <$in>;
rename 'tmp', $file_arg;
++$count;
}
Use the script like perl script.pl "$1.tex".
You're missing the -e in the second perl call

Perl script: validity of a directory always return false

I want to read some parameters from a file using a Perl script.
I used the grep command to find the values of those parameters.
#!/usr/bin/perl
if( scalar @ARGV ==0)
{
die "Number of argument is zero. Please use: perl <script_name> <cdc68.ini> \n";
}
if( !-f $ARGV[0]) ## if the 1st arg is not a file
{
die "$ARGV[0] is not a valid file type \nInput Arguments not correct!!! \n";
}
my $file_cnf=$ARGV[0];
my $DEST_PATH=`grep relogin_logs $file_cnf | cut -d "=" -f2`;
my $SRC_PATH=`grep dump_logs $file_cnf | cut -d "=" -f2`;
my $FINAL_LOG=`grep final_log $file_cnf | cut -d "=" -f2`;
print "\n$DEST_PATH \n $SRC_PATH \n $FINAL_LOG\n";
if ( !-d $DEST_PATH)
{
die "$DEST_PATH is not a dir";
}
else
{
print "ok";
}
the file I want to read is
cat cdc68.ini
reconn_interval=15
relogin_logs=/osp/local/home/linus/pravej/perlD/temp/relogin/
dump_logs=/osp/local/home/linus/pravej/perlD/temp/
final_log=/osp/local/home/linus/pravej/perlD/final_log/
Number_days=11
Sample output:
perl readconfig.pl cdc68.ini
/osp/local/home/linus/pravej/perlD/temp/relogin/
/osp/local/home/linus/pravej/perlD/temp/
/osp/local/home/linus/pravej/perlD/temp/relogin/
is not a dir at readconfig.pl line 26.
Can anyone suggest what I'm doing wrong here?
Please note that I don't want to use any Perl module like config or tiny.pm.
Also, these directories already exist on my Unix system.
Thanks in advance for your help
You can use perl parsing instead of shell utils
my $file_cnf = $ARGV[0];
open my $fh, "<", $file_cnf or die $!;
my %ini = map /(.+?)=(.+)/, <$fh>;
close $fh;
print "\n$ini{relogin_logs} \n $ini{dump_logs} \n $ini{final_log}\n";
mpapec's alternative parsing is much better than your current script. But, for what it's worth, your original problem is that the result of the backtick operator includes the newline character at the end of the string.
You can remove it with chomp, eg:
my $DEST_PATH=`grep relogin_logs $file_cnf | cut -d "=" -f2`;
chomp $DEST_PATH;
Do this with the result of all your backtick commands.
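If you do stay with external commands, note also that grep relogin_logs matches that text anywhere on a line, including inside a value. Below is a sketch of an exact key lookup with awk, assuming plain key=value lines as in your cdc68.ini; the ini_value name is hypothetical. In a shell script $(...) strips the trailing newline for you, but output captured with Perl backticks still needs the chomp.

```shell
#!/bin/sh
# ini_value KEY FILE   (hypothetical helper name)
# Prints the value of the first "KEY=value" line whose key is exactly KEY.
ini_value() {
    awk -F= -v key="$1" '
        $1 == key {
            print substr($0, length(key) + 2)   # everything after the first "="
            exit
        }
    ' "$2"
}

# Demonstration with lines taken from the question.
cfg=$(mktemp)
printf '%s\n' \
    'reconn_interval=15' \
    'relogin_logs=/osp/local/home/linus/pravej/perlD/temp/relogin/' \
    'dump_logs=/osp/local/home/linus/pravej/perlD/temp/' > "$cfg"
dest=$(ini_value relogin_logs "$cfg")
printf '[%s]\n' "$dest"    # the brackets show there is no trailing newline
rm -f "$cfg"
```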

How to add blank line after every grep result using Perl?

How to add a blank line after every grep result?
For example, grep -o "xyz" may give something like -
file1:xyz
file2:xyz
file2:xyz2
file3:xyz
I want the output to be like this -
file1:xyz

file2:xyz
file2:xyz2

file3:xyz
I would like to do something like
grep "xyz" | perl (code to add a new line after every grep result)
This is the direct answer to your question:
grep 'xyz' | perl -pe 's/$/\n/'
But this is better:
perl -ne 'print "$_\n" if /xyz/'
EDIT
Ok, after your edit, you want (almost) this:
grep 'xyz' * | perl -pe 'print "\n" if /^([^:]+):/ && ! $seen{$1}++'
If you don’t like the blank line at the beginning, make it:
grep 'xyz' * | perl -pe 'print "\n" if /^([^:]+):/ && ! $seen{$1}++ && $. > 1'
NOTE: This won’t work right on filenames with colons in them. :)
If you want to use perl, you could do something like
grep "xyz" | perl -p -e 's/(.*)/\1\n/g'
If you want to use sed (where I seem to have gotten better results), you could do something like
grep "xyz" | sed 's/.*/\0\n/g'
This prints a newline after every single line of grep output:
grep "xyz" | perl -pe 'print "\n"'
This prints a newline in between results from different files. (Answering the question as I read it.)
grep 'xyz' * | perl -pe '/(.*?):/; if ($f ne $1) {print "\n"; $f=$1}'
Use a state machine to determine when to print a blank line:
#!/usr/bin/env perl
use strict;
use warnings;
# state variable to determine when to print a blank line
my $prev_file = '';
# change DATA to the appropriate input file handle
while( my $line = <DATA> ){
# did the state change?
if( my ( $file ) = $line =~ m{ \A ([^:]*) \: .*? xyz }msx ){
# blank lines between states
print "\n" if $file ne $prev_file && length $prev_file;
# set the new state
$prev_file = $file;
}
# print every line
print $line;
}
__DATA__
file1:xyz
file2:xyz
file2:xyz2
file3:xyz
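The same state-tracking idea also fits in a short awk filter, if a non-Perl tool is acceptable. This is a sketch rather than a drop-in replacement: separate_by_file is a hypothetical name, and like the other answers it assumes the file name is everything before the first colon, so file names containing colons will confuse it.

```shell
#!/bin/sh
# separate_by_file   (hypothetical helper name)
# Reads "file:match" lines on standard input and prints a blank line
# whenever the part before the first colon changes.
separate_by_file() {
    awk -F: 'NR > 1 && $1 != prev { print "" } { prev = $1; print }'
}

# Demonstration with the sample lines from the question.
printf '%s\n' file1:xyz file2:xyz file2:xyz2 file3:xyz | separate_by_file
```

Typical use would be grep 'xyz' * | separate_by_file.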

Perl search for first occurrence of pattern in directory

I have a directory with a list of image header files of the format
image1.hd
image2.hd
image3.hd
image4.hd
I want to search for the regular expression Image type:=4 in the directory and find the file number which has the first occurrence of this pattern. I can do this with a couple of pipes easily in bash:
grep -l 'Image type:=4' image*.hd | sed ' s/.*image\(.*\).hd/\1/' | head -n1
which returns 1 in this case.
This pattern match will be used in a perl script. I know I could use
my $number = `grep -l 'Image type:=4' image*.hd | sed ' s/.*image\(.*\).hd/\1/' | head -n1`
but is it preferable to use pure perl in such cases? Here is the best I could come up with using perl. It is very cumbersome:
my $tmp;
#want to find the planar study in current study
foreach (glob "$DIR/image*.hd"){
$tmp = $_;
open FILE, "<", "$_" or die $!;
while (<FILE>)
{
if (/Image type:=4/){
$tmp =~ s/.*image(\d+).hd/$1/;
}
}
close FILE;
last;
}
print "$tmp\n";
this also returns the desired output of 1. Is there a more effective way of doing this?
This is simple with the help of a couple of utility modules
use strict;
use warnings;
use File::Slurp 'read_file';
use List::MoreUtils 'firstval';
print firstval { read_file($_) =~ /Image type:=4/ } glob "$DIR/image*.hd";
But if you are restricted to core Perl, then this will do what you want
use strict;
use warnings;
my $firstfile;
while (my $file = glob 'E:\Perl\source\*.pl') {
open my $fh, '<', $file or die $!;
local $/;
if ( <$fh> =~ /Image type:=4/) {
$firstfile = $file;
last;
}
}
print $firstfile // 'undef';
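Note that the loop in the question only ever looks at the first file, because last runs unconditionally after the first file has been read. The intended logic (stop at the first file that contains the pattern, then extract its number) is sketched below in shell for comparison. The first_match_number name is hypothetical, and the files are visited in the shell's glob order, so image10.hd sorts before image2.hd.

```shell
#!/bin/sh
# first_match_number DIR   (hypothetical helper name)
# Prints the number N of the first DIR/imageN.hd that contains the text
# "Image type:=4" and returns 0; returns 1 if no file matches.
first_match_number() {
    for f in "$1"/image*.hd; do
        [ -f "$f" ] || continue              # glob matched nothing
        if grep -q 'Image type:=4' "$f"; then
            n=${f##*/image}                  # strip directory and "image" prefix
            n=${n%.hd}                       # strip ".hd" suffix
            printf '%s\n' "$n"
            return 0                         # stop at the first matching file
        fi
    done
    return 1
}

# Demonstration with made-up header files.
dir=$(mktemp -d)
printf 'Image type:=1\n' > "$dir/image1.hd"
printf 'Image type:=4\n' > "$dir/image2.hd"
printf 'Image type:=4\n' > "$dir/image3.hd"
first_match_number "$dir"
rm -rf "$dir"
```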

How can I tell if a filehandle is empty in Perl?

For example:
open (PS , " tail -n 1 $file | grep win " );
I want to find whether the file handle is empty or not.
You can also use eof to check whether a file handle is exhausted. Here is an illustration based loosely on your code. Also note the use of a lexical file handle with the 3-arg form of open.
use strict;
use warnings;
my ($file_name, $find, $n) = @ARGV;
open my $fh, '-|', "tail -n $n $file_name | grep $find" or die $!;
if (eof $fh){
print "No lines\n";
}
else {
print <$fh>;
}
Although calling eof before you attempt to read from it produces the result you expect in this particular case, give heed to the advice at the end of the perlfunc documentation on eof:
Practical hint: you almost never need to use eof in Perl, because the input operators typically return undef when they run out of data, or if there was an error.
Your command will produce at most one line, so stick it in a scalar, e.g.,
chomp(my $gotwin = `tail -n 1 $file | grep win`);
Note that the exit status of grep tells you whether your pattern matched:
2.3 Exit Status
Normally, the exit status is 0 if selected lines are found and 1 otherwise …
Also, tail exits 0 on success or non-zero on failure. Use that information to your advantage:
#! /usr/bin/perl
use strict;
use warnings;
my $file = "input.dat";
chomp(my $gotwin = `tail -n 1 $file | grep win`);
my $status = $? >> 8;
if ($status == 1) {
print "$0: no match [$gotwin]\n";
}
elsif ($status == 0) {
print "$0: hit! [$gotwin]\n";
}
else {
die "$0: command pipeline exited $status";
}
For example:
$ > input.dat
$ ./prog.pl
./prog.pl: no match []
$ echo win >input.dat
$ ./prog.pl
./prog.pl: hit! [win]
$ rm input.dat
$ ./prog.pl
tail: cannot open `input.dat' for reading: No such file or directory
./prog.pl: no match []
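The last run above still reports "no match" even though tail failed, because the status of a shell pipeline is that of its last command, which is grep here. A sketch that keeps the unreadable-file case separate by testing the file first is shown below in plain shell; the last_line_has name and its return codes are assumptions made for this illustration.

```shell
#!/bin/sh
# last_line_has FILE PATTERN   (hypothetical helper name)
# Returns 0 if the last line of FILE matches PATTERN, 1 if it does not
# (including when FILE is empty), and 2 if FILE cannot be read.
last_line_has() {
    [ -r "$1" ] || return 2
    # The function's status is the status of the pipeline, i.e. of grep.
    tail -n 1 "$1" | grep -q -e "$2"
}

# Demonstration with a temporary file.
f=$(mktemp)
echo win > "$f"
if last_line_has "$f" win; then
    echo "hit"
else
    echo "no match"
fi
rm -f "$f"
```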
open (PS,"tail -n 1 $file|");
if($l=<PS>)
{print"$l"}
else
{print"$file is empty\n"}
well ... scratch this ... I didn't make the connection about the filehandle actually being the output of a pipe.
You should use stat to determine the size of a file but you're going to need to
ensure the file is flushed first:
#!/usr/bin/perl
my $fh;
open $fh, ">", "foo.txt" or die "cannot open foo.txt - $!\n";
my $size = (stat $fh)[7];
print "size of file is $size\n";
print $fh "Foo";
$size = (stat $fh)[7];
print "size of file is $size\n";
$fh->flush;
$size = (stat $fh)[7];
print "size of file is $size\n";
close $fh;