How to extract a specific string using perl? - perl

I have a set of strings such as "-f /path/filename1.f", "-f $path/filename2.f", etc. in a single file, file.f. I want to read file.f and write /path/filename1.f, $path/filename2.f, etc. to another file.
I tried finding a solution online, but what I found looks like a mess.
Is there a clean and simple solution for this kind of simple pattern search?
Below is the requirement.
Example,
file.f (input file to perl script)
-f /path/filename1.f
-f $path1/filename2.f
-f /path/filename3.f
-f $path2/filename4.f
outputfile.f
/path/filename1.f
$path1/filename2.f
/path/filename3.f
$path2/filename4.f
Basically, I just want the path string from each line of file.f.

Some perl code to solve your problem:
use strict;
use warnings;
open my $fhi, "<", "file.f" or die "Error: $!";
open my $fho, ">", "output.f" or die "Error: $!";
while( <$fhi> ) { # Read each line in $_ variable
s/^-f //; # Remove "-f " at the beginning of $_
print $fho $_; # print $_ to output.f file
}
close $fhi;
close $fho;

The simplest way is using cut:
cut -f2 -d' ' input_file > output_file
Or you can use Perl:
perl -lane 'print $F[1]' input_file > output_file
These solutions extract the second field of the input and print it.
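Splitting on whitespace truncates a path that itself contains a space. As a minimal sketch, assuming every wanted line starts with "-f" followed by whitespace, the rest of the line can be captured instead (the name extract_path is hypothetical, not from the answers above):

```perl
use strict;
use warnings;

# Hypothetical helper: return the text after a leading "-f", or undef
# if the line does not start with "-f". Trailing whitespace is dropped.
sub extract_path {
    my ($line) = @_;
    return $line =~ /^-f\s+(.*\S)/ ? $1 : undef;
}

# Single quotes keep $path1 literal, as it would be when read from file.f.
for my $line ('-f /path/filename1.f', '-f $path1/filename2.f') {
    my $path = extract_path($line);
    print "$path\n" if defined $path;
}
```

Lines without the "-f" prefix are skipped rather than copied through unchanged.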

Look into the below solution -
Here everything after -f will be taken out.
#!/usr/bin/perl
use strict;
use warnings;
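open(my $fh, '<', 'file.f') or die "Cannot open file.f: $!";
while (<$fh>) {
    # Capture everything after the leading "-f" and whitespace
    print "$1\n" if /^-f\s+(.*)/;
}
close $fh;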

Related

Getting error while replacing word using perl

I am writing a script for replacing 2 words from a text file. The script is
count=1
for f in *.pdf
do
filename="$(basename $f)"
filename="${filename%.*}"
filename="${filename//_/ }"
echo $filename
echo $f
perl -pe 's/intime_mean_pu.pdf/'$f'/' fig.tex > fig_$count.tex
perl -pi 's/TitleFrame/'$filename'/' fig_$count.tex
sed -i '/Pointer-rk/r fig_'$count'.tex' $1.tex
count=$((count+1))
done
But the replacing of words using the second perl command is giving error:
Can't open perl script "s/TitleFrame/Masses1/": No such file or directory
Please suggest what I am doing wrong.
You could change your script to something like this:
#!/bin/bash
count=1
for f in *.pdf; do
    filename=$(basename "$f" .pdf)
    filename=${filename//_/ }
    perl -spe 's/intime_mean_pu\.pdf/$a/;
               s/TitleFrame/$b/' -- -a="$f" -b="$filename" < fig.tex > "fig_$count.tex"
    sed -i "/Pointer-rk/r fig_$count.tex" "$1.tex"
    ((++count))
done
As well as some other minor changes to your script, I have made use of the -s switch to Perl, which means that you can pass arguments to the one-liner. The bash variables have been double quoted to avoid problems with spaces in filenames, etc.
Alternatively, you could do the whole thing in Perl:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use File::Basename;
my $file_arg = shift;
my $count = 1;
for my $f (glob "*.pdf") {
my $name = fileparse($f, qq(.pdf));
open my $in, "<", $file_arg;
open my $out, ">", 'tmp';
open my $fig, "<", 'fig.tex';
# copy up to match
while (<$in>) {
print $out $_;
last if /Pointer-rk/;
}
# insert contents of figure (with substitutions)
while (<$fig>) {
s/intime_mean_pu.pdf/$f/;
s/TitleFrame/$name/;
print $out $_;
}
# copy rest of file
print $out $_ while <$in>;
rename 'tmp', $file_arg;
++$count;
}
Use the script like perl script.pl "$1.tex".
You're missing the -e in the second perl call

Perl script, parse text file between words

I have a text file that looks like this:
... //John/box/sandbox/users/abc/project/build/file2
... //John/box/sandbox/users/cde/project/build/file1
... //John/box/sandbox/users/hdf/project/config/file
Using a Perl script, how can I parse this file so that my final output is:
//John/box/sandbox/users/abc/project/
//John/box/sandbox/users/cde/project/
//John/box/sandbox/users/hdf/project/
Basically my ultimate goal is to search for "//" and "project" on the same line and then take everything between them.
Thanks for the fast response. Neither seems to work for me.
I'm using perl 5.8.3 build 809
perl -nle 'print $1 if m#(//.*project/)#;' output.txt
use FileHandle;
use Env;
use Tk;
use File::Copy;
open(DAT, "output.txt") || die("Could not open file!");
my $input = <DAT>;
while (<$input>){
chomp;
print "$1\n" if ($_ =~ /(^\/\/.*project\/)/);
}
Thank you, everyone, for your help. It worked fine once I removed the ^.
For future questions I will include my own attempt. Sorry, this is my first question. Humans make mistakes :)
my $infile = 'in.txt';
open my $input, '<', $infile or die "Can't open $infile: $!";
while (<$input>){
chomp;
print "$1\n" if ($_ =~ /(\/\/.*project\/)/);
}
This is simple enough to do as a command-line filter:
perl -nle'print $1 if m#(//.*project/)#;' output.txt
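One caveat: .* is greedy, so if "project/" appears twice on a line the match runs to the last occurrence. A small sketch with a non-greedy quantifier, assuming the first "project/" after "//" is the one wanted (the name project_prefix is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: return the text from "//" up to and including the
# first "project/", or undef if the line has no such span.
sub project_prefix {
    my ($line) = @_;
    return $line =~ m{(//.*?project/)} ? $1 : undef;
}

# prints //John/box/sandbox/users/abc/project/
print project_prefix('... //John/box/sandbox/users/abc/project/build/file2'), "\n";
```

Using m{...} as the delimiter avoids escaping the slashes in the pattern.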

Find file which content not match a string pattern in Perl

I'm writing code to find the files that do not contain a string pattern. Given a list of files, I have to look into the content of each file, and I would like to get the file name if the string pattern "clean" does not appear inside the file. Please help.
Here is the scenario:
I have a list of files, and each file contains numerous lines. If a file is clean, it will contain the word "clean". If a file is dirty, the word "clean" is absent, and there is no other clear indication that the file is dirty. So whenever the word "clean" is not detected in a file, I'll categorize it as a dirty file, and I would like to record its file name.
You can use a simple one-liner:
perl -0777 -nlwE 'say $ARGV if !/clean/i' *.txt
Slurping the file with -0777, making the regex check against the entire file. If the match is not found, we print the file name.
For perl versions lower than 5.10 that do not support -E you can substitute -E with -e and say $ARGV with print "$ARGV".
perl -0777 -nlwe 'print "$ARGV\n" if !/clean/i' *.txt
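The same slurp-and-test idea can be written as a script. This is only a sketch: file_matches is a hypothetical helper name, and the files to check are assumed to be passed on the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the file's full contents match $pattern.
sub file_matches {
    my ($file, $pattern) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    local $/;                 # undef record separator: read the whole file at once
    my $content = <$fh>;
    close $fh;
    return defined $content && $content =~ $pattern ? 1 : 0;
}

# Print the files given on the command line that never mention "clean".
for my $file (@ARGV) {
    print "$file\n" unless file_matches($file, qr/clean/i);
}
```

It could be run as perl script.pl *.txt, where script.pl is whatever name the file is saved under.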
If you need to generate the list within Perl, the File::Finder module will make life easy.
Untested, but should work:
use File::Finder;
my @wanted = File::Finder          # finds all ..
    ->type( 'f' )                  # .. files ..
    ->name( '*.txt' )              # .. ending in .txt ..
    ->not                          # .. that do not ..
    ->contains( qr/clean/ )        # .. contain "clean" ..
    ->in( '.' );                   # .. in current dir (in() runs the search, so it goes last)
print $_, "\n" for @wanted;
Neat stuff!
EDIT:
Now that I have a clearer picture of the problem, I don't think any module is necessary here:
use strict;
use warnings;
my @files = glob '*.txt';              # Dirty & clean laundry
my @dirty;
foreach my $file ( @files ) {          # For each file ...
    local $/ = undef;                  # Slurps the file in
    open my $fh, '<', $file or die "Cannot open $file: $!";
    unless ( <$fh> =~ /clean/ ) {      # if the file isn't clean ..
        push @dirty, $file;            # .. it's dirty
    }
    close $fh;
}
print $_, "\n" for @dirty;             # Dirty laundry list
Once you get the mechanics, this can be simplified a la grep, etc.
One way like this:
ls *.txt | grep -v "$(grep -l clean *.txt)"
#!/usr/bin/perl
use strict;
use warnings;
open(FILE,"<file_list_file>");
while(<FILE>)
{
my $flag=0;
my $filename=$_;
open(TMPFILE,"$_");
while(<TMPFILE>)
{
$flag=1 if(/<your_string>/);
}
close(TMPFILE);
if(!$flag)
{
print $filename;
}
}
close(FILE);

Find Particular String in File and Count How many Times it is repeated using perl

I have a long file, say 10000 lines.
It is the same set of data repeated: for example, 10 lines, and then the next ten lines are the same.
I want to find a string, say "ObjectName", in that file and count how many times it appears.
Can anyone post detailed code? I am new to Perl.
Using Perl:
perl -ne '$x+=s/objname//g;END{print $x,"\n";}' file
Updated:
Since the OP wants a solution using file handles:
#!/usr/bin/perl
use warnings;
use strict;
open my $fh , '<' , 'f.txt' or die 'Cannot open file';
my $x=0;
while (<$fh>){
chomp;
$x+=s/objname//g;
}
close $fh;
print "$x";
Here's another option that also addresses your comment about searching in a whole directory:
#!/usr/bin/env perl
use warnings;
use strict;
my $dir = '.';
my $count = 0;
my $find = 'ObjectName';
for my $file (<$dir/*.txt>) {
open my $fh, '<', $file or die $!;
while (<$fh>) {
$count += () = /\Q$find\E/g;
}
close $fh;
}
print $count;
The glob denoted by <$dir/*.txt> will non-recursively get the names of all text files in the directory $dir. If you want all files, change it to <$dir/*>. Each file is opened and read line by line. The regex /\Q$find\E/g matches every occurrence of $find on each line; assigning the match to an empty list, () = ..., forces list context so that the number of matches, rather than a simple true/false, is added to $count. The \Q ... \E notation escapes any meta-characters in the string you're looking for, which could otherwise interfere with the matching.
Hope this helps!
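A match with /g returns all matches only in list context; in scalar context it just reports success or failure. A short sketch of the list-context counting idiom, with a hypothetical helper name (count_occurrences) and made-up sample strings:

```perl
use strict;
use warnings;

# Hypothetical helper: count non-overlapping occurrences of a literal string.
sub count_occurrences {
    my ($text, $find) = @_;
    # List assignment evaluated in scalar context yields the number of matches.
    my $count = () = $text =~ /\Q$find\E/g;
    return $count;
}

print count_occurrences("ObjectName a.b ObjectName", "ObjectName"), "\n";  # prints 2
print count_occurrences("a.b axb", "a.b"), "\n";                           # prints 1
```

The second call shows the effect of \Q...\E: without it, the dot would also match the "x" in "axb" and the count would be 2.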
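This could be a one-liner in bash. The -o option prints each match on its own line, so wc -l counts occurrences rather than matching lines:
grep -o "ObjectName" <filename> | wc -l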

Read multiple columns in perl using while loop

I am stuck in a problem in Perl.
I want to read multiple columns in 1 line using while loop.
I know I can achieve this using shell script like below
cat file.txt|while read field1 field2 field3 field4
do
statement1
statement2
done
The same thing I want in Perl but don't understand how to get this.
Please help me.
Thanks in advance,
Sumana
In a loop, you can do this:
#!/usr/bin/perl -w
use strict;
my $file = "MYFILE";
open (my $fh, '<', $file) or die "Can't open $file for read: $!";
my @lines;
while (<$fh>) {
my ($field1, $field2, $field3) = split;
}
close $fh or die "Cannot close $file: $!";
In the loop, Perl will assign $_ the next line of the file, and with no args, split will split that variable on white space.
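To see what the bare split does with irregular spacing, here is a self-contained sketch. The sample rows are made up and stand in for lines read from a file, and first_three is a hypothetical helper name.

```perl
use strict;
use warnings;

# Sample rows stand in for lines read from a file (made-up data).
my @rows = ("alice 30 london\n", "  bob   25   paris  extra\n");

# Hypothetical helper: split one line on whitespace, keep the first three fields.
sub first_three {
    local $_ = shift;
    # split with no arguments splits $_ on runs of whitespace and
    # ignores leading whitespace.
    my ($field1, $field2, $field3) = split;
    return ($field1, $field2, $field3);
}

for my $row (@rows) {
    my ($f1, $f2, $f3) = first_three($row);
    print "field1=$f1 field2=$f2 field3=$f3\n";
}
```

Extra fields on a line are simply dropped, much as the shell read loop would fold them into the last variable.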
use
perl -ane '....' your_file
The -a flag autosplits each line on whitespace into the array @F, so you can use $F[0] for the first field. The -F flag is only needed to set a different split pattern.
For example:
perl -lane 'print $F[0]' your_file
will print the first field of every line.
If you are concerned about performance:
perl -lne 'my ($f, $s, $t) = split; print "first=$f second=$s third=$t"' your_file