How to get a comment printed for each line of text that matches within a file? - perl

I am trying to match a keyword/text/line given in a file called expressions.txt from all files matching *main_log. When a match is found I want to print the comment for each line that matches.
Is there any better way to get this printed?
expressions.txt
Hello World ! # I want to print this comments#
Bye* #I want this to print when Bye Is match with main_log#
:::
:::
Below is the code I used:
{
    open( my $kw, '<', 'expressions.txt' ) or die $!;
    my @keywords = <$kw>;
    chomp( @keywords ); # remove newlines at the end of keywords
    # get list of files in current directory
    my @files = grep { -f } ( <*main_log>, <*Project>, <*properties> );
    # loop over each file to search keywords in
    foreach my $file ( @files ) {
        open( my $fh, '<', $file ) or die $!;
        my @content = <$fh>;
        close( $fh );
        my $l = 0;
        foreach my $kw ( @keywords ) {
            my $search = quotemeta( $kw ); # otherwise keyword is used as regex, not literally
            #$kw =~ m/\[(.*)\]/;
            $kw =~ m/\((.*)\)/;
            my $temp = $1;
            print "$temp\n";
            foreach ( @content ) { # go through every line for this keyword
                $l++;
                printf 'Found keyword %s in file %s, line %d:%s'.$/, $kw, $file, $l, $_ if /$search/;
            }
        }
    }
I tried this code to print the comments given within parentheses (...), but it does not print them the way I want. For example:
If expressions.txt contains
Hello World ! # I want to print this comments#
and the string Hello World ! is found in my file main_log, then only Hello World ! should be matched against main_log, but # I want to print this comments# should be printed as a comment to help the user understand the keyword.
These keywords can be of any length and contain any characters.
It worked fine, but I have one remaining doubt about writing the output to a file. So far I have used perl -w Test.pl > my_output.txt on the command prompt, and I am not sure how to do the same inside the Perl script itself.
open( my $kw, '<', 'expressions.txt') or die $!;
my @keywords = <$kw>;
chomp(@keywords); # remove newlines at the end of keywords
# post-processing your keywords file
my $kwhashref = {
    map {
        /^(.*?)(#.*?#)*$/;
        defined($2) ? ($1 => $2) : ( $1 => undef )
    } @keywords
};
# get list of files in current directory
my @files = grep { -f } (<*main_log>,<*Project>,<*properties>);
# loop over each file to search keywords in
foreach my $file (@files) {
    open(my $fh, '<', $file) or die $!;
    my @content = <$fh>;
    close($fh);
    my $l = 0;
    #foreach my $kw (@keywords) {
    foreach my $kw (keys %$kwhashref) {
        my $search = quotemeta($kw); # otherwise keyword is used as regex, not literally
        #$kw =~ m/\[(.*)\]/;
        #$kw =~ m/\#(.*)\#/;
        #my $temp = $1;
        #print "$temp\n";
        foreach (@content) { # go through every line for this keyword
            $l++;
            if (/$search/)
            {
                # only print if comment defined
                print $kwhashref->{$kw}."\n" if defined($kwhashref->{$kw}) ;
                printf 'Found keyword %s in file %s, line %d:%s'.$/, $kw, $file, $l, $_
                #printf '$output';
            }
        }
    }
}

Your example code has mismatched braces { ... } and won't compile.
If you were to add another closing brace to the end of your code then it would compile, but the line
$kw =~ m/\((.*)\)/;
will never succeed, since there are no parentheses anywhere in expressions.txt. If a match does not succeed, $1 keeps its value from the most recent successful regex match.
You are also searching the lines from the files against the whole of each line read from expressions.txt, when you should be splitting those lines into keywords and their corresponding comments.
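To illustrate the point about $1, here is a minimal sketch that only uses the capture when the match itself succeeded. The helper name paren_comment and the sample lines are made up for the illustration; they are not from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between parentheses, or undef if there is none.
sub paren_comment {
    my ($line) = @_;
    # Use the capture only when the match itself succeeded;
    # otherwise $1 may still hold a value from an earlier match.
    return $line =~ /\((.*)\)/ ? $1 : undef;
}

for my $line ( 'Bye (printed on match)', 'Hello World !' ) {
    my $comment = paren_comment($line);
    print defined $comment
        ? "comment: $comment\n"
        : "no comment in: $line\n";
}
```

Testing the result of the match, rather than reading $1 unconditionally, avoids printing a stale capture for lines that have no parentheses.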

This seems to be a follow-up to this answer to another question of yours. What I tried to suggest in its last paragraph would start after the first three lines of your code:
# post-processing your keywords file
my $kwhashref = {
    map {
        /^(.*?)(#.*?#)*$/;
        defined($2) ? ($1 => $2) : ( $1 => undef )
    } @keywords
};
Now you have a hashref with the actual keywords to search for as keys and the comments as values, if they exist (using your #comment# at the end of line syntax here).
Your keyword loop now has to iterate over keys %$kwhashref, and you can additionally print the comment in the inner loop, converted as shown in the answer I linked. The additional print:
print $kwhashref->{$kw}."\n" if defined($kwhashref->{$kw}); # only print if comment defined
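As a self-contained variation on the post-processing step, the sketch below also trims the whitespace between the keyword and its #comment#, so that the key is "Hello World !" rather than "Hello World ! " with a trailing space. The helper name parse_keywords is made up here, and the sketch assumes a keyword never contains a # itself.

```perl
use strict;
use warnings;

# Hypothetical helper: split "keyword #comment#" lines into keyword => comment.
sub parse_keywords {
    my @lines = @_;
    my %comment_for;
    for my $line (@lines) {
        # The keyword is everything before an optional trailing "#...#" comment.
        my ( $kw, $comment ) = $line =~ /^(.*?)\s*(#.*#)?\s*$/;
        next unless defined $kw && length $kw;    # skip blank lines
        $comment_for{$kw} = $comment;             # undef when there is no comment
    }
    return \%comment_for;
}

my $kwhashref = parse_keywords(
    'Hello World ! # I want to print this comments#',
    'Bye',
);
for my $kw ( sort keys %$kwhashref ) {
    my $comment = $kwhashref->{$kw};
    print "$kw => ", ( defined $comment ? $comment : '(no comment)' ), "\n";
}
```

Without the trimming, the search term keeps the space before the comment and will only match lines where "Hello World !" is followed by a space.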

Related

Perl - substring keywords

I have a text file with a lot of lines. I need to search this file for keywords and, where one is found, write the matching line to a log file together with the line below and the line above it. At the moment the search does not work properly: once a keyword is found, everything is written, and I don't know how to write the lines below and above. Thanks for any advice.
my $vstup = "C:/Users/Omega/Documents/Kontroly/testkontroly/kontroly20220513_154743.txt";
my $log = "C:/Users/Omega/Documents/Kontroly/testkontroly/kontroly.log";
open( my $default_fh, "<", $vstup ) or die $!;
open( my $main_fh, ">", $log ) or die $!;
my $var = 0;
while ( <$default_fh> ) {
if (/\Volat\b/)
$var = 1;
}
if ( $var )
print $main_fh $_;
}
}
close $default_fh;
close $main_fh;
The approach below uses a flag variable and a buffer variable to get the desired behavior.
Note that the pattern was replaced by 'A' to simplify testing.
#!/usr/bin/perl
use strict;
use warnings;
my ($in_fh, $out_fh);
my ($in, $out);
$in = 'input.txt';
$out = 'output.txt';
open($in_fh, "<", $in) || die $!."\n";
open($out_fh, ">", $out) || die $!;
my $p_next = 0;
my $p_line;
while (my $line = <$in_fh>) {
# print line after occurrence
print $out_fh $line if ($p_next);
if ($line =~ /A/) {
if (defined($p_line)) {
# print previous line
print $out_fh $p_line;
# once printed undefine variable to avoid printing it again in the next loop
undef($p_line);
}
# Print current line if not already printed as the line follows a pattern
print $out_fh $line if (!$p_next);
# toggle semaphore to print the next line
$p_next = 1;
} else {
# pattern not found.
# if pattern was not detected in both current and previous line.
$p_line = $line if (!$p_next);
$p_next = 0;
}
}
close($in_fh);
close($out_fh);
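If the file is small enough to hold in memory, a different technique is to index the lines in an array and mark which ones to keep. This is only a sketch, not the streaming approach above; the helper name lines_with_context and the sample data are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return matching lines plus one line of context on each side.
sub lines_with_context {
    my ( $pattern, @lines ) = @_;
    my %keep;
    for my $i ( 0 .. $#lines ) {
        next unless $lines[$i] =~ $pattern;
        # Mark the previous, current and next index, staying inside the array.
        for my $j ( $i - 1 .. $i + 1 ) {
            $keep{$j} = 1 if $j >= 0 && $j <= $#lines;
        }
    }
    # The hash removes duplicates when two matches share a context line.
    return @lines[ sort { $a <=> $b } keys %keep ];
}

my @sample = ( "one\n", "two Volat\n", "three\n", "four\n" );
print lines_with_context( qr/\bVolat\b/, @sample );
```

The streaming version remains the better fit for very large logs, since it only ever holds one previous line.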

How to print the output of searching for a string in a number of files to an output file in Perl

I have the following code and want its output written to a .txt file. Can someone please help me print the output to a file?
Ideally the user should have the option to either write to a file or print to the command prompt.
# Opening Keyword File here
open( my $kw, '<', 'IMSRegistration_Success_MessageFlow.txt') or die $!;
my @keywords = <$kw>;
chomp(@keywords); # remove newlines at the end of keywords
# post-processing your keywords file for adding comments
my $kwhashref = {
    map {
        /^(.*?)(#.*?#)*$/;
        defined($2) ? ($1 => $2) : ( $1 => undef )
    } @keywords
};
# get list of files in current directory
my @files = grep { -f } (<*main_log>,<*Project>,<*properties>);
# loop over each file to search keywords in
foreach my $file (@files)
{
    open(my $fh, '<', $file) or die $!;
    my @content = <$fh>;
    close($fh);
    my $l = 0;
    foreach my $kw (keys %$kwhashref)
    {
        my $search = quotemeta($kw); # otherwise keyword is used as regex, not literally
        foreach (@content)
        { # go through every line for this keyword
            $l++;
            if (/$search/)
            {
                print $kwhashref->{$kw}."\n" if defined($kwhashref->{$kw}) ;
                printf 'Found keyword %s in file %s, line %d:%s'.$/, $kw, $file, $l, $_
            }
        }
    }
}
script.pl >output.txt
I can get the output into a file using the code below:
print $out_file $kwhashref->{$kw}."\n" if defined($kwhashref->{$kw}) ;
printf $out_file 'Found keyword %s in file %s, line %d:%s'.$/, $kw, $file, $l, $_;
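The two lines above rely on a filehandle $out_file that has to be opened first. One way to give the user the choice between a file and the command prompt is sketched below, assuming the output file name is passed as the first command-line argument. The helper name open_output and the sample values are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return a write handle for $path, or STDOUT when no path is given.
sub open_output {
    my ($path) = @_;
    return \*STDOUT unless defined $path && length $path;
    open( my $fh, '>', $path ) or die "Cannot open '$path' for writing: $!";
    return $fh;
}

# Usage (the script name is a placeholder):
#   perl script.pl              prints to the command prompt
#   perl script.pl output.txt   writes to output.txt
my $out_file = open_output( $ARGV[0] );

# Sample values stand in for $kw, $file, $l and $_ from the search loop.
printf {$out_file} "Found keyword %s in file %s, line %d:%s\n",
    'Hello_', 'sample_main_log', 1, 'Hello_ there';

close($out_file) unless $out_file == \*STDOUT;
```

Because every print goes through the same handle, the search loop itself does not need to know where the output ends up.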

Perl: How to search for an indefinite list of keywords in a list of files in a folder

Can anyone help me with a Perl script for the problem below:
File1.txt -> with keywords to search
Hello_
World!
+Bye
Temp-
File2 (can have any extension) is the file in which to search for the keywords; likewise File3, File4, ...
I want to search for all the keywords from File1 in File2, and if they are found, print the keyword along with the file number and line number in which it was found.
I want the number of keywords and files to be open-ended: they can be added and modified.
open(MYINPUTFILE, "<expressions.txt");
# open for input
my(@lines) = <MYINPUTFILE>;
#print @lines;
my @files = grep ( -f ,<*main_log>,<*Project>);
$n = 0;
$l = 0;
#foreach my$file (@files) {
foreach my $line (@lines) {
    my @f = grep /$line/,@files;
    print "@f\n";
}
#}
}
Issue: I tried to execute the above code, but it does not print anything on my command prompt. I am using Windows 7.
This answer is based on your posted code:
use strict; # always use these
use warnings;
open( my $kw, '<', 'expressions.txt') or die $!;
my @keywords = <$kw>;
chomp(@keywords); # remove newlines at the end of keywords
# get list of files in current directory
my @files = grep { -f } (<*main_log>,<*Project>);
# loop over each file to search keywords in
foreach my $file (@files) {
    open(my $fh, '<', $file) or die $!;
    my @content = <$fh>;
    close($fh);
    foreach my $kw (@keywords) {
        my $search = quotemeta($kw); # otherwise keyword is used as regex, not literally
        my $l = 0; # restart the line count for every keyword
        foreach (@content) { # go through every line for this keyword
            $l++;
            printf 'Found keyword %s in file %s, line %d:%s'.$/, $kw, $file, $l, $_
                if /$search/;
        }
    }
}
Regarding the questions in the comments below:
The innermost loop just counts the line numbers with $l++ and prints the finds when a match occurs. The if /$search/ is still part of the statement above it. It could also be written as
if ( /$search/ ) {
printf ...
}
The printf is used to format the output. You could also have done this by simply using print and concatenating all the needed variables. I just prefer it this way.
This assumes that you want a list of found lines per keyword for every file. You have to swap the order and logic of the @keywords and @content loops to get the output ordered by line.
For additional functionality regarding comments in the keyword file, you would have to post-process the content to separate the search terms from the comments, possibly into a hash with the search term as key and the comment as value. Then you could use only the hash keys for the search (see innermost loop) and print the comment, if one exists, as an additional line.
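Swapping the loops to get line-ordered output could look roughly like the sketch below. It works on in-memory sample data instead of real files, and it uses index() for the literal substring test in place of the quotemeta regex. The helper name find_keywords_by_line is made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: report hits in line order instead of keyword order.
sub find_keywords_by_line {
    my ( $keywords, $content ) = @_;
    my @hits;
    my $line_no = 0;
    for my $line (@$content) {        # outer loop: lines of the file
        $line_no++;
        for my $kw (@$keywords) {     # inner loop: every keyword
            # index() looks for the keyword literally, like quotemeta does for a regex
            push @hits, [ $line_no, $kw ] if index( $line, $kw ) >= 0;
        }
    }
    return @hits;
}

my @keywords = ( 'Hello_', '+Bye' );
my @content  = ( "say +Bye\n", "nothing here\n", "Hello_ and +Bye\n" );
printf "line %d: found keyword %s\n", @$_
    for find_keywords_by_line( \@keywords, \@content );
```

With the lines in the outer loop, the line counter only has to run once per file, and a line that contains several keywords reports them together.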

Perl Programming

I have these questions. But I don't know how to prove it or if I'm right. Are my answers right?
Find all complete lines of a file which contain only a row of any number of the letter x
x*
^x+$
^x*$ <-This one
^xxxxx$
Find all complete lines of a file which contain a row consisting only the letter x but ignoring any leading or trailing space on the line.
^\s* x+\s*$ <--This one
^\s(x*)\s$
\s* x+\s*
^\s+x+\s+$
I tried to use this
use strict;
use warnings;
my $filename = 'data.txt';
open( my $fh, '<:encoding(UTF-8)', $filename ) or die "Could not open file '$filename' $!";
while ( my $row = <$fh> ) {
chomp $row;
print "$row\n";
}
I tried this code but I got error at (^
use strict;
use warnings;
my $filename = 'data.txt';
open( my $fh, '<:encoding(UTF-8)', $filename ) or die "Could not open file '$filename' $!";
while ( my $row = <$fh> ) {
if ( ^x*$ ) {
print "This is";
}
}
You're talking about regular expressions and how to use them in Perl. Your question seems to be whether the answers you picked to homework are correct.
The code you've added should do what you want, but it has syntax errors.
if ( ^x*$ ) {
print "This is";
}
Your pattern is correct, but you don't know how to use a regular expression in Perl. You're missing the actual operator to tell Perl that you want a regular expression.
The short form is this, where I've highlighted the important part with #
if ( /^x*$/ ) {
     #    #
The slashes // tell Perl that it should match a pattern. The long form of it is:
if ( $_ =~ m/^x*$/ ) {
     ## ## ##    #
$_ is the variable that you are matching against a pattern. The =~ is the matching operator. The m// constructs a pattern to match with. If you use // you can leave out the m, but it's clearer to put it in.
The $_ is called topic. It's like a default variable that stuff goes into in Perl if you don't specify another variable.
while ( <$fh> ) {
print $_ if $_ =~ m/foo/; # print all lines that contain foo
}
This code can be written without $_, because a lot of commands in Perl assume that you mean $_ when you don't explicitly name a variable.
while ( <$fh> ) { # puts each line in $_
print if m/foo/; # prints $_ if $_ contains foo
}
Your code looks like you wanted to do that, but in fact you have a $row in your loop. That's good, because it is more explicit, which makes it easier to read. So what you need to do for your match is:
while ( my $row = <$fh> ) {
if ( $row =~ m/^x*$/ ) {
print "This is";
}
}
Now you will iterate over each line of the file behind the $fh filehandle, and check whether it matches the pattern ^x*$. If it does, you print "This is". That doesn't sound very useful.
Consider this example, where I am using the __DATA__ section instead of a file.
use strict;
use warnings;
while ( my $row = <DATA> ) {
if ( $row =~ m/^x*$/ ) {
print "This is";
}
}
__DATA__
foo
xxx

x
xxxxx
bar
This will print:
This isThis isThis isThis is
It really does not seem to be very useful. It would make more sense to include the line that matched.
if ( $row =~ m/^x*$/ ) {
print "match: $row";
}
Now we get this:
match: xxx
match:
match: x
match: xxxxx
That's almost what we expected. It matches a single x, and a bunch of xs. It did not match foo or bar. But it does match an empty line.
That's because you picked the wrong pattern.
The * quantifier means match as many as possible, at least zero.
The + quantifier means match as many as possible, at least one.
So your pattern should be the one with +, or it will match if there is nothing, because start of the line, no x, end of the line matches an empty line.
While you're at it, you could also rename your variable. Unless you're dealing with CSV, which has rows of data, you have lines, not rows. So $line would be a better name for your variable. Giving variables good, descriptive names is very important because it makes it easier to understand your program.
use strict;
use warnings;
my $filename = 'data.txt';
open( my $fh, '<:encoding(UTF-8)', $filename )
or die "Could not open file '$filename' $!";
while ( my $line = <$fh> ) {
if ( $line =~ m/^x+$/ ) {
print "match: $line";
}
}
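The same kind of experiment works for the second question. The sketch below checks two of the candidate patterns against a few sample lines. The first pattern is written without the embedded space that appears in the question, which is an assumption about the intended option. The helper name matching_samples and the sample lines are made up for the illustration.

```perl
use strict;
use warnings;

# Candidate patterns from the second question, checked against sample lines.
my @patterns = ( qr/^\s*x+\s*$/, qr/^\s+x+\s+$/ );
my @samples  = ( "xxx", "  xxx  ", " x", "x y", "" );

# Hypothetical helper: list the samples that a pattern accepts.
sub matching_samples {
    my ( $pattern, @lines ) = @_;
    return grep { $_ =~ $pattern } @lines;
}

for my $pattern (@patterns) {
    my @ok = matching_samples( $pattern, @samples );
    print "$pattern accepts: ", join( ', ', map { "'$_'" } @ok ), "\n";
}
```

The pattern with \s+ on both sides requires at least one space before and after the xs, so it rejects a plain "xxx", while the \s* version makes the surrounding whitespace optional.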

zcat working in command line but not in perl script

Here is a part of my script:
foreach $i ( @contact_list ) {
    print "$i\n";
    $e = "zcat $file_list2| grep $i";
    print "$e\n";
    $f = qx($e);
    print "$f";
}
$e prints properly but $f gives a blank line even when $file_list2 has a match for $i.
Can anyone tell me why?
It is always better to use Perl's grep instead of piping to an external grep:
@lines = `zcat $file_list2`; # move output of zcat to array
die('zcat error') if ($?); # will exit script with error if zcat is problem
# chomp(@lines) # this will remove "\n" from each line
foreach $i ( @contact_list ) {
    print "$i\n";
    @ar = grep (/$i/, @lines);
    print @ar;
    # print join("\n",@ar)."\n"; # in case of using chomp
}
The best solution is not to call zcat at all, but to use the zlib library:
http://perldoc.perl.org/IO/Zlib.html
use IO::Zlib;
# ....
# place your definition of $file_list2 and @contact_list here.
# ...
$fh = IO::Zlib->new;
$fh->open($file_list2, "rb")
    or die("Cannot open $file_list2");
@lines = <$fh>;
$fh->close;
#chomp(@lines); #remove "\n" symbols from lines
foreach $i ( @contact_list ) {
    print "$i\n";
    @ar = grep (/$i/, @lines);
    print (@ar);
    # print join("\n",@ar)."\n"; #in case of using chomp
}
Your question leaves us guessing about many things, but a better overall approach would seem to be opening the file just once, and processing each line in Perl itself.
open(F, "zcat $file_list |") or die "$0: could not zcat: $!\n";
LINE:
while (<F>) {
    ######## FIXME: this could be optimized a great deal still
    foreach my $i (@contact_list) {
        if (m/$i/) {
            print $_;
            next LINE;
        }
    }
}
close (F);
If you want to squeeze out more from the inner loop, compile the regexes from @contact_list into a separate array before the loop, or perhaps combine them into a single regex if all you care about is whether one of them matched. If, on the other hand, you want to print all matches for one pattern only at the end when you know what they are, collect matches into one array per search expression, then loop them and print when you have grepped the whole set of input files.
Your problem is not reproducible without information about what's in $i, but I can guess that it contains some shell metacharacter which causes it to be processed by the shell before the grep runs.
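Combining the contacts into a single regex could be sketched as follows, assuming the contacts are meant to match literally (hence quotemeta) and that the list is not empty. An in-memory filehandle stands in for the zcat pipe so the sketch runs anywhere. The helper names and sample data are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build one regex that matches any contact literally.
sub build_contact_regex {
    my @contacts = @_;
    my $alternation = join '|', map { quotemeta($_) } @contacts;
    return qr/$alternation/;
}

# Hypothetical helper: return the lines from a handle that mention any contact.
sub matching_lines {
    my ( $fh, $regex ) = @_;
    my @found;
    while ( my $line = <$fh> ) {
        push @found, $line if $line =~ $regex;
    }
    return @found;
}

my @contact_list = ( 'a.b@example.com', 'x|y' );
my $regex = build_contact_regex(@contact_list);

# For a real gzip file, the list form of a pipe open avoids the shell
# (on Windows it needs a reasonably recent Perl):
#   open( my $fh, '-|', 'zcat', $file_list2 ) or die "could not run zcat: $!";
my $data = "a.b\@example.com called\naxb\@example.com\nx|y\n";
open( my $fh, '<', \$data ) or die "in-memory open failed: $!";
print matching_lines( $fh, $regex );
close($fh);
```

Since neither the file name nor the contacts pass through a shell here, metacharacters such as | or * in them are never interpreted by it, and quotemeta keeps them from being read as regex syntax.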