I want to create a Perl program to take in a file, and for each line, chomp off everything after a certain character (let's say a /). For example, consider this example file:
foo1/thing 1.1.1 bar
foo2/item 2.3.2 bar
foo3/thing 3.4.5 bar
I want to remove everything after the slash on each line and print it out, so that that file becomes:
foo1
foo2
foo3
I tried to use this program, with readline in a foreach loop, but the output was not what I expected:
print ( "Enter file name: " ) ;
my $filename = <> ;
$/ = '' ;
chomp $filename ;
my $file = undef ;
open ( $file, "< :encoding(UTF-8)", $filename ) ;
$/ = '/' ;
foreach ( <$file> ) {
chomp ;
print ;
}
But all this does is remove the slashes from each line.
foo1thing 1.1.1 bar
foo2item 2.3.2 bar
foo3thing 3.4.5 bar
How can I alter this to produce the output I need?
As far as I know, the input record separator ($/) cannot be a regex; it is always treated as a literal string.
You could proceed as follows:
print ( "Enter file name: " ) ;
my $filename = <> ;
chomp $filename ;
open ( my $file, "< :encoding(UTF-8)", $filename )
or die "could not open file $filename: $!";
while ( my $line = <$file> ) {
$line =~ s{/.*}{}s;
print "$line\n";
}
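The substitution s{/.*}{}s matches the first slash and everything after it, and removes it. Because of the /s modifier, . also matches the trailing newline, which is why the print adds "\n" back. Note that a line without a slash is left unchanged, newline included, so it would be printed with two newlines.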
Note: always check for errors when using open(), as noted in the documentation:
When opening a file, it's seldom a good idea to continue if the request failed, so open is frequently used with die.
$line =~ s{/.*}{}s; # In-place (destructive)
or
my ($extracted) = $line =~ m{([^/]*)}; # Returns (non-destructive)
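The non-destructive form can be wrapped in a small helper that also copes with lines that contain no slash. This is only a minimal sketch: the sub name before_slash and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the part of a line before the first slash.
# A line without a slash is returned whole, minus its trailing newline.
sub before_slash {
    my ($line) = @_;
    chomp $line;
    my ($head) = $line =~ m{^([^/]*)};
    return $head;
}

my @sample = ( "foo1/thing 1.1.1 bar\n", "foo2/item 2.3.2 bar\n", "no slash here\n" );
print before_slash($_), "\n" for @sample;
```

Because the helper works on a copy of the line, the caller's variable is left untouched.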
I am trying to match a keyword/text/line given in a file called expressions.txt from all files matching *main_log. When a match is found I want to print the comment for each line that matches.
Is there any better way to get this printed?
expression.txt
Hello World ! # I want to print this comments#
Bye* #I want this to print when Bye Is match with main_log#
:::
:::
Below is the code I used:
{
open( my $kw, '<', 'expressions.txt' ) or die $!;
my @keywords = <$kw>;
chomp( @keywords ); # remove newlines at the end of keywords
# get list of files in current directory
my @files = grep { -f } ( <*main_log>, <*Project>, <*properties> );
# loop over each file to search keywords in
foreach my $file ( @files ) {
open( my $fh, '<', $file ) or die $!;
my @content = <$fh>;
close( $fh );
my $l = 0;
foreach my $kw ( @keywords ) {
my $search = quotemeta( $kw ); # otherwise keyword is used as regex, not literally
#$kw =~ m/\[(.*)\]/;
$kw =~ m/\((.*)\)/;
my $temp = $1;
print "$temp\n";
foreach ( @content ) { # go through every line for this keyword
$l++;
printf 'Found keyword %s in file %s, line %d:%s'.$/, $kw, $file, $l, $_ if /$search/;
}
}
}
I tried this code to print the comments mentioned within parentheses (...) but it is not printing in the fashion which I want like below:
If the expression.txt contains
Hello World ! # I want to print this comments#
If the string Hello World ! is found in my file called main_log, then only Hello World ! should be matched against main_log, but # I want to print this comments# should be printed as a comment so the user can understand the keyword.
These keywords can be of any length and contain any characters.
It worked fine, but I have one remaining doubt about printing the required output to a file. I have used the command perl -w Test.pl > my_output.txt at the command prompt, but I am not sure how to do this inside the Perl script itself.
open( my $kw, '<', 'expressions.txt') or die $!;
my @keywords = <$kw>;
chomp(@keywords); # remove newlines at the end of keywords
# post-processing your keywords file
my $kwhashref = {
map {
/^(.*?)(#.*?#)*$/;
defined($2) ? ($1 => $2) : ( $1 => undef )
} @keywords
};
# get list of files in current directory
my @files = grep { -f } (<*main_log>,<*Project>,<*properties>);
# loop over each file to search keywords in
foreach my $file (@files) {
open(my $fh, '<', $file) or die $!;
my @content = <$fh>;
close($fh);
my $l = 0;
#foreach my $kw (@keywords) {
foreach my $kw (keys %$kwhashref) {
my $search = quotemeta($kw); # otherwise keyword is used as regex, not literally
#$kw =~ m/\[(.*)\]/;
#$kw =~ m/\#(.*)\#/;
#my $temp = $1;
#print "$temp\n";
foreach (@content) { # go through every line for this keyword
$l++;
if (/$search/)
{
# only print if comment defined
print $kwhashref->{$kw}."\n" if defined($kwhashref->{$kw}) ;
printf 'Found keyword %s in file %s, line %d:%s'.$/, $kw, $file, $l, $_
#printf '$output';
}
}
}
}
Your example code has mismatched braces { ... } and won't compile.
If you were to add another closing brace to the end of your code then it would compile, but the line
$kw =~ m/\((.*)\)/;
will never succeed since there are no parentheses anywhere in expressions.txt. If a match has not succeeded then the value of $1 will be retained from the most recently successful regex match operation.
You are also trying to search the lines from the files against the whole of the lines retrieved from expressions.txt, when you should be splitting those lines into keywords and their corresponding comments.
This seems to be a follow-up to this answer to another question of yours. What I tried to suggest in the last paragraph would start after the first three lines of your code:
# post-processing your keywords file
my $kwhashref = {
map {
/^(.*?)(#.*?#)*$/;
defined($2) ? ($1 => $2) : ( $1 => undef )
} @keywords
};
Now you have the keywords in a hashref containing the actual keywords to search for as keys, and comments as values, if they exist (using your #comment# at the end of line syntax here).
Your keyword loop would now have to use keys %$kwhashref, and you can now additionally print the comment in the inner loop, converted as shown in the answer I linked. The additional print:
print $kwhashref->{$kw}."\n" if defined($kwhashref->{$kw}); # only print if comment defined
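As for writing the results to a file from inside the script rather than redirecting with >, one option is to print to an explicit filehandle. The sketch below is only illustrative: parse_keyword and report_matches are hypothetical helper names, the sample data is inlined so the snippet is self-contained, and the line counter is restarted for each keyword, which is an assumption about the intended output.

```perl
use strict;
use warnings;

# Hypothetical helper: split "keyword #comment#" into its two parts.
# Whitespace around the keyword is trimmed; the comment is undef when absent.
sub parse_keyword {
    my ($entry) = @_;
    my ( $keyword, $comment ) = $entry =~ /^\s*(.*?)\s*(#.*#)?\s*$/;
    return ( $keyword, $comment );
}

# Print every match of every keyword to the given filehandle
# and return the number of matches found.
sub report_matches {
    my ( $out, $entries, $lines ) = @_;
    my $count = 0;
    for my $entry (@$entries) {
        my ( $keyword, $comment ) = parse_keyword($entry);
        next unless length $keyword;
        my $search  = quotemeta $keyword;    # match the keyword literally
        my $line_no = 0;                     # restart the counter per keyword
        for my $line (@$lines) {
            $line_no++;
            next unless $line =~ /$search/;
            print {$out} "$comment\n" if defined $comment;
            print {$out} "Found keyword '$keyword' on line $line_no: $line";
            $count++;
        }
    }
    return $count;
}

my @entries = ( 'Hello World ! # I want to print this comments#', 'Bye* #shown when Bye* matches#' );
my @lines   = ( "nothing here\n", "log: Hello World ! again\n" );
report_matches( \*STDOUT, \@entries, \@lines );
```

To write to a file instead of the screen, open a handle for writing, for example open( my $out, '>', 'my_output.txt' ) or die $!, pass $out as the first argument, and close it afterwards.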
I have these questions. But I don't know how to prove it or if I'm right. Are my answers right?
Find all complete lines of a file which contain only a row of any number of the letter x
x*
^x+$
^x*$ <-This one
^xxxxx$
Find all complete lines of a file which contain a row consisting only the letter x but ignoring any leading or trailing space on the line.
^\s* x+\s*$ <--This one
^\s(x*)\s$
\s* x+\s*
^\s+x+\s+$
I tried to use this
use strict;
use warnings;
my $filename = 'data.txt';
open( my $fh, '<:encoding(UTF-8)', $filename ) or die "Could not open file '$filename' $!";
while ( my $row = <$fh> ) {
chomp $row;
print "$row\n";
}
I tried this code but I got error at (^
use strict;
use warnings;
my $filename = 'data.txt';
open( my $fh, '<:encoding(UTF-8)', $filename ) or die "Could not open file '$filename' $!";
while ( my $row = <$fh> ) {
if ( ^x*$ ) {
print "This is";
}
}
You're talking about regular expressions and how to use them in Perl. Your question seems to be whether the answers you picked to homework are correct.
The code you've added should do what you want, but it has syntax errors.
if ( ^x*$ ) {
print "This is";
}
Your pattern is correct, but you don't know how to use a regular expression in Perl. You're missing the actual operator to tell Perl that you want a regular expression.
The short form is this, where I've highlighted the important part with #
if ( /^x*$/ ) {
     #    #
The slashes // tell Perl that it should match a pattern. The long form of it is:
if ( $_ =~ m/^x*$/ ) {
     ## ## ##    #
$_ is the variable that you are matching against a pattern. The =~ is the matching operator. The m// constructs a pattern to match with. If you use // you can leave out the m, but it's clearer to put it in.
The $_ is called topic. It's like a default variable that stuff goes into in Perl if you don't specify another variable.
while ( <$fh> ) {
print $_ if $_ =~ m/foo/; # print all lines that contain foo
}
This code can be written without $_, because a lot of commands in Perl assume that you mean $_ when you don't explicitly name a variable.
while ( <$fh> ) { # puts each line in $_
print if m/foo/; # prints $_ if $_ contains foo
}
Your code looks like you wanted to do that, but in fact you have a $row in your loop. That's good, because it is more explicit. That means it's easier to read. So what you need to do for your match is:
while ( my $row = <$fh> ) {
if ( $row =~ m/^x*$/ ) {
print "This is";
}
}
Now you will iterate over each line of the file behind the $fh filehandle, and check if it matches the pattern ^x*$. If it does, you print "This is". That doesn't sound very useful.
Consider this example, where I am using the __DATA__ section instead of a file.
use strict;
use warnings;
while ( my $row = <DATA> ) {
if ( $row =~ m/^x*$/ ) {
print "This is";
}
}
__DATA__
foo
xxx

x
xxxxx
bar
This will print:
This isThis isThis isThis is
It really does not seem to be very useful. It would make more sense to include the line that matched.
if ( $row =~ m/^x*$/ ) {
print "match: $row";
}
Now we get this:
match: xxx
match:
match: x
match: xxxxx
That's almost what we expected. It matches a single x, and a bunch of xs. It did not match foo or bar. But it does match an empty line.
That's because you picked the wrong pattern.
The * multiplier means match as many as possible, at least none.
The + multiplier means match as many as possible, at least one.
So your pattern should be the one with +, or it will match if there is nothing, because start of the line, no x, end of the line matches an empty line.
While you're at it, you could also rename your variable. Unless you're dealing with CSV, which has rows of data, you have lines, not rows. So $line would be a better name for your variable. Giving variables good, descriptive names is very important because it makes it easier to understand your program.
use strict;
use warnings;
my $filename = 'data.txt';
open( my $fh, '<:encoding(UTF-8)', $filename )
or die "Could not open file '$filename' $!";
while ( my $line = <$fh> ) {
if ( $line =~ m/^x+$/ ) {
print "match: $line";
}
}
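To check both quiz answers empirically, each candidate pattern can be run over a few sample lines. This is a small sketch; the sub names and sample strings are invented for illustration, and the second pattern is written without the embedded space that appears in the quiz options, which is an assumption about what was intended.

```perl
use strict;
use warnings;

# Only x characters on the line, at least one of them.
sub only_x {
    my ($line) = @_;
    return $line =~ m/^x+$/ ? 1 : 0;
}

# Same, but leading and trailing whitespace is ignored.
sub only_x_padded {
    my ($line) = @_;
    return $line =~ m/^\s*x+\s*$/ ? 1 : 0;
}

for my $line ( 'xxx', '', '  xx  ', 'xax' ) {
    printf "%-8s only_x=%d only_x_padded=%d\n",
        "'$line'", only_x($line), only_x_padded($line);
}
```

The empty string is rejected by both subs, which is the difference between + and * discussed above.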
I have a simple text file that includes all 50 states. I want the user to enter a word and have the program return the line the specific state is on in the file or otherwise display a "word not found" message. I do not know how to use find. Can someone assist with this? This is what I have so far.
#!/bin/perl -w
open(FILENAME,"<WordList.txt"); #opens WordList.txt
my(@list) = <FILENAME>; #read file into list
my($state); #create private "state" variable
print "Enter a US state to search for: \n"; #Print statement
$line = <STDIN>; #use of STDIN to read input from user
close (FILENAME);
An alternative solution that reads only the parts of the file until a result is found, or the file is exhausted:
use strict;
use warnings;
print "Enter a US state to search for: \n";
my $line = <STDIN>;
chomp($line);
# open file with 3 argument open (safer)
open my $fh, '<', 'WordList.txt'
or die "Unable to open 'WordList.txt' for reading: $!";
# read the file until result is found or the file is exhausted
my $found = 0;
while ( my $row = <$fh> ) {
chomp($row);
next unless $row eq $line;
# $. is a special variable representing the line number
# of the currently(most recently) accessed filehandle
print "Found '$line' on line# $.\n";
$found = 1; # indicate that you found a result
last; # stop searching
}
close($fh);
unless ( $found ) {
print "'$line' was not found\n";
}
General notes:
always use strict; and use warnings; they will save you from a wide range of bugs
3 argument open is generally preferred, as well as the or die ... statement. If you are unable to open the file, reading from the filehandle will fail
$. documentation can be found in perldoc perlvar
The tool for the job is grep.
chomp ( $line ); #remove linefeeds
print "$line is in list\n" if grep { m/^\Q$line\E$/ } @list;
You could also transform your @list into a hash, and test that, using map:
chomp ( @list ); # the hash keys must not keep their trailing newlines
my %cities = map { $_ => 1 } @list;
if ( $cities{$line} ) { print "$line is in list\n";}
Note: the grep version, because of the ^ and $ anchors, is an exact (and case-sensitive) match. You can easily adjust it to support fuzzier scenarios.
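One possible "fuzzier" variant is a case-insensitive lookup that also remembers the line number of each state. The following is a sketch under assumptions: build_index is a hypothetical helper, and the three state names stand in for the contents of WordList.txt.

```perl
use strict;
use warnings;

# Build a lookup from lower-cased state name to its 1-based line number.
sub build_index {
    my (@lines) = @_;
    my %index;
    my $line_no = 0;
    for my $entry (@lines) {
        $line_no++;
        chomp( my $name = $entry );
        $index{ lc $name } //= $line_no;    # keep the first occurrence
    }
    return \%index;
}

my $index = build_index( "Alabama\n", "Alaska\n", "Arizona\n" );
my $query = 'alaska';
if ( my $line_no = $index->{ lc $query } ) {
    print "Found '$query' on line $line_no\n";
}
else {
    print "'$query' was not found\n";
}
```

Building the hash costs one pass over the list, after which every further lookup is a single hash access, so this pays off mainly when several words are looked up in one run.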
I want to open a file and replace a word from a file.
My code is attached here.
open(my $fh, "<", "pcie_7x_v1_7.v") or die "cannot open <pcie_7x_v1_7.v:$!";
while (my $line = <$fh>) {
if ($line =~ timescale 1 ns) {
print $line $msg = "pattern found \n ";
print "$msg";
$line =~ s/`timescale 1ns/`timescale 1ps/;
}
else {
$msg = "pattern not found \n ";
print "$msg";
}
}
The file contains the pattern `timescale 1ns/1ps.
My requirement is to replace `timescale 1ns/1ps with `timescale 1ps/1ps.
At present the else branch always runs.
Update code after receiving comment:
Hi,
Thanks for the quick solution.
I changed the code accordingly, but the result was not successful.
I have attached the updated code here.
Please suggest me if I missed anything here.
use strict;
use warnings;
open(my $fh, "<", "pcie_7x_v1_7.v" )
or die "cannot open <pcie_7x_v1_7.v:$!" ;
open( my $fh2, ">", "cie_7x_v1_7.v2")
or die "cannot open <pcie_7x_v1_7.v2:$!" ;
while(my $line = <$fh> )
{
print $line ;
if ($_ =~ /timescale\s1ns/ )
{
$msg = "pattern found \n " ;
print "$msg" ;
$_ =~ s/`timescale 1ns/`timescale 1ps/g ;
}
else
{
$msg = "pattern not found \n " ;
print "$msg" ;
}
print $fh2 $line ;
}
close($fh) ;
close($fh2) ;
Result:
pattern not found
pattern not found
pattern not found
pattern not found
Regards,
Binu
3rd update:
// File : pcie_7x_v1_7.v
// Version : 1.7
//
// Description: 7-series solution wrapper : Endpoint for PCI Express
//
//--------------------------------------------------------------------------------
//`timescale 1ps/1ps
`timescale 1ns/1ps
(* CORE_GENERATION_INFO = "pcie_7x_v1_7,pcie_7x_v1_7,
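You can use a Perl one-liner from the command line; there is no need to write a script. Note that in a Unix shell the backticks inside double quotes would be treated as command substitution, so use single quotes around the expression there.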
perl -p -i -e "s/`timescale\s1ns/`timescale 1ps/g" pcie_7x_v1_7.v
However, if you still want to use the script, you are almost there. You just need to fix a couple of errors:
print $line; #missing
if ($line =~ /timescale\s1ns/) #made it a real regex, this should match now
$line =~ s/`timescale 1ns/`timescale 1ps/g ; #added g to match all occurrences in line
After the if-else you must print the line to a file again.
For example, open a new file for writing (let's call it 'pcie_7x_v1_7.v.2') at the beginning of your script:
open(my $fh2, ">", "pcie_7x_v1_7.v.2" ) or die "cannot open >pcie_7x_v1_7.v.2:$!" ;
Then, after the else block, just print the line (whether it's changed or not) to the file:
print $fh2 $line;
Don't forget to close the filehandles when you're done:
close($fh);
close($fh2);
EDIT:
Your main problem was that you used $_ for the check, while you had assigned the line to $line. So you printed $line, but then tested if ($_ =~ /timescale/), which looks at a different, unset variable. That could never match.
I copied your script, made a couple of corrections, and formatted it a little more densely to better fit the website. I also removed the separate match check, as suggested by TLP, and did the substitution directly in the if condition. It has exactly the same result. This works:
use strict;
use warnings;
open(my $fh, "<", "pcie_7x_v1_7.v" )
or die "cannot open <pcie_7x_v1_7.v:$!" ;
open( my $fh2, ">", "pcie_7x_v1_7.v2")
or die "cannot open >pcie_7x_v1_7.v2:$!" ;
while(my $line = <$fh> ) {
print $line;
if ($line =~ s|`timescale 1ns/1ps|`timescale 1ps/1ps|g) {
print "pattern found and replaced\n ";
}
else {
print "pattern not found \n ";
}
print $fh2 $line ;
}
close($fh);
close($fh2);
#now it's finished, just overwrite the old file with the new file
rename( "pcie_7x_v1_7.v2", "pcie_7x_v1_7.v" )
or die "cannot rename pcie_7x_v1_7.v2: $!";
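The sample file also contains a commented-out `timescale line. If only the active directive should be touched, the substitution can be anchored to the start of the line. This is a sketch, not part of the script above: fix_timescale is a hypothetical helper, and treating lines that begin with // as off-limits is an assumption about what is wanted.

```perl
use strict;
use warnings;

# Hypothetical helper: change the time unit of an active `timescale directive
# from 1ns to 1ps. Returns the (possibly new) line and a changed flag.
sub fix_timescale {
    my ($line) = @_;
    my $changed = $line =~ s{^(\s*`timescale\s+)1ns(?=/)}{${1}1ps} ? 1 : 0;
    return ( $line, $changed );
}

for my $line ( "//`timescale 1ps/1ps\n", "`timescale 1ns/1ps\n", "module top;\n" ) {
    my ( $new, $changed ) = fix_timescale($line);
    print( $changed ? "changed:   $new" : "unchanged: $new" );
}
```

Since the pattern must match from the start of the line (after optional whitespace), a directive hidden behind // is never rewritten.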
I am still working on cleaning up Oracle files, having to replace strings in files where the Oracle schema name is prepended to the function/procedure/package name within the file, as well as when the function/procedure/package name is double-quoted. Once the definition is corrected, I write the correction back to the file, along with the rest of the actual code.
I have code written to replace simple declarations (no input/output parameters). Now I am trying to get my regex to operate on declarations that do have them. (Note: this post is a continuation of this question.) Some examples of what I'm trying to clean up:
Replace:
CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
p_trailing_separator IN BOOLEAN DEFAULT FALSE,
p_max_linesize IN NUMBER DEFAULT 32000,
p_mode IN VARCHAR2 DEFAULT 'w'
)
RETURN NUMBER
IS
to
CREATE OR REPLACE FUNCTION DC_F_DUMP_CSV_MMA (
p_trailing_separator IN BOOLEAN DEFAULT FALSE,
p_max_linesize IN NUMBER DEFAULT 32000,
p_mode IN VARCHAR2 DEFAULT 'w'
)
RETURN NUMBER
IS
I have been trying to use the following regex to separate the declaration, for later reconstruction after I've cleaned out the schema name / fixed the name of the function/procedure/package to not be double-quoted. I am struggling with getting each into a buffer - here's my latest attempt to grab all the middle input/output into its own buffer:
\b(CREATE\sOR\sREPLACE\s(PACKAGE|PACKAGE\sBODY|PROCEDURE|FUNCTION))(?:\W+\w+){1,100}?\W+(RETURN)\s*(\W+\w+)\s(AS|IS)\b
Any / all help is GREATLY appreciated!
This is the script that I'm using right now to evaluate / write the corrected files:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Data::Dumper;
# utility to clean strings
sub trim($) {
my $string = shift;
$string = "" if !defined($string);
$string =~ s/^\s+//;
$string =~ s/\s+$//;
# aggressive removal of blank lines
$string =~ s/\n+/\n/g;
return $string;
}
sub cleanup_packages {
my $file = shift;
my $tmp = $file . ".tmp";
my $package_name;
open( OLD, "< $file" ) or die "open $file: $!";
open( NEW, "> $tmp" ) or die "open $tmp: $!";
while ( my $line = <OLD> ) {
# look for the first line of the file to contain a CREATE OR REPLACE STATEMENT
if ( $line =~
m/^(CREATE\sOR\sREPLACE)\s*(PACKAGE|PACKAGE\sBODY)?\s(.+)\s(AS|IS)?/i
)
{
# look ahead to next line, in case the AS/IS is next
my $nextline = <OLD>;
# from the above IF clause, the package name is in buffer 3
$package_name = $3;
# if the package name and the AS/IS is on the same line, and
# the package name is quoted/prepended by the TRON2000 schema name
if ( $package_name =~ m/"TRON2000"\."(\w+)"(\s*|\S*)(AS|IS)/i ) {
# grab just the name and the AS/IS parts
$package_name =~ s/"TRON2000"\."(\w+)"(\s*|\S*)(AS|IS)/$1 $2/i;
trim($package_name);
}
elsif ( ( $package_name =~ m/"TRON2000"\."(\w+)"/i )
&& ( $nextline =~ m/(AS|IS)/ ) )
{
# if the AS/IS was on the next line from the name, put them together on one line
$package_name =~ s/"TRON2000"\."(\w+)"(\s*|\S*)/$1/i;
$package_name = trim($package_name) . ' ' . trim($nextline);
trim($package_name); # remove trailing carriage return
}
# now put the line back together
$line =~
s/^(CREATE\sOR\sREPLACE)\s*(PACKAGE|PACKAGE\sBODY|FUNCTION|PROCEDURE)?\s(.+)\s(AS|IS)?/$1 $2 $package_name/ig;
# and print it to the file
print NEW "$line\n";
}
else {
# just a normal line - print it to the temp file
print NEW $line or die "print $tmp: $!";
}
}
# close up the files
close(OLD) or die "close $file: $!";
close(NEW) or die "close $tmp: $!";
# rename the temp file as the original file name
unlink($file) or die "unlink $file: $!";
rename( $tmp, $file ) or die "can't rename $tmp to $file: $!";
}
# find and clean up oracle files
sub eachFile {
my $ext;
my $filename = $_;
my $fullpath = $File::Find::name;
if ( -f $filename ) {
($ext) = $filename =~ /(\.[^.]+)$/;
}
else {
# ignore non files
return;
}
if ( $ext =~ /(\.spp|\.sps|\.spb|\.sf|\.sp)/i ) {
print "package: $filename\n";
cleanup_packages($fullpath);
}
else {
print "$filename not specified for processing!\n";
}
}
MAIN:
{
my ( @files, $file );
my $dir = 'C:/1_atest';
# grab all the files for cleanup
find( \&eachFile, "$dir/" );
#open and evaluate each
foreach $file (@files)
{
# skip . and ..
next if ( $file =~ /^\.$/ );
next if ( $file =~ /^\.\.$/ );
cleanup_file($file);
};
}
Assuming the entire content of a file is stored as scalar in a var, the following should do the trick.
$Str = '
CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
p_trailing_separator IN BOOLEAN DEFAULT FALSE,
p_max_linesize IN NUMBER DEFAULT 32000,
p_mode IN VARCHAR2 DEFAULT w
)
RETURN NUMBER
IS
CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
p_trailing_separator IN BOOLEAN DEFAULT FALSE,
p_max_linesize IN NUMBER DEFAULT 32000,
p_mode IN VARCHAR2 DEFAULT w
)
RETURN NUMBER
IS
';
$Str =~ s#^(create\s+(?:or\s+replace\s+)?\w+\s+)"[^"]+"\."([^"]+)"#$1$2#mig;
print $Str;
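The same idea can be wrapped in a sub so that it can be applied to any slurped file. This is a sketch under assumptions: strip_schema is a hypothetical name, and the pattern is extended to accept the two-word PACKAGE BODY, which the one-liner above does not cover.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the quoted schema prefix and the quotes around
# the object name in every CREATE [OR REPLACE] declaration found in $text.
sub strip_schema {
    my ($text) = @_;
    $text =~ s{^(create\s+(?:or\s+replace\s+)?(?:package\s+body|\w+)\s+)"[^"]+"\."([^"]+)"}{$1$2}mig;
    return $text;
}

my $ddl = <<'END_DDL';
CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
p_mode IN VARCHAR2 DEFAULT 'w'
)
RETURN NUMBER
IS
END_DDL

print strip_schema($ddl);
```

To apply it to a file, read the whole file into one scalar first (for example with local $/ set to undef), call strip_schema on it, and write the result back.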