Regular expression handling in elsif block in perl - perl

GMF File:
TSTARTCUSTEVSUMMROW_GPRS
CUSTEVSUMMROW_GPRS GPRS - Subscriber Package (Paygo)|93452|MB|240|33952
CUSTEVSUMMROW_GPRS GPRS - MBB Plan (Paygo)|93452|MB|160|20128
TENDCUSTEVSUMMROW_GPRS
TSTARTCUSTEVSUMMROW_GPRS_SIMPLE
CUSTEVSUMMROW_GPRS_SIMPLE GPRS - LTE Roam Package|1529551|MB|85|260536
CUSTEVSUMMROW_GPRS_SIMPLE GPRS - LTE Roam Package|65461|MB|20000|1309252
TENDCUSTEVSUMMROW_GPRS_SIMPLE
Code:
if ( $line =~ m/^(CUSTEVSUMMROW_SIMPLE|CUSTEVSUMMROW_GPRS_SIMPLE|CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW|CUSTPRODSUMMROW)\s(.*?)\|.*\|(.*?)$/) {
$tag = $1;
$lineTxt = $2;
$amt = $3;
if ( $tag =~ m/^(CUSTEVSUMMROW|CUSTEVSUMMROW_SIMPLE)/ ) {
print "Processing some validations";
} else {
Print " Mapping failed";
} elsif ( $tag =~ m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW_GPRS_SIMPLE)/ ) {
if () {
#It has to do some validations.
} else {
#Failed;
}
}
}
When I try to process the elseif condition is not able to process. Could you please help me out in solving this issue?
Output:
Unable to map:CUSTEVSUMMROW_GPRS | GPRS - Data Only LTE Package Roaming | 34646.2272
Unable to map:CUSTEVSUMMROW_GPRS | GPRS - LTE Dealer1 Package Roaming | 34609.3312
Unable to map:CUSTEVSUMMROW_GPRS_SIMPLE | GPRS - Simple Subscriber Package 3 | 32.1899
Unable to map:CUSTEVSUMMROW_GPRS_SIMPLE | GPRS - Simple Talk and Text Package | 0.2702

I would recommend a change of approach. Rather than individually matching specific parts of the line, and having to do this over and over again, tokenize it at the start. That is, split it into grammatical pieces. Once the parsing is out of the way, it will be much easier to work with.
An example from English, to parse things like "Go to the store", "You go to the store", "I went to the store", "We are going to the store", you could search for go|going|went at various positions, or you can break it up into subject (go), verb (you), object (store) and then work with them.
It looks like you'e got a | delimited set of fields (your post conflicts on this detail, adjust as necessary). Split on that pipe to tokenize.
my($tag, $description, $amount, $units, $limit, $something) = split m{\|}, $line;
Now you can work with $tag without having to do further parsing on the whole line.
if( $tag eq 'CUSTEVSUMMROW' or $tag eq 'CUSTEVSUMMROW_SIMPLE' ) {
...
}
elsif( $tag eq 'CUSTEVSUMMROW_GPRS' or 'CUSTEVSUMMROW_GPRS_SIMPLE' ) {
...
}
You can make the code simpler by pushing the tag logic into a subroutine.
sub is_tag_of_type {
my($tag, $type) = #_;
return 1 if $type eq 'GPRS' and $tag =~ /GPRS/;
return 1 if $type eq 'SIMPLE' and $tag =~ /SIMPLE/;
...
}
Or maybe the tag has its own little grammar and can be split into tokens.
sub tokenize_tag {
my $tag = shift;
my #tokens = split /_/, $tag;
return map { $ _ => 1 } #tokens;
}
Then your code to process a line looks like this.
my($tags, $description, $amount, $units, $limit, $something) = split m{\|}, $line;
my %tags = tokenize_tags($tags);
if( $tags{GPRS} ) {
...
}
else {
...
}

Writing if ... else ... elsif is a syntax error -- your code won't even run. Assuming that the elsif should be between the if and the else you have another problem: the regex in the if condition is more general than the one in the elsif condition. CUSTEVSUMROW will match anything that CUSTEVSUMMROW_GPRS or CUSTEVSUMMROW_GPRS_SIMPLE would. Swap the if and elsif blocks so that the specific check happens before the general one.
if ($tag =~ /^CUSTEVSUMMROW_GPRS/) {
...
}
elsif ($tag =~ /^CUSTEVSUMMROW/) {
...
}
else {
...
}

Related

compare the value of same items with one exception case

A,food,75
B,car,136
A,car,69
B,shop,80
A,house,179
B,food,75
C,car,136
ECX5,flight,50
QC4,train,95
C,food,85
B,house,150
D,shop,80
EAX5,flight,50
QA4,train,75
F,movie,
It should do comparison between the values of same type (wherever the 2nd column matches) and Print the differ .Now i want output to look like:
**A,food,75 is not matching with B,food,75 C,food,85
A,car,69 is not matching with C,car,136 B,Car,136
A,house,179 is not matching with B,house,150
QC4,train,95 is not matching with QA4,train,75
F,movie missing value
Code I've written is below but its not printing the format the way I want.
while (FILE) {
my $line = $_ ;
my #lines = split /,/, $line ;
$data{$lines[1]}{$lines[0]} = $lines[2] ;
}
foreach my $item (keys %val) {
foreach my $letter1 (keys %{$val{$item}}) {
foreach my $letter2 (keys %{$val{$item}}) {
if ( ($val{$item}{$letter1} != $val{$item}{$letter2}) && ($letter1 ne
$letter2) && ( (!defined $done{$item}{$letter1}{$letter2}) ||
(!defined
$done{$item}{$letter2}{$letter1}) ) ) {
print "$item : $letter1, $val{$item}{$letter1}, $letter2 ,
$val{$item}
{$letter2}\n" ;
}
}
Really hard to follow the logic of your code.
But I seem to get the desired result with this:
[Edit] The code was edited as per the comments
use strict;
use warnings;
my (%hash,
);
while(my $line=<DATA>) {
chomp $line;
my ($letter, $object, $number)=split /,/, $line;
### here we are dealing with missing values
if ($number =~ m{^\s*$}) {
print $line, "missing value\n";
next;
}
### here we dissever exceptional items apart from the others
if ($letter=~m{^E[AC]X\d$}) {
$object = "exceptional_$object";
}
$number+=0; # in case there is whitespace at the end
push #{$hash{$object}{$number}}, [$letter,$number,$line];
}
for my $object(sort keys %hash) {
my $oref = $hash{$object};
if (1==keys %$oref) {
next;
}
my $str;
for my $item (values %$oref) {
$str .= $str ? " $item->[0][2]" : "$item->[0][2] is not matching with";
}
print ($str,"\n");
}
__DATA__
A,food,75
B,car,136
A,car,69
B,shop,80
A,house,179
B,food,75
C,car,136
ECX5,flight,50
ECX4,train,95
C,food,85
B,house,150
D,shop,80
EAX5,flight,50
EAX4,train,75
F,movie,
output
F,movie,missing value
A,car,69 is not matching with B,car,136
EAX4,train,75 is not matching with ECX4,train,95
C,food,85 is not matching with A,food,75
A,house,179 is not matching with B,house,150
What the algorithm does:
Looping through the input we remember the all the lines for each unique pair of object and number.
After going through the input loop we do the following:
For each object we skip it if it has no different numbers:
if (1==keys %$oref) {
next
}
If it has, we build an output string from a list of the first remembered lines for that object and number (that is we omit the duplicates for the object and number);
the first item from the list amended with "is not matching with".
Also, I am reading from the special filehandle DATA, which accesses embedded data in the script. This is for convenience of demoing the code

Extra HASH() reference added to Perl hash output

I'm trying to read a FORTRAN program using Perl, and remove an INCLUDE command and replace it with a USE.
This is working great, except when printing out the contents of the hash storing the existing USE statements, I get an extra entry.
my #lines = ( );
my %uses = { };
foreach $f1line (<>) {
$f1line =~ s/\r[\n]*//g;
if ($f1line =~ /^\s*INCLUDE 'FILE1.CMN'/ ||
$f1line =~ /^\s*INCLUDE 'FILE2.CMN'/ ||
$f1line =~ /^\s*INCLUDE 'FILE3.CMN'/) {
$f1line = " USE My_Mod";
}
if ($f1line =~ /^\s*USE /) {
$uses{$f1line} = 1;
}
push #lines, $f1line . $/;
}
$found = 0;
foreach $line (#lines) {
if ($found == 0 && $line =~ /^\s*USE /) {
foreach my $x (sort(keys(%uses))) {
print $x . $/; # (1)
}
$found = 1;
} elsif ($found == 1 && $line =~ /^\s*USE /) {
next;
} else {
print $line;
}
}
The output is this:
C Include parameters here
USE My_Mod
USE MyOther_Mod
USE EvenAnother_Mod
HASH(0x7f9dc3805ce8)
Where is the HASH(0x...) reference coming from? The only place I'm printing the contents of the hash is on line (1). It almost looks like iterating over the keys of the hash is somehow including the hash itself.
How can I get rid of this?
You are not really having a big problem, the big deal here is that you are not able to see the errors you are doing.
That's why you should always strict and warnings
In your code you start with:
my %uses = { };
When it should be:
my %uses = ();
or
my %uses; #it's fine also
And then it will works.
By using {} in a "hash" context you could create a hashref which is not the case.
A reference to an anonymous hash can be created using curly brackets:
$hashref = {
'Adam' => 'Eve',
'Clyde' => 'Bonnie',
};
Also is a good practice declare your variables in foreach loop like:
foreach my $line (#lines) {
And in the rest of your code.

How to extract some specific information from a file in perl?

## some lines
## cell (a) { area : 0.898; power: 0.867;
....(some parameters values)
}
pin(a1) { power: 0.767; (some more parameters specific to pins)
timing() {
## again some parameters value....
}
My file contains approx 300 such cells and these cells are in between the files. I want to parse the file and what to know all the variable parameters, i tried following code but it is of no use
while (defined($line=<$fh>)) {
if ($line =~ /cell \(\w+\) \{/../cell \(\w+\) \{/) {
print $result "$line \n";
}
}
I want to get the values inside { } also but , dont know how to get as i have parenthesis inside parenthesis in my code. Please help.
Thank u all for the help..I wrote a code to take into account of scalar attributes(ignoring all the attributes inside parenthesis.) BUt I am facing a very weird problem. I am facing problem with if ($line =~ /cell (\w/../cell (\w/) in my code. For the first file, it detects the line which has cell ( field and starts from there, but for the second file it starts from the first line itself.
open $result_file1, ">", "file1.txt";
open $result_file2, ">", "file2.txt";
open $fl1, $file1; open $fl2, $file2;
sub file_reader {
($fh, $indx) = #_;
$count = 0;
undef #temp; undef #pram;
while (defined($line=<$fh>)) {
if ($line =~ /cell \(\w/../cell \(\w/) {
if ($indx == "1") {print $result_file1 "$line\n";}
if ($indx == "2") {print $result_file2 "$line\n";}
if ($line =~ /cell \(\w/) {
#temp = split (' ', $line);}
if ($line =~ /\{/) {
$count += 1;}
if ($line =~ /\}/) {
$count = $count - 1; }
if (($line =~ /:/) and ($count == 1)) {
#pram = split (':', $line);
if ($indx == "1") {$file1{$temp[1]}{#pram[1]} = #pram[2];}
elsif ($indx == "2") { $file2{$temp[1]}{#pram[1]} = #pram[2];}
} }}
close $fh;}
file_reader($fl1, "1");
file_reader($fl2, "2");
A piece of output of file1 :
cell (AND2X1) {
cell_footprint : "AND2X1 ";
area : 7.3728 ;
cell_leakage_power : 3.837209e+04;
driver_waveform_rise : "preDrv";
driver_waveform_fall : "preDrv";
pg_pin (VDD) {
voltage_name : "VDD";
pg_type : "primary_power";
}
pg_pin (VSS) {
voltage_name : "VSS";
pg_type : "primary_ground";
}
.......
A piece of output of file2:
/**********************************************************************
**** ****
**** The data contained in the file is created for educational ****
**** and training purposes only and are not recommended ****
**** for fabrication ****
**** ****
***********************************************************************
**** ****
Why it is not able to apply that range if condition for my second file?
I have to guess at your input data and therefor am not very sure about the problem and the goal.
But anyway, try changing your line
if ($line =~ /cell \(\w/../cell \(\w/) {
to
if ($line =~ /cell \(\w/.. $line =~/cell \(\w/) {
Otherwise the second regex will be matched against the uninitialised "$_".
I found this out by
use strict;
use warnings;
which is one of my favorite tools.
By the way, you made me aware of this use of the range operator, which I find fascinating. Thanks.

what aren't I getting here?

This one really has me confused and I don't know how to accurately title it.
I am writing a program, the purpose is irrelevant, but some of you may know as I've been asking a few questions about it recently.
I'm going to post the entire program but I don't think that's necessary. The part you need to look at is the nested loops where it says "beginning search algorithm."
The program takes a long time to complete, so after every iteration of the outermost loop, I print a '.'. after every 7 dots a new line is printed.
for some reason, however no dots will get printed, until a newline is printed.
heres the code:
#!/usr/bin/perl
use v5.14;
use warnings;
# this is a cgi implementation of a theorum proover.
# the program uses resolution refutation, using a breadth-first and set of support strategy
# to generate a proof(if possible) and relay the results to the user.
########################################################################################
#Algorithm:
#1.) Get size(i) of knowledge base
#2.) untill you have i clauses
# 3.) get the clause, add to knowledge base
#4.) get the conclusion variable(conjecture)
#5.) add the negation of the conjecture to the knowledge base
#6.) add the negation of the conjecture to the SOS set.
#7.) compare the SOS set to ever other clause
# 8.) if resolution is possible, add the new clause to the knowledge base if it does not already exist.
# 9.) add the new clause to the SOS set.
#10.) repeat 7-9 untill the null clause is generated or no more resolution is possible.
########################################################################################
my $conclusion;
my $conclusion2;
my #conclusion;
my #SOS;
my #clauses;
my $found=0;
#batch mode
if($ARGV[0])
{
my $filename = $ARGV[0];
open(IN, "<", $filename);
chomp(#clauses=<IN>);
close(IN);
for(#clauses)
{
$_ =~ s/[^A-Za-z~,]//g;
}
#negate the negation to get the desired conclusion for later
$conclusion2=$clauses[$#clauses];
print "$conclusion2";
#conclusion = split("", $conclusion2);
if($conclusion[0] eq '~')
{
splice(#conclusion, 0, 1);
$found=1;
}
if (!$found)
{
$conclusion = "~$conclusion2";
}
else
{
$conclusion = join("", #conclusion);
}
#now break up each line and make #clauses 2d
$_ = [split /,/ ] for #clauses;
}
#interactive mode
else
{
my $count=0;
say "Welcome to my Theorum Proover!";
say "How many clauses are in your knowledge base?";
say "(this does not include the conclusion)";
print "Amount: ";
my $amt = <>;
say "Enter your clauses: ";
say "Negations can be indicated with a '~'.";
say "Variable names must contain only letters.";
say "Separate each literal with a ','<br>";
my $clauses;
while($count < $amt)
{
print "clause $count:";
$clauses .= <>;
$clauses =~ s/[^A-Za-z~,]//g;
$clauses .= ";";
$count++;
print "\n";
}
print "\n \n \n Enter the conclusion, your conclusion should be a literal:";
$conclusion = <>;
$conclusion =~ s/[^A-Za-z~]//g;
print "\n";
#negate the conclusion and add it to the set of clauses.
#conclusion = split("", $conclusion);
if($conclusion[0] eq '~')
{
splice(#conclusion, 0, 1);
$found=1;
}
if (!$found)
{
$conclusion2 = "~$conclusion";
}
else
{
$conclusion2 = join("", #conclusion);
}
# split up the contents of the clause string and add them to a 2d array.
#then, add the negated conclusion to the list.
my #PartClauses= split(';', $clauses);
my $last=#PartClauses;
for my $i (0 .. $#PartClauses)
{
my #tmp=split(',', $PartClauses[$i]);
for my $j (0 .. #tmp)
{
$clauses[$i][$j] = $tmp[$j];
}
}
$clauses[$last][0] = $conclusion2;
}
open(RESULTS, ">", 'results.txt');
for my $i (0 .. $#clauses)
{
print RESULTS "clause $i: {";
for my $j (0 .. $#{ $clauses[$i] })
{
print RESULTS "$clauses[$i][$j]";
if($j != $#{ $clauses[$i] })
{
print RESULTS ",";
}
}
print RESULTS "}\n";
}
print RESULTS "_____________________________\n";
print "Beginning search ....";
##################################################
#begin breadthfirst/sos search/add algorithm
$SOS[0][0]=$conclusion2;
my $cSize=$#clauses;
say "\nworking......";
my $sAdd=0;
my $cAdd=0;
my $res=0;
my $flag=0;
my $dots=0;
SOSROW:
for (my $a=0; $a<=$#SOS; $a++)
{
&update;
CLAUSEROW:
for (my $i=0; $i<=$#clauses; $i++)
{
SOSCOL:
for (my $b=0; $b<=$#{ $SOS[$a] }; $b++)
{
CLAUSECOL:
for my $j (0 .. $#{ $clauses[$i] })
{
if($SOS[$a][$b] eq "~$clauses[$i][$j]"
|| $clauses[$i][$j] eq "~$SOS[$a][$b]")
{
my #tmp;
#found a resolution, so add all other literals from
#both clauses to each set as a single clause
#start with the SOS literals(use a hash to keep track of duplicates)
my %seen;
for my $x (0 .. $#{ $SOS[$a] })
{
if($x != $b)
{
$seen{$SOS[$a][$x]}=1;
push #tmp, "$SOS[$a][$x]";
}
}
#now add the literals from the non-SOS clause
for my $y (0 .. $#{ $clauses[$i] })
{
if($y != $j)
{
if(! $seen{ $clauses[$i][$y] })
{
push(#tmp, "$clauses[$i][$y]");
}
}
}
#check to see if the clause is already listed
my $dupl = 0;
my #a1 = sort(#tmp);
my $s1 = join("", #a1);
MATCH:
for my $i (0 .. $#clauses)
{
my #a2= sort(#{ $clauses[$i] });
my $s2= join("", #a2);
if($s1 eq $s2 )
{
$dupl = 1;
last MATCH;
}
}
#if it isn't, go ahead and add it in
if(! $dupl)
{
$res++;
$sAdd++;
$cAdd++;
my $s = $cSize + $cAdd;
push(#SOS, \#tmp);
push(#clauses, \#tmp);
#print out the new clauses.
print RESULTS"clause $s: ";
my $clause = $cSize+$a;
print RESULTS "{";
if($SOS[$sAdd][0])
{
for my $j(0 .. $#{ $clauses[$s] })
{
if($clauses[$s][$j])
{
print RESULTS "$clauses[$s][$j]";
}
if($j!= $#{ $clauses[$s] })
{
print RESULTS ",";
}
}
print RESULTS "} ($i,$clause)\n";
}
#if you found a new res, but there was nothing to push, you found
# the contradiction, so signal and break.
else
{
print RESULTS "} ($i,$clause)\n";
$flag=1;
last SOSROW;
}
}
}
}
}
}
}
close(RESULTS);
if($flag)
{
say "After $res resolutions, a resolvent was found and the empty set was generated.";
say "This indicates that when '$conclusion' is false, the entire knowledge base is false.";
say "Because we know that the clauses in the knowledge base are actually true, we can soundly conclude that '$conclusion must also be true.";
say "The clauses generated by each resolution can be found below.\n\n";
}
else
{
say "We were not able to generate the empty clause.";
say "this means that adding the negation of the desired conclusion does not render the theorum false.";
say "Therefore, we can not safely conclude that '$conclusion' is true.";
say "Any clauses that we were able to generate through a resoluton can be viewed below.\n\n";
}
print `more results.txt`;
sub update
{
if((($dots % 7) == 0))
{
print "\n";
}
if($dots==14)
{
print "You might want to get some coffee.\n";
}
if($dots==35)
{
print "I'm being VERY Thorough.\n";
}
if($dots==63 || $dots==140)
{
print "Hows that coffee?\n";
}
if($dots==105)
{
print "I think it might be time for a second cup of coffee\n"
}
if($dots==210)
{
print "Like I said, VERY thorough\n";
}
if($dots==630)
{
print "My O is bigger than you can imagine\n"
}
$dots++;
print ".";
}
I can't figure out why this is happening. could it have something to do with buffering?
If instead of calling the subroutine, i just say print "."; nothing will be printed until, the prog finishes execution.
Yes, filehandles are buffered by default. If STDOUT points to a terminal it will be line-buffered (nothing is output until a newline is printed), otherwise it will be block-buffered (nothing is output until a certain number of bytes is printed). The easiest way to change that is to set $|=1, which will make the current output filehandle (usually STDOUT unbuffered), so it will flush after every print.

Parsing through a file using perl and string concatenation

I'm trying to parse through the file and collecting types of mismatches in the different modules and generating an xls. Below is the report pattern i need to parse( but actual report pattern is not simple as below):
outsocket ports in design1 not in design2
a
b
Insocket ports in design1 not in design2
g
h
There can be many design name(pushed to module list using $module) in the log but this pattern will not change.(Insocket/outsocket/othertype ports in <> not in <>)
Below is my core code.I'm facing issue with this code and it is not working(especially string concatenation) as expected please help me fix this.
while ($line = <FH>) {
if ( $line =~ /Insocket(\s*)ports(\s*)in(\s*)${design1}(\s*)not(\s*)in(\s*)${design2}/) {
$mismatch_type = "type_i_n";
}
elsif ($line =~ /Insocket(\s*)ports(\s*)in(\s*)${design2}(\s*)not(\s*)in(\s*)${design1}/) {
$mismatch_type = "type_i_r";
}
elsif ( $line =~ /outsocket(\s*)ports(\s*)in(\s*)${design2}(\s*)not(\s*)in(\s*)${design1}/ ) {
$mismatch_type = "type_o_n";
}
elsif ( $line =~ /outsocket(\s*)ports(\s*)in(\s*)${design1}(\s*)not(\s*)in(\s*)${design2}/ ) {
$mismatch_type = "type_o_r";
}
$result = $mismatch_type . "_code_ " . $module;
$$result = $$result .$line;
if(!present_in_list ($module,#module_list)) {
push #module_list,$module;
}
push #mismatch_type_list,$mismatch_type;
}#PARSING OF FILE ENDS
#NOW PROCESSING THE RESULTS BASED ON PARSING
foreach $module (#module_list) {
foreach $each_mismatch (#mismatch_type_list) {
$result = $mismatch_type . "_code_ " . $module;
print FH2" $$result,";
}
}
Here are a couple of guesses and general suggestions. I'm not sure this will fix your problem, but it will probably help:
The first two regexes have an extra } in ${design1}} and ${design2}}. Not sure if this is present in your original program or if it's an error that was introduced in posting here.
It looks like you're using string concatenation to build up the name of a $result variable, so you can then use that string as the name of another variable like $type_i_n_code_module_a. I strongly recommend that to reduce confusion, that you use a hash instead:
%result = ();
while ($line = <FH>) {
...
$result_key = $mismatch_type . "_code_ " . $module;
$result{$result_key} = $result{$result_key} . $line;
}
...
foreach $module (#module_list) {
foreach $each_mismatch (#mismatch_type_list) {
$result_key = $mismatch_type . "_code_ " . $module;
print FH2" $result{$result_key},";
}
}