Perl regex logic error - perl

I'm rather new to regular expressions; I only know the basic regex I use in bash scripts. The following snippet of code is from a program I'm writing to automatically update WordPress plugins on the server.
Anyway, the concept is that this piece of code is part of a subroutine which goes through the .php files in a directory and tries to pattern match lines starting with "Version:", "version:", "* Version:" etc. in each file. If the pattern is matched, another sub then tries to extract the value following the ":" character to get the correct version number.
$searchpath=$path."/".$plugins[$i];
@files = <$searchpath/*.php>;
print "Search path is ".$searchpath."\n";
OUT: foreach $file (@files)
{
print "Checking alternate php file: ".$file."\n";
open(txt, $file);
while($line = <txt>)
{
for ($line)
{
s/^\s+//;
s/\s+$//;
}
if ( $line =~ /^Version:|^version:|^\* Version:|\sVersion:/ )
{
print "Version found in file ".$file."\n";
$varfound=1;
close(txt);
$ver=&read_extract($file);
print $ver."\n";
$pluginversion[$i]=$ver;
print "Array Num ".$i." Stored plugin name:".$plugins[$i]." Version found ".$ver." Version stored ".$pluginversion[$i]."\n";
last OUT;
}
}
}
The issue is that I seem to have an error in my logic: for the file that matches, the version that is extracted and stored is " . phpversion() . "\n"; rather than a version number. With my limited knowledge I find it difficult to understand what's wrong, and would be eager for some advice.
The other subs referred to are included below:
sub read_extract
{
my $pl_version="";
open(txt, my $file=$_[0]);
while($line = <txt>)
{
for ($line)
{
s/^\s+//;
s/\s+$//;
}
if ( $line =~ /^Version:|^version:|^\* Version:|\sVersion:/ )
{
$pl_version=&extract_version($line);
}
}
close(txt);
$pl_version;
}
sub extract_version
{
my $line=$_[0];
$string=substr($line,rindex($line, ":")+1);
for ($string)
{
s/^\s+//;
s/\s+$//;
}
$string;
}
If my subroutine is required in full, I can include it. However my debug lines show this:
Processing xcloner-backup-and-restore...Search path is /var/www/virtual/joel.co.in/vettathu.com/htdocs/wp-content/plugins/xcloner-backup-and-restore
Checking alternate php file: /var/www/virtual/joel.co.in/vettathu.com/htdocs/wp-content/plugins/xcloner-backup-and-restore/admin.cloner.html.php
Checking alternate php file: /var/www/virtual/joel.co.in/vettathu.com/htdocs/wp-content/plugins/xcloner-backup-and-restore/admin.cloner.php
Checking alternate php file: /var/www/virtual/joel.co.in/vettathu.com/htdocs/wp-content/plugins/xcloner-backup-and-restore/admin.xcloner-backupandrestore.php
Checking alternate php file: /var/www/virtual/joel.co.in/vettathu.com/htdocs/wp-content/plugins/xcloner-backup-and-restore/admin.xcloner.php
Checking alternate php file: /var/www/virtual/joel.co.in/vettathu.com/htdocs/wp-content/plugins/xcloner-backup-and-restore/cloner.config.php
Checking alternate php file: /var/www/virtual/joel.co.in/vettathu.com/htdocs/wp-content/plugins/xcloner-backup-and-restore/cloner.cron.php
Checking alternate php file: /var/www/virtual/joel.co.in/vettathu.com/htdocs/wp-content/plugins/xcloner-backup-and-restore/cloner.functions.php
Version found in file /var/www/virtual/joel.co.in/vettathu.com/htdocs/wp-content/plugins/xcloner-backup-and-restore/cloner.functions.php
" . phpversion() . "\n";
Array Num 26 Stored plugin name:xcloner-backup-and-restore Version found " . phpversion() . "\n"; Version stored " . phpversion() . "\n";
which seems to be where the error is.

Well, that's a lot of redundant code there. If you have the line already, why do you need to close the file and find the line again? All you need to do is capture the string when you find the line:
if ( $line =~ /^\*?\s?Version:(.*)/i ) {
my $version = $1;
# ... store $version and leave the loop, as before
}
So, by using the /i modifier, your match is case insensitive. By placing ? after \* and \s they can match 0 or 1 time. By using (.*) the rest of the line is captured to $1.
Your regex was lacking a ^ beginning of line anchor in the last match, which I assumed was a typo. If not, you can simply change the regex to /\bVersion:(.*)/i. And the \b is only useful for avoiding partial matches, such as subversion: foo.
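As a rough sketch of the capture approach described above: the helper name version_from_line and the PHP line used for comparison are assumptions for illustration (the actual line in cloner.functions.php is not shown in the question). It suggests why the unanchored \sVersion: alternative could pick up text inside a PHP string, while the anchored pattern does not.

```perl
use strict;
use warnings;

# Hypothetical helper: return the version text from one line,
# or nothing if the line is not a "Version:" header line.
sub version_from_line {
    my ($line) = @_;
    $line =~ s/^\s+//;    # trim leading whitespace, as in the question
    $line =~ s/\s+$//;    # trim trailing whitespace
    if ( $line =~ /^\*?\s?Version:(.*)/i ) {
        my $version = $1;
        $version =~ s/^\s+//;    # drop the space after the colon
        return $version;
    }
    return;
}

# A typical plugin header line is matched and the value captured.
print version_from_line(' * Version: 3.0.1'), "\n";

# Assumed example of the kind of line that the original, unanchored
# pattern (\sVersion:) would have matched by mistake.
my $php_line = 'echo "PHP Version: " . phpversion() . "\n";';
print defined version_from_line($php_line) ? "matched\n" : "no match\n";
```

Because the capture happens in the same match that finds the line, there is no need to reopen the file or search for the last ":" separately.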

Related

Export and import Data from Quickbase to csv file with perl script

I am trying to use Perl to interact with Quickbase. I used the query below to export a data table into a text file, but I am not getting the format I want. Any thoughts? Or is there another language that makes it easier to interact with Quickbase?
@records = $qdb->doQuery($dbid,"{0.CT.''}","6.7.8.9");
$record_count = @records;
foreach $record (@records) {
print MYFILE "|";
foreach $field (keys %$record){
if ($field eq "ColumnA") {
print MYFILE "\"";
print MYFILE " $field : $record->{$field}";
print MYFILE "\"";
}
if ($field eq "ColumnB") {
print MYFILE "\"";
print MYFILE "$field : $record->{$field}";
print MYFILE "\"";
}
if ($field eq "ColumnC") {
print MYFILE "\"";
print MYFILE "$field : $record->{$field}";
print MYFILE "\"";
}
if ($field eq "ColumnD") {
print MYFILE "\"";
print MYFILE "$field : $record->{$field}";
print MYFILE "\"";
}
}
print MYFILE "\n";
}
close LOGFILE;
I wonder what kind of answer you are looking for. But...
I am trying to use Perl to interact with Quickbase,
That's great. Perl is very powerful and suitable for (nearly) any task.
I used the below query to export a data table into a text file
The code is not very concise; it is probably legacy code from an Excel or BASIC person. Some comments:
The code does the same thing for every field, so why do you need the if statements?
Also, why break each print into three separate prints?
Why do you need the | at the beginning of the line?
You probably want to close MYFILE instead of LOGFILE.
Other points:
It is strange to print field_name: field_value into every cell instead of creating a column header, but YMMV, so maybe you need this.
It is better to use lexical filehandles, like $myfile instead of MYFILE.
The foreach could be written as for :)
but I am not getting right format I want, any thoughts?
I'm unable to say anything about the format you want, mainly because:
you didn't say what format you want to get
and, unfortunately, my crystal ball is in for scheduled maintenance. :)
Or if there is another language easier to interact with Quickbase?
Probably not.
QuickBase has an API for access (you can learn about it here), and every language (using some library) just acts as a bridge to it. For Perl it is the HTTP::QuickBase module. Did you read the docs?
Perl is extremely powerful, so anyone can write very concise code. You just need to learn the language (as with any other one). (Unfortunately, I am also closer to a beginner than to an expert.)
The above code could be reduced to:
for my $record ($qdb->doQuery($dbid,"{0.CT.''}","6.7.8.9")) {
print MYFILE '|',
join('|', map {
'"' . $_ . ': ' . $record->{$_} . '"'
} keys %$record
), "\n";
}
It does nearly the same as the code above, except that it prints every field and separates the cells with |.
But I have to say, it is still the wrong solution. For example:
you need to cope with quoting, e.g. the "cell content".
the cell contents could also contain the " character, so you need to escape it. There are several escaping techniques for CSV files; one of them is doubling the quote character (usually the "), another is prepending it with \. And there are many more possible problems, like newline characters \n in the cells and so on.
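To make the quote-doubling technique concrete, here is a minimal sketch. The helper names csv_quote and csv_row are made up for illustration, and this is only meant to show the idea; it is not a substitute for a tested CSV module.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap one cell in double quotes and double any
# embedded double quote, which is one common CSV escaping convention.
sub csv_quote {
    my ($cell) = @_;
    $cell = '' unless defined $cell;    # treat undef as an empty cell
    $cell =~ s/"/""/g;
    return qq{"$cell"};
}

# Hypothetical helper: build one row from a separator and a list of cells.
sub csv_row {
    my ($sep, @cells) = @_;
    return join $sep, map { csv_quote($_) } @cells;
}

# The second cell contains quotes, the third an embedded newline.
print csv_row('|', 'plain', 'say "hi"', "two\nlines"), "\n";
```

Even this small version has to decide how to treat undef and embedded newlines, which hints at how many edge cases a hand-rolled writer accumulates.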
To avoid CSV quoting/escaping hell and other possible problems with CSV generation, you should use the Text::CSV module. It has been developed over the last 20 years, so it is a very thoroughly tested module. You could use it like this:
use Text::CSV;
use autodie;
my $csv = Text::CSV->new ( { sep_char => '|', binary => 1, eol => "\n" } ) # sep_char only if you really want '|' instead of the standard comma
or die "Cannot use CSV: ".Text::CSV->error_diag ();
open my $fh, '>', 'some.csv';
# @records as returned by doQuery in the question; the hash slice gives one row of field values per record
$csv->print( $fh, [ @{$_}{ sort keys %$_ } ] ) for @records;
close $fh;
Of course, the code is not tested. So, what next?
learn about the quickbase API module
learn about and install the Text::CSV module
read some tutorials and docs about the Perl language itself.

Perl: How to add a line to sorted text file

I want to add a line to a text file in Perl whose data is kept in sorted order. I have seen examples which show how to append data at the end of the file, but I want the data to stay sorted.
Please guide me on how this can be done.
Basically, this is what I have tried so far:
(I open the file and grep its content to see if the line which I want to add already exists. If it does, I exit; otherwise I want to add it to the file such that the data remains sorted.)
open(my $FH, $file) or die "Failed to open file $file \n";
@file_data = <$FH>;
close($FH);
my $line = grep (/$string1/, @file_data);
if($line) {
print "Found\n";
exit(1);
}
else
{
#add the line to the file
print "Not found!\n";
}
Here's an approach using Tie::File so that you can easily treat the file as an array, and List::BinarySearch's bsearch_str_pos function to quickly find the insert point. Once you've found the insert point, you check to see if the element at that point is equal to your insert string. If it's not, splice it into the array. If it is equal, don't splice it in. And finish up with untie so that the file gets closed cleanly.
use strict;
use warnings;
use Tie::File;
use List::BinarySearch qw(bsearch_str_pos);
my $insert_string = 'Whatever!';
my $file = 'something.txt';
my @array;
tie @array, 'Tie::File', $file or die $!;
my $idx = bsearch_str_pos $insert_string, @array;
# $idx can be one past the last element when the string sorts after every line
splice @array, $idx, 0, $insert_string
if $idx > $#array || $array[$idx] ne $insert_string;
untie @array;
The bsearch_str_pos function from List::BinarySearch is an adaptation of a binary search implementation from Mastering Algorithms with Perl. Its convenient characteristic is that if the search string isn't found, it returns the index point where it could be inserted while maintaining the sort order.
Since you have to read the contents of the text file anyway, how about a different approach?
Read the lines in the file one-by-one, comparing against your target string. If you read a line equal to the target string, then you don't have to do anything.
Otherwise, you eventually read a line 'greater' than your target string according to your sort criteria, or you hit the end of the file. In the former case, you just insert the string at that position, and then copy the rest of the lines. In the latter case, you append the string to the end.
If you don't want to do it that way, you can do a binary search in @file_data to find the spot to add the line without having to examine all of the entries, then insert it into the array before outputting the array to the file.
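The line-by-line approach can be sketched as follows. The helper name insert_sorted is hypothetical, and it works on a list of already-chomped lines rather than on a file, so reading and writing the file are left out; plain string comparison (eq/gt) is assumed to be the sort criterion.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines with $target inserted at its
# sorted position, unless an equal line is already present.
sub insert_sorted {
    my ($target, @lines) = @_;
    my @out;
    my $done = 0;
    for my $line (@lines) {
        if (!$done) {
            if ($line eq $target) {
                $done = 1;             # already present: nothing to add
            }
            elsif ($line gt $target) {
                push @out, $target;    # first greater line: insert before it
                $done = 1;
            }
        }
        push @out, $line;              # every original line is copied through
    }
    push @out, $target unless $done;   # reached the end: append
    return @out;
}

print "$_\n" for insert_sorted('lemon', qw(ananas apple banana pear));
```

A single pass like this costs one comparison per line until the insertion point; the binary search variant only pays off when the lines are already held in an array.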
Here's a simple version that reads from stdin (or filename(s) specified on the command line) and appends 'string to append' to the output if it's not found in the input. Output is printed on stdout. Note that it always appends at the end rather than at the sorted position.
#! /usr/bin/perl
$found = 0;
$append='string to append';
while(<>) {
$found = 1 if (m/$append/o);
print
}
print "$append\n" unless ($found);
Modifying it to edit a file in-place (with perl -i) and taking the append string from the command line would be quite simple.
A 'simple' one-liner to insert a line without using any module could be:
perl -ni -le '$insert="lemon"; $eq=($insert cmp $_); if ($eq == 0){$found++}elsif($eq==-1 && !$found){print $insert; $found++} print; print $insert if !$found && eof' list.txt
Given a list.txt whose content is:
ananas
apple
banana
pear
the file then contains:
ananas
apple
banana
lemon
pear
{
local ($^I, @ARGV) = ("", $file); # Enable in-place editing of $file
my $done = 0;
while (<>) {
# The first line that is equal to or greater than $insert decides the outcome
if (!$done && $_ ge $insert) {
# Only insert if the line is not already there
print $insert if $_ gt $insert;
$done = 1;
}
print;
# Reached the end of the file without finding a place: append
print $insert if !$done && eof;
}
}
Essentially the same answer as pavel's, except with a lot of readability added. Note that $insert should already contain a trailing newline.

Perl comparison operation between a variable and an element of an array

I am having quite a bit of trouble with a Perl script I am writing. I want to compare an element of an array to a variable I have to see if they are equal. For some reason I cannot seem to get the comparison operation to work correctly. It will either evaluate to true all the time (even when outputting both strings clearly shows they are not the same), or it will always be false (even if they are the same). I have found an example of just this kind of comparison operation on another website, but when I use it, it doesn't work. Am I missing something? Is the variable type I take from the file not a string? (It can't be an integer as far as I can tell, as it is an IP address.)
$ipaddress = '192.43.2.130'
if ($address[0] == ' ')
{
open (FH, "serverips.txt") or die "Crossroads could not find a list of backend servers";
@address = <FH>;
close(FH);
print $address[0];
print $address[1];
}
for ($i = 0; $i < @address; $i++)
{
print "hello";
if ($address[$i] eq $ipaddress)
{print $address[$i];
$file = "server_$i";
print "I got here first";
goto SENDING;}
}
SENDING:
print " I am here";
I am pretty weak in Perl, so forgive me for any rookie mistakes/assumptions I may have made in my very meager bit of code. Thank you for your time.
if ($address[0] == ' ')
{
open (FH, "serverips.txt") or die "Crossroads could not find a list of backend servers";
@address = <FH>;
close(FH);
You have several issues with this code here. First, you should use strict and use warnings: strict would tell you that @address is being used before it's declared, and warnings would tell you that you're using a numeric comparison on a string.
Secondly, every element you read from the file this way still has its trailing newline, so it will never be eq to $ipaddress. You can loop through the lines of the file and chomp each address as you add it:
my @address = ();
while( my $addr = <FH> ) {
chomp($addr); # removes the newline character
push(@address, $addr);
}
However you really don't need to push into an array at all. Just loop through the file and find the IP. Also don't use goto. That's what last is for.
while( my $addr = <FH> ) {
chomp($addr);
if( $addr eq $ipaddress ) {
$file = "server_" . ($. - 1); # $. is the 1-based line number of FH; the original $i was 0-based
print $addr,"\n";
print "I got here first"; # not sure what this means
last; # breaks out of the loop
}
}
When you're reading in from a file like that, you should use chomp() when doing a comparison with that line. When you do:
print $address[0];
print $address[1];
The output is on two separate lines, even though you haven't explicitly printed a newline. That's because $address[$i] contains a newline at the end. chomp removes this.
if ($address[$i] eq $ipaddress)
could read
my $currentIP = $address[$i];
chomp($currentIP);
if ($currentIP eq $ipaddress)
Once you're familiar with chomp, you could even use:
chomp(my $currentIP = $address[$i]);
if ($currentIP eq $ipaddress)
Also, please replace the goto with a last statement. That's perl's equivalent of C's break.
Also, from your comment on Jack's answer:
Here's some code you can use for finding how long it's been since a file was modified:
use File::stat; # makes stat() return an object with an mtime method
my $secondsSinceUpdate = time() - stat('filename.txt')->mtime;
You probably are having an issue with newlines. Try using chomp($address[$i]).
First of all, please don't use goto. Every time you use goto, the baby Jesus cries while killing a kitten.
Secondly, your code is a bit confusing in that you seem to be populating @address after starting the if($address[0] == '') statement (not to mention that that if should be if($address[0] eq '')).
If you're trying to compare each element of @address with $ipaddress for equality, you can do something like the following.
Note: This code assumes that you've populated @address.
my $num_matches=0;
foreach(@address)
{
$num_matches++ if $_ eq $ipaddress;
}
if($num_matches)
{
#You've got a match! Do something.
}
else
{
#You don't have any matches. This may or may not be bad. Do something else.
}
Alternatively, you can use the grep operator to get any and all matches from @address:
my @matches=grep{$_ eq $ipaddress}@address;
if(@matches)
{
#You've got matches.
}
else
{
#Sorry, no matches.
}
Finally, if you're using a version of Perl that is 5.10 or higher, you can use the smart match operator (i.e. ~~), though note that it has been marked experimental since Perl 5.18 and deprecated since 5.38:
if($ipaddress~~@address)
{
#You've got a match!
}
else
{
#Nope, no matches.
}
When you read from a file like that you include the end-of-line character (generally \n) in each element. Use chomp @address; to get rid of it.
Also, use last; to exit the loop; goto is practically never needed.
Here's a rather idiomatic rewrite of your code. I'm excluding some of your logic that you might need, but it isn't clear why:
$ipaddress = '192.43.2.130';
open (FH, "serverips.txt") or die "Crossroads could not find a list of backend servers";
while (<FH>) { # loop over the file, using the default input space
chomp; # remove end-of-line
last if ($_ eq $ipaddress); # a RE could easily be used here also, but keep the exact match
}
$file = "server_$."; # $. is the line number - it's not necessary to keep track yourself
close(FH); # close only after using $., because close resets it
print "The file is $file\n";
Some people dislike using perl's implicit variables (like $_ and $.) but they're not that hard to keep track of. perldoc perlvar lists all these variables and explains their usage.
Regarding the exact match vs. "RE" (regular expression, or regexp - see perldoc perlre for lots of gory details) -- the syntax for testing a RE against the default input space ($_) is very simple. Instead of
last if ($_ eq $ipaddress);
you could use
last if (/$ipaddress/);
Although treating an ip address as a regular expression (where . has a special meaning) is probably not a good idea.
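If a regular expression is wanted anyway, the metacharacters in the address can be quoted with \Q...\E. The following is only a sketch; the helper name same_ip and the lookalike string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $candidate is exactly $ip.
# \Q...\E quotes the dots so they match only a literal "."; ^ and \z
# anchor the match to the whole string.
sub same_ip {
    my ($candidate, $ip) = @_;
    return $candidate =~ /^\Q$ip\E\z/ ? 1 : 0;
}

my $ipaddress = '192.43.2.130';
my $lookalike = '192x43y2z130';    # not an IP address at all

# Unquoted, each "." matches any character, so the lookalike matches.
print "unquoted pattern matches the lookalike\n" if $lookalike =~ /$ipaddress/;

# Quoted and anchored, only the exact address matches.
print "quoted pattern rejects the lookalike\n" unless same_ip($lookalike, $ipaddress);
print "quoted pattern accepts the real address\n" if same_ip($ipaddress, $ipaddress);
```

For a plain equality check like this one, eq remains the simpler choice; quoting matters mostly when the address is embedded in a larger pattern.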

Perl Regular Expressions + delete line if it starts with #

How to delete lines if they begin with a "#" character using Perl regular expressions?
For example (need to delete the following examples)
line="#a"
line=" #a"
line="# a"
line=" # a"
...
the required syntax
$line =~ s/......../..
or skip loop if line begins with "#"
from my code:
open my $IN ,'<', $file or die "can't open '$file' for reading: $!";
while( defined( $line = <$IN> ) ){
.
.
.
You don't delete lines with s///. (In a loop, you probably want next;)
In the snippet you posted, it would be:
while (my $line = <$IN>) {
if ($line =~ /^\s*#/) { next; }
# will skip the rest of the code if a line matches
...
}
Shorter forms such as $line =~ /^\s*#/ and next; and next if $line =~ /^\s*#/; are possible. (If the line is read into $_ instead, the $line =~ part can be dropped.)
perldoc perlre
/^\s*#/
^ - "the beginning of the line"
\s - "a whitespace character"
* - "0 or more times"
# - just a #
Based off Aristotle Pagaltzis's answer you could do:
perl -ni.bak -e'print unless m/^\s*#/' deletelines.txt
Here, the -n switch makes perl put a loop around the code you provide
which will read all the files you pass on the command line in
sequence. The -i switch (for “in-place”) says to collect the output
from your script and overwrite the original contents of each file with
it. The .bak parameter to the -i option tells perl to keep a backup of
the original file in a file named after the original file name with
.bak appended. For all of these bits, see perldoc perlrun.
deletelines.txt (initially):
#a
b
#a
# a
c
# a
becomes:
b
c
Program (Cut & paste whole thing including DATA section, adjust shebang line, run)
#!/usr/bin/perl
use strict;
use warnings;
while(<DATA>) {
next if /^\s*#/; # skip comments
print; # process data
}
__DATA__
# comment
data
# another comment
more data
Output
data
more data
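$text =~ s/^\s*#.*\n//mg;
That will delete every line that begins with # (after optional whitespace) in the entire text held in $text, without requiring that you loop through each line manually. The /m modifier is needed so that ^ matches at the start of every line, not just at the start of the string.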
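A minimal sketch of the whole-string approach, wrapped in a hypothetical helper named strip_comment_lines. It is a variant rather than the exact expression above: it uses [ \t]* instead of \s* so that blank lines before a comment are not swallowed, and \n? so that a final comment line without a trailing newline is removed as well.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every line whose first non-blank
# character is "#", operating on the whole text at once.
sub strip_comment_lines {
    my ($text) = @_;
    # /m makes ^ match at the start of each line; /g repeats the
    # substitution for every comment line in the string.
    $text =~ s/^[ \t]*#.*\n?//mg;
    return $text;
}

my $text = "#a\nb\n  # a\n\nc\n# last";
print strip_comment_lines($text);
```

This is handy when the file has already been slurped into one scalar; when reading line by line, the next if /^\s*#/ form shown earlier is usually simpler.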

How can i detect symbols using regular expression in perl?

Please, how can I use a regular expression to check if a word starts or ends with a symbol character, and how can I process the text within the symbols?
Example:
(text) or te-xt, or tex't. or text?
change it to
(<t>text</t>) or <t>te-xt</t>, or <t>tex't</t>. or <t>text</t>?
help me out?
Thanks
I assume that "word" means alphanumeric characters from your example? If you have a list of permitted characters which constitute a valid word, then this is enough:
my $string = "x1 .text1; 'text2 \"text3;\"";
$string =~ s/([a-zA-Z0-9]+)/<t>$1<\/t>/g;
# Add more to character class [a-zA-Z0-9] if needed
print "$string\n";
# OUTPUT: <t>x1</t> .<t>text1</t>; '<t>text2</t> "<t>text3</t>;"
UPDATE
Based on your example you seem to want to DELETE dashes and apostrophes, if you want to delete them globally (e.g. whether they are inside the word or not), before the first regex, you do
$string =~ s/['-]//g;
I am using DVK's approach here, but with a slight modification. The difference is that her/his code would also put the tags around all words that don't contain/are next to a symbol, which (according to the example given in the question) is not desired.
#!/usr/bin/perl
use strict;
use warnings;
sub modify {
my $input = shift;
my $text_char = 'a-zA-Z0-9\-\''; # characters that are considered text
# if there is no symbol, don't change anything
if ($input =~ /^[a-zA-Z0-9]+$/) {
return $input;
}
else {
$input =~ s/([$text_char]+)/<t>$1<\/t>/g;
return $input;
}
}
my $initial_string = "(text) or te-xt, or tex't. or text?";
my $expected_string = "(<t>text</t>) or <t>te-xt</t>, or <t>tex't</t>. or <t>text</t>?";
# version BEFORE edit 1:
#my @aux;
# take the initial string apart and process it one word at a time
#my @string_list = split/\s+/, $initial_string;
#
#foreach my $string (@string_list) {
# $string = modify($string);
# push @aux, $string;
#}
#
# put the string together again
#my $final_string = join(' ', @aux);
# ************ EDIT 1 version ************
my $final_string = join ' ', map { modify($_) } split/\s+/, $initial_string;
if ($final_string eq $expected_string) {
print "it worked\n";
}
This strikes me as a somewhat long-winded way of doing it, but it seemed quicker than drawing up a more sophisticated regex...
EDIT 1: I have incorporated the changes suggested by DVK (using map instead of foreach). Now the syntax highlighting is looking even worse than before; I hope it doesn't obscure anything...
This takes standard input, processes it, and prints the result on standard output.
while (<>) {
s {
( [a-zA-Z]+ ) # word
(?= [,.)?] ) # a symbol
}
{<t>$1</t>}gx ;
print ;
}
You might need to change the character class to match your concept of a word.
I have used the x modifier to allow the regex to be spread over more than one line.
If the input is in a Perl variable, try
$string =~ s{
( [a-zA-Z]+ ) # word
(?= [,.)?] ) # a symbol
}
{<t>$1</t>}gx ;