I don't understand how preg_replace_callback works, how to update from preg_replace - preg-replace

How it will look this using preg_replace_callback?
$str = preg_replace('/\&\#([0-9]+)\;/me', "code2utf('\\1',{$lo})", $str);

If I am not mistaken you want to use preg_replace_callback instead of using the /e modifier.
If you want to pass extra parameters to the callback function you could make use of the use indentifier or wrap the callback in another function.
The second example could look like:
$str = preg_replace_callback(
'/\&\#([0-9]+)\;/m', function ($matches) use ($lo) {
// function body with return statement
}, $str
);
Notes
Your regex \&\#([0-9]+)\; will match a string like 𸽡. I think you don't have to ecape the & and #.
In your code you use return strtoupper($matches[1], $lo); but strtoupper takes one parameter instead of 2 parameters.
If this is what you want to match, then when running your code you could see that $matches[1] contains "233333" so this will be called return strtoupper("233333");

Related

How to do multiple substitution using perl scripts

I have a bunch of scripts I wanted to replace some texts.
Context : We are using selenium for UI Automation. We used to store the references to the UiElements in a map. But we are now moving to use PageFactory (a class with all the UiElements declared as a string)
So when we used map, we had to call the UIelements as objectMap.getIdentifier("navigate.leftsidebar"). But now with PageFactory (its like instantiating a object, and the UIElements are declared as a string), I can access these UIelements objectPageFactory.navigate_leftsidebar (here navigate_leftsidebar is a String)
So I will need to go modify all my existing scripts. Is there a way I can write a script to get this replaced, than doing it manually ?
Below are the 3 scenarios that I will encounter :
Click(getElement(objectMap.getIdentifier("navigate.leftsidebar").replace("$Page", "Inbox")), "clicking on an element");
objectMap.getIdentifier("navigate.leftsidebar")
Click(objectMap.getIdentifier("navigate.leftsidebar"), "clicking on an element");
This is the expected output:
Click((objectPageFactory.navigate_leftsidebar("Inbox")), "clicking on an element");
objectPageFactory.navigate_leftsidebar
Click(objectPageFactory.navigate_leftsidebar, "clicking on an element");
Changes are :
"objectMap" to be renamed as "objectPageFactory"
There could be different types of map. if objectMap , it should be replaced as objectPageFactory; if loginMap, it should be changed as loginPageFactory
objectMap.getIdentifier("navigate.leftsidebar") >>>> objectFactory.navigate_leftsidebar (the String literal inside the bracket is separated by underscore instead of dots
getElement is not needed now
we used to have some dynamic UiElements (navigate.leftsidebar in this case), for which we used to call String.replace, now we are writing functions which will internally do a String.format
getElement(objectMap.getIdentifier("navigate.leftsidebar").replace("$Page", "Inbox")) >>>>> objectPageFactory.navigate_leftsidebar("Inbox")
I got a perl script from this link, which will do partial job sed command to replace dots.
I just need to add the different scenarios to this, is there a way ? the output should now have a pageFactory text too, based on which map
#! /usr/bin/perl
use strict ;
sub fix { $_ = shift ; s/"//g ; s/\./_/g ; return $_ }
while ( <> ) {
s/getElement\(objectMap\.getIdentifier\(("?[a-z.]+"?)\)/fix($1)/e ;
s/objectMap\.getIdentifier\(("?[a-z.]+"?)\)/fix($1)/e ;
print
}
This seems to provide the output you requested. I don't understand the language you're changing, so there might be corner cases it processes wrong. Make a backup before you change the files!
#!/usr/bin/perl
use warnings;
use strict;
sub fix {
my ($id) = #_;
return $id =~ s/[.]/_/gr
}
while (<>) {
s{getElement\((object|login)Map\.getIdentifier\("([^"]*)"\)\.replace\("\$Page", "([^"]*)"\)\)}
{"$1PageFactory." . fix($2) . qq(("$3"))}ge;
s{(object|login)Map\.getIdentifier\("([^"]*)"\)}
{"$1PageFactory." . fix($2)}ge;
print;
}

TreeModelFilter in GTK/Perl - Question about set_visible_func

I am trying to filter a liststore using the GTK2::TreeModelFilter. I can't seem to find an example online that uses perl and I am getting syntax errors. Can someone help me with the syntax below? The $unfiltered_store is a liststore.
$filtered_store = Gtk2::TreeModeFilter->new($unfiltered_store);
$filtered_store->set_visible_func(get_end_products, $unfiltered_store);
$combobox = Gtk2::ComboBoxEntry->new($filtered_store,1);
Then somewhere below:
sub get_end_products {
my ($a, $b) = #_;
warn(Dumper(\$a));
warn(Dumper(\$b));
return true; # Return all rows for now
}
Ultimately what I want to do is look at column 14 of the listore ($unfiltered_store) and if it is a certain value, then it filtered into the $filtered_store.
Can someone help me with the syntax on this? I checked a bunch of sites, but they're in other languages and using different syntax (like 'new_filter' -- doesn't exist with Perl GTK).
This is the most elegant solution to a fix I need to make and I would prefer to learn how to use this rather than using a brute force method of pulling and saving the filtered data.
The set_visible_func method of the filtered store should get a sub reference as the first argument, but you are not passing a sub reference here:
$filtered_store->set_visible_func(get_end_products, $unfiltered_store);
This will instead call the sub routine get_end_products and then pass on its return value (which is not a sub reference). To fix it add the reference operator \& in front of the sub name:
$filtered_store->set_visible_func(\&get_end_products, $unfiltered_store);
Regarding your other question in the comments:
According to the documentation the user data parameter is passed as the third parameter to get_end_products, so you should define it like this:
sub get_end_products {
my ($model, $iter, $user_data) = #_;
# Do something with $user_data
return TRUE;
}
If for some reason $unfiltered_store is not passed on to get_end_products, you can try pass it using an anonymous sub instead, like this:
$filtered_store->set_visible_func(
sub { get_end_products( $unfiltered_store) });

how to find multiple regex patterns in a single way using Perl

Question Updated
I have list of (few more) regex patterns like: (Note: Sequence is very Important)
([a-z]+)(\d+)
\}([a-z]+)
([a-z]+)(\+|\-)
([0-9])\](\+|\-)
...
...
my input file like :
\ce{CO2}
\ce{2CO}
\ce{H2O}
\ce{Sb2O3}
...
...
In my code I am finding the each and every regex patterns like
if($string=~m/([a-z]+)(\d+)/g) { my statements ... }
if($string=~m/\}([a-z]+)/g) { my statements ... }
if($string=~m/([a-z]+)(\+|\-)/g) { my statements ... }
if($string=~m/([0-9])\](\+|\-)/g) { my statements ... }
Instead of doing the above code Is there any other way to simplify the code?
Could you someone please share your thoughts for my improvement for better coding.
Disclaimer: Your question is very hard to read, so this is pretty much guesswork. I am not sure I understand what you want to do.
When you are processing data in a dynamic way, a typical approach is to use a dispatch table. We can do something similar here. Often a hash or hash reference is used for that, but since we want a specific order, I will be using an array instead.
my #dispatch = (
{
pattern => qr/f(o)(o)/,
callback => sub {
my ($one, $two) = #_;
print "Found $one and $two\n";
},
},
{
pattern => qr/(bar)/,
callback => sub {
my $capture = shift;
print "Saw $capture";
},
},
);
This basically is a list of search patterns and associated instructions. Each pattern has a callback, which is a code reference. I decided it would make sense to pass in the capture variables, because your patterns have capture groups.
Now in order to call them, we iterate over the dispatch array, match the pattern and then call the associated callback, passing in all the captures.
my $text = "Foo bar foo bar baz.";
foreach my $search (#dispatch) {
if ($text =~ $search->{pattern}) {
$search->{callback}->(#{^CAPTURE}); # this requires Perl 5.26
}
}
Please note that I am using #{^CAPTURE}, which was added to Perl in version 5.25.7, so you would require at least the stable Perl 5.26 release to use it. (On an older Perl, my #capture = $t =~ $search->{pattern} and $search->{callback}->(#capture) will behave similarly).
This is way more elegant than having a list of if () {} statement because it's very easy to extend. The dispatch table could be created on the fly, based on some input, or entirely read from disk.
When we run this code, it creates the following output
Found o and o
Saw bar
This is not very spectacular, but you should be able to adapt it to your patterns. On the other hand I don't know what you are actually trying to do. If you wanted to modify the string instead of only matching, you might need additional arguments for your callbacks.
If you want to learn more about dispatch tables, I suggest you read the second chapter of Mark Jason Dominus' excellent book Higher Order Perl, which is available for free as a PDF on his website.
Your question is hard to read, mainly because you have the /g at the end of your regex searches (which returns a list), however, you only check if it matches once.
I'm making the following assumptions
All matches are required
the code can be a single or double match
both groups captured in one line
i think you want
while ( $string =~ /(([a-z]+)(\d+)|\}([a-z]+)|([a-z]+)(\+|\-)|([0-9])\](\+|\-))/g )
{
#$1 has the whole match
#$2 has the first group if defined
#$3 has the second group if defined
}
However, I prefer the method below. this will capture in one line
while ($string =~ /([a-z]+\d+|\}[a-z]+|[a-z]+\+|\-|[0-9]\]\+|\-)/g )
{
# in here split the match if required
}
I recommend you use regex comments to make this clearer.
if you just want a single match, use
if(
$string=~m/([a-z]+)(\d+)/ ||
$string=~m/\}([a-z]+)/ ||
$string=~m/([a-z]+)(\+|\-)/ ||
$string=~m/([0-9])\](\+|\-)/
)
{
#some code
}

Can I have a string containing a delegate that is evaluated at runtime?

Can I have a string that contains a delegate that gets expanded at various times during runtime?
$pattern = "(?m)^INFO\:(?:\s|\t)*$({script:$marker})\:(?:\s|\t)*(?<url>.*)$"
$marker = "Some marker value"
:
#Do something with the resulting pattern containing the marker value
:
$marker = "Some other marker value"
:
#Do something with the pattern having the new marker value
and so on... I'd prefer not to have to keep redefining the string... or having a function that builds it. It seems so much more succinct if I could just have a few characters in the string that get evaluated when the string is needed vs. when the $pattern value is set.
you can do
$pattern = {"(?m)^INFO\:(?:\s|\t)*($script:marker)\:(?:\s|\t)*(?<url>.*)$"}
and then later use
$pattern.invoke()
(Assuming you want $script:marker to be the characters that get set later, your original example has $({script:$marker}), but that won't work if it is supposed to do what I think it should ;))
In general: Define the term as Scriptblock using {} and later .invoke() to evaluate it.
Just make sure there is no confusion about the types within the curly brackets, otherwise you might get some strange results...

What does "Useless use of a variable in void context" mean in this Perl script?

The following script gives me what I want but Perl also throws me a warning saying "Useless use of a variable in void context". What does it mean?
use strict;
use warnings;
my $example = 'http\u003a//main\u002egslb\u002eku6\u002ecom/c0/q7LmJPfV4DfXeTYf/1260269522170/93456c39545857a15244971e35fba83a/1279582254980/v632/6/28/a14UAJ0CeSyi3UTEvBUyMuBxg\u002ef4v\u002chttp\u003a//main\u002egslb\u002eku6\u002ecom/c1/q7LmJPfV4DfXeTYf/1260269522170/3cb143612a0050335c0d44077a869fc0/1279582254980/v642/10/20/7xo2MJ4tTtiiTOUjEpCJaByg\u002ef4v\u002chttp\u003a//main\u002egslb\u002eku6\u002ecom/c2/q7LmJPfV4DfXeTYf/1260269522170/799955b45c8c32c955564ff9bc3259ea/1279582254980/v652/32/4/6pzkCf4iqTSUVElUA5A3PpMAoA\u002ef4v\u002chttp\u003a//main\u002egslb\u002eku6\u002ecom/c3/q7LmJPfV4DfXeTYf/1260269522170/cebbb619dc61b3eabcdb839d4c2a4402/1279582254980/v567/36/19/MBcbnWwkSJu46UoYCabpvArA\u002ef4v\u002chttp\u003a//main\u002egslb\u002eku6\u002ecom/c4/q7LmJPfV4DfXeTYf/1260269522170/1365c39355424974dbbe4ae8950f0e73/1279582254980/v575/17/15/EDczAa0GTjuhppapCLFjtaQ\u002ef4v';
my #raw_url = $example =~ m{(http\\u003a.+?f4v)}g;
my #processed_url = map {
s{\\u003a}{:}g,$_;
s{\\u002e}{.}g,$_;
s{\\u002d}{#}g,$_;
} #raw_url;
print join("\n",#processed_url);
And why this map thing doesn't work if I omit those dollar underscores like so?
my #processed_url = map {
s{\\u003a}{:}g;
s{\\u002e}{.}g;
s{\\u002d}{#}g;
} #raw_url;
When I omit those dollar underscores, I get nothing except for a possibly success flag "1". What am I missing? Any ideas? Thanks like always :)
What you want is...
my #processed_url = map {
s{\\u003a}{:}g;
s{\\u002e}{.}g;
s{\\u002d}{#}g;
$_;
} #raw_url;
A map block returns the value composed of the last statement evaluated as its result. Thats why we pass the $_ as the last statement. The substitution operator s{}{} returns the number of substitutions made.
In your prior setup, you had by itself the following statement. Which is pretty much meaningless and that is what Perl is warning about.
s{\\u003a}{:}g, $_;
You already have the answer you were looking for, but I wanted to point out a subtlety about using the substitution operator inside a map block: your original array is also being modified. If you want to preserve the original array, one way to do it is to make a copy of the array, then modify only the copy:
my #processed_url = #raw_url;
for (#processed_url) {
s{\\u003a}{:}g;
s{\\u002e}{.}g;
s{\\u002d}{#}g;
}
Or, if you only need one array, and you want the original to be modified:
for (#raw_url) {
s{\\u003a}{:}g;
s{\\u002e}{.}g;
s{\\u002d}{#}g;
}