preg_replace in PHP - multiple replace

I have $string, which contains:
this example
and I have these 3 expressions:
$pattern = array('/aa*/','/ii*/');
$replacement = array('<i>$0</i>','<b>$0</b>');
preg_replace($pattern, $replacement, $string);
where preg_replace returns:
th<b>i</b>s ex<<b>i</b>>a</<b>i</b>>mple
and I need output like this:
th<b>i</b>s ex<i>a</i>mple
In other words, I want the replacements applied only to characters of the original string, not to text inserted by an earlier replacement. Is that possible?

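This does the trick in my testing. Each pattern also captures the character before the match (start of string, a letter, or whitespace), so the i inside an already inserted <i> tag, which follows <, is left alone:
$pattern = array('/(^|[a-z\s])(aa*)/', '/(^|[a-z\s])(ii*)/');
$replacement = array('$1<i>$2</i>','$1<b>$2</b>');
Note that a match directly after an inserted tag (for example the i in "ai") is still skipped.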

Related

perl extract string and scientific number

I have data in particular format.
capacitor #(.c(3.58782e-14)) c_1310 (vsub, vss_res);
I want to extract the capacitance value and the net names from each line (3.58782e-14, vsub and vss_res in the line above). I tried using regex:
$cap = $line =~ /([0-9]*\.?[0-9]+([eE][-]?[0-9]+)?)/ ;
($net1, $net2) = $line =~ /\(([A-Za-z0-9_]*) \, ([A-Za-z0-9_]*)\)/ ;
$line contains each data line. Need help in getting the regex corrected.
I have a solution using split() function but regex would be better I think.
Assuming that the format of the data is always the same, something like this should work:
my $line = 'capacitor #(.c(3.58782e-14)) c_1310 (vsub, vss_res);';
# $cap is the value, $name the instance name, $nets both net names as one string ('vsub, vss_res')
my ($cap, $name, $nets) = $line =~ /\(.+\((.+)\)\)\s+(.+)\s+\((.+)\)/;
The original post seemed to do some checking and validation (in contrast with matching '.' which matches anything) and I will suggest a more validating version here:
use Modern::Perl;
use Regexp::Common;
my $line = 'capacitor #(.c(3.58782e-14)) c_1310 (vsub, vss_res);';
my ($cap, $cap_no, $net1, $net2) = $line =~ /
\([^(]+\( ($RE{num}{real}) \)\)
\s+(\w+)\s+
\(
(\w*) ,\s*
(\w*)
\)
/x;
say "cap: $cap cap_no: $cap_no net1: $net1 net2: $net2";
OUTPUT:
cap: 3.58782e-14 cap_no: c_1310 net1: vsub net2: vss_res
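If installing Regexp::Common is not an option, roughly the same validation can be sketched with core Perl only. The number pattern $num below is a hand-written assumption about what the values look like (optional sign, digits, optional fraction and exponent). It is not the definition used by $RE{num}{real}.

```perl
use strict;
use warnings;

my $line = 'capacitor #(.c(3.58782e-14)) c_1310 (vsub, vss_res);';

# Hand-written number pattern (an assumption about the input format):
# optional sign, digits with optional fraction, optional exponent.
my $num = qr/[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?/;

my ($cap, $cap_no, $net1, $net2) = $line =~ /
    \( \.c \( ($num) \) \)       # the value inside .c(...)
    \s+ (\w+) \s+                # instance name
    \( (\w+) \s*,\s* (\w+) \)    # the two nets
/x;

print "cap: $cap cap_no: $cap_no net1: $net1 net2: $net2\n";
```

On a line that does not match, all four variables are undefined, so real code should check the result of the match before using them.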

Read string within double quotes in perl

I have a file which contains lines like this:
"pin1" Inpin; "pin2" outpin; "pin3" inoutPin;
some other string "pin4" inpin
I want to store just pin1, pin2, pin3, pin4 (basically words within double quotes). Can someone please help..? Basically read one line at a time and grab the word within double quotes only. I tried to split a line by ";" but it doesn't work since ";" may not be present in all lines.
thanks!
join ', ', $str =~ /"([^"]*)"/g
Try this code:
my $string = '"pin1" Inpin; "pin2" outpin; "pin3" inoutPin;';
my @array;
while ( $string =~ /"(.+?)"/ )
{
    my $pin = $1;
    push(@array, $pin);
    # remove the matched text so the next iteration finds the next pin
    $string =~ s/"\Q$pin\E"//;
}
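Since the question asks to read one line at a time, here is a minimal sketch of the list-context match applied per line. The sample lines are taken from the question and held in a string (the variable $input is only for illustration) so the example runs without an external file.

```perl
use strict;
use warnings;

# Sample input from the question, held in a string so the sketch is self-contained.
my $input = <<'END';
"pin1" Inpin; "pin2" outpin; "pin3" inoutPin;
some other string "pin4" inpin
END

open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my @pins;
while ( my $line = <$fh> ) {
    # In list context, /g returns every captured group found on the line.
    push @pins, $line =~ /"([^"]*)"/g;
}
close $fh;

print join( ', ', @pins ), "\n";    # pin1, pin2, pin3, pin4
```

With a real file, the in-memory handle would be replaced by an ordinary open on the file name.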

Better way to extract elements from a line using perl?

I want to extract some elements from each line of a file.
Below is the line:
# 1150 Reading location 09ef38 data = 00b5eda4
I would like to extract the address 09ef38 and the data 00b5eda4 from this line.
The way I use is the simple one like below:
while($line = <INFILE>) {
if ($line =~ /\#\s*(\S+)\s*(\S+)\s*(\S+)\s*(\S+)\s*(\S+)\s*=\s*(\S+)/) {
$time = $1;
$address = $4;
$data = $6;
printf(OUTFILE "%s,%s,%s \n",$time,$address,$data);
}
}
I am wondering whether there is a better way to do this, something easier and cleaner?
Thanks a lot!
TCGG
Another option is to split the string on whitespace:
my ($time, $addr, $data) = (split / +/, $line)[1, 4, 7];
You could use matching with a list on the left-hand side, something like this:
echo '# 1150 Reading location 09ef38 data = 00b5eda4' |
perl -ne '
$,="\n";
($time, $addr, $data) = /#\s+(\w+).*?location\s+(\w+).*?data\s*=\s*(\w+)/;
print $time, $addr, $data'
Output:
1150
09ef38
00b5eda4
In Python, a suitable regex would be:
'[0-9]+[a-zA-Z ]*([0-9]+[a-z]+[0-9]+)[a-zA-Z ]*= ([0-9a-zA-Z]+)'
But I don't know exactly how to write it in Perl; you may need to look that up. If you need an explanation of this regex, I can edit this post with a more precise description.
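The pattern above uses only constructs that Perl's regex syntax shares with Python, so as a minimal sketch it can be used unchanged inside a Perl match. The variable names are chosen here for illustration.

```perl
use strict;
use warnings;

my $line = '# 1150 Reading location 09ef38 data = 00b5eda4';

# Same pattern as the Python version; group 1 is the address, group 2 the data.
my ($addr, $data) =
    $line =~ /[0-9]+[a-zA-Z ]*([0-9]+[a-z]+[0-9]+)[a-zA-Z ]*= ([0-9a-zA-Z]+)/;

print "$addr,$data\n";    # 09ef38,00b5eda4
```

Note that the first group requires digits, then lowercase letters, then digits, so an address made only of digits (or one ending in a letter) would not match. A looser class such as [0-9a-f]+ may be a safer choice for hexadecimal values.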
I find it convenient to just split on one or more whitespace characters of any kind, using \s+. This way you won't have any problems if the input string has tab characters in it instead of spaces.
while($line = <INFILE>)
{
my ($time, $addr, $data) = (split /\s+/, $line)[1, 4, 7];
}
When splitting on any kind of whitespace, the newline at the end of the line is treated as a separator too. That does not leave an empty element at the end of the list, because split drops trailing empty fields by default, so there is no need to chomp the line first.
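A quick way to check how split behaves at the ends of a line is to count the fields it returns. The sketch below uses the line from the question plus a made-up indented string ($indented) to show the leading-whitespace case.

```perl
use strict;
use warnings;

my $line = "# 1150 Reading location 09ef38 data = 00b5eda4\n";

my @fields = split /\s+/, $line;
print scalar(@fields), "\n";    # 8
print "[$fields[-1]]\n";        # [00b5eda4]

# Leading whitespace is different: /\s+/ yields an empty first field,
# while the special pattern ' ' skips leading whitespace.
my $indented = "  a b\n";
my @by_regex = split /\s+/, $indented;    # ('', 'a', 'b')
my @by_space = split ' ',   $indented;    # ('a', 'b')
print scalar(@by_regex), " ", scalar(@by_space), "\n";    # 3 2
```

If the input lines may start with spaces or tabs, split ' ' keeps the field indices stable.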

Perl split() Function Not Handling Pipe Character Saved As A Variable

I'm running into a little trouble with Perl's built-in split function. I'm creating a script that edits the first line of a CSV file which uses a pipe for column delimitation. Below is the first line:
KEY|H1|H2|H3
However, when I run the script, here is the output I receive:
Col1|Col2|Col3|Col4|Col5|Col6|Col7|Col8|Col9|Col10|Col11|Col12|Col13|
I have a feeling that Perl doesn't like the fact that I use a variable to actually do the split, and in this case, the variable is a pipe. When I replace the variable with an actual pipe, it works perfectly as intended. How could I go about splitting the line properly when using pipe delimitation, even when passing in a variable? Also, as a silly caveat, I don't have permissions to install an external module from CPAN, so I have to stick with built-in functions and modules.
For context, here is the necessary part of my script:
our $opt_h;
our $opt_f;
our $opt_d;
# Get user input - filename and delimiter
getopts("f:d:h");
if (defined($opt_h)) {
&print_help;
exit 0;
}
if (!defined($opt_f)) {
$opt_f = &promptUser("Enter the Source file, for example /qa/data/testdata/prod.csv");
}
if (!defined($opt_d)) {
$opt_d = "\|";
}
my $delimiter = "\|";
my $temp_file = $opt_f;
my @temp_file = split(/\./, $temp_file);
$temp_file = $temp_file[0]."_add-headers.".$temp_file[1];
open(source_file, "<", $opt_f) or die "Err opening $opt_f: $!";
open(temp_file, ">", $temp_file) or die "Error opening $temp_file: $!";
my $source_header = <source_file>;
my @source_header_columns = split(/${delimiter}/, $source_header);
chomp(@source_header_columns);
for (my $i=1; $i<=scalar(@source_header_columns); $i++) {
print temp_file "Col$i";
print temp_file "$delimiter";
}
print temp_file "\n";
while (my $line = <source_file>) {
print temp_file "$line";
}
close(source_file);
close(temp_file);
The first argument to split is a compiled regular expression or a regular expression pattern. If you want to split on the text |, you'll need to pass a pattern that matches |.
quotemeta creates a pattern from a string that matches that string.
my $delimiter = '|';
my $delimiter_pat = quotemeta($delimiter);
split $delimiter_pat
Alternatively, quotemeta can be accessed as \Q..\E inside double-quoted strings and the like.
my $delimiter = '|';
split /\Q$delimiter\E/
The \E can even be omitted if it's at the end.
my $delimiter = '|';
split /\Q$delimiter/
I mentioned that split also accepts a compiled regular expression.
my $delimiter = '|';
my $delimiter_re = qr/\Q$delimiter/;
split $delimiter_re
If you don't mind hardcoding the regular expression, that's the same as
my $delimiter_re = qr/\|/;
split $delimiter_re
First, the | isn't special inside double quotes, so "\|" is just "|". Setting $delimiter to "|" and then making sure it is quoted later would work, or setting $delimiter to "\\|" would be OK by itself.
Second, the | is special inside a regex, so you want to quote it there. The safest way to do that is to ask Perl to quote your data for you. Use the \Q...\E construct within the regex to mark out the data you want quoted.
my @source_header_columns = split(/\Q${delimiter}\E/, $source_header);
see: http://perldoc.perl.org/perlre.html
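The two points above can be checked with a short sketch. It uses the header line from the question; the array names @wrong and @right are chosen here for illustration.

```perl
use strict;
use warnings;

my $header = "KEY|H1|H2|H3\n";
chomp $header;

# Inside double quotes, "\|" is just "|": the backslash is dropped.
my $delimiter = "\|";
print length($delimiter), "\n";    # 1

# Unquoted, the pattern is /|/, an alternation of two empty branches.
# It matches the empty string, so split returns every character separately.
my @wrong = split /$delimiter/, $header;
print scalar(@wrong), "\n";        # 12

# With \Q...\E the pipe is matched literally.
my @right = split /\Q$delimiter\E/, $header;
print join(',', @right), "\n";     # KEY,H1,H2,H3
```

This also explains the output in the question: one column per character of the header line, plus one for the newline that was still attached when split ran.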
It seems that all you want to do is count the fields in the header and print the header. Might I suggest something a bit simpler than using split?
my $str="KEY|H1|H2|H3";
my $count=0;
$str =~ s/\w+/"Col" . ++$count/eg;
print "$str\n";
This works with almost any delimiter (except alphanumeric characters and underscore). It also saves the number of fields in $count, in case you need it later.
Here's another version. This one uses character class brackets instead, to specify "any character but this", which is just another way of defining a delimiter. You can specify the delimiter on the command line. You can use your getopts as well, but I just used a simple shift.
my $d = shift || '[^|]';
if ( $d !~ /^\[/ ) {
$d = '[^' . $d . ']';
}
my $str="KEY|H1|H2|H3";
my $count=0;
$str =~ s/$d+/"Col" . ++$count/eg;
print "$str\n";
By using the brackets, you do not need to worry about escaping most metacharacters (], ^, - and \ are still special inside a character class).
#!/usr/bin/perl
use Data::Dumper;
use strict;
my $delimeter="\\|";
my $string="A|B|C|DD|E";
my @arr=split(/$delimeter/,$string);
print Dumper(@arr)."\n";
output:
$VAR1 = 'A';
$VAR2 = 'B';
$VAR3 = 'C';
$VAR4 = 'DD';
$VAR5 = 'E';
It seems you need to define the delimiter as "\\|".

In Perl, how can I parse a string that might contain many email addresses to get a list of addresses?

I want to split a string wherever it contains ; or ,.
For example:
$str = "a@a.com;b@b.com,c@c.com;d@d.com;";
The expected result is:
result[0]="a@a.com";
result[1]="b@b.com";
result[2]="c@c.com";
result[3]="d@d.com";
Sure, you can use split as shown by others. However, if $str contains full blown email addresses, you will be in a world of hurt.
Instead, use Email::Address:
#!/usr/bin/perl
use strict; use warnings;
use Email::Address;
use YAML;
print Dump [ map [$_->name, $_->address ],
Email::Address->parse(
q{a@a.com;"Tester, Test" <test@example.com>,c@c.com;d@d.com}
)
];
Output:
---
-
  - a
  - a@a.com
-
  - 'Tester, Test'
  - test@example.com
-
  - c
  - c@c.com
-
  - d
  - d@d.com
my $str = 'a@a.com;b@b.com,c@c.com;d@d.com;';
my @result = split /[,;]/, $str;
Note that you can't use double-quotes to assign $str because @ is special. That's why I replaced the string delimiters with a single-quote. You could also escape them like so:
my $str = "a\@a.com;b\@b.com,c\@c.com;d\@d.com;";
split(/[,;]/, $str)
You could also use Text::CSV with either ";" or "," as the separator. It handles other things, such as quoted fields and non-printable characters, as well.
To answer the question in the title of the post (which differs a little from its text):
my $str = 'abc@xyz;qwe@rty;';
my @addrs = ($str =~ m/(\w+\@[\w\.]+)/g);
print join("<->", @addrs);
To split by ";" or ","
$test = "abc;def,hij";
@result = split(/[;,]/, $test);
Here the regex is a character class that matches either a ; or a , character; neither needs escaping.
The end result will be that @result = ('abc', 'def', 'hij')
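Real input often has stray spaces or doubled separators. As a minimal sketch, the same character class can absorb surrounding whitespace, and grep can drop empty fields. The sample string below is made up for illustration, and this naive split still breaks on quoted display names that contain a comma.

```perl
use strict;
use warnings;

# Made-up sample with extra spaces and a doubled separator.
my $str = 'a@a.com; b@b.com ,c@c.com;;d@d.com;';

# Split on either separator, allowing optional whitespace around it,
# then drop any empty fields left by doubled separators.
my @result = grep { length } split /\s*[;,]\s*/, $str;

print join( '|', @result ), "\n";    # a@a.com|b@b.com|c@c.com|d@d.com
```

The trailing ; produces no extra element because split discards trailing empty fields by default.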