preg_replace ( - ) between 2 constant texts - preg-replace

How can I use preg_replace to put a comma between two constant texts? I have:
blablabalba","color:Metal Black - White;sex:blablabal"
I want it to become:
blablabalba","Metal Black,White"

I'm not sure I fully understand what you need, but is this what you want?
$str = 'blablabalba","color:Metal Black - White;sex:blablabal"';
$str = preg_replace('/"color:([^-]+) - (\w+)[^"]+"/u', '"$1,$2"', $str);

Related

Perl: break down a string, with some unique constraints

I'm using Perl to feed data to an LCD display. The display is 8 characters wide. The strings of data to be displayed are always significantly longer than 8 characters. As such, I need to break the strings down into "frames" of 8 characters or less, and feed the "frames" to the display one at a time.
The display is not intelligent enough to do this on its own. The only convenience it offers is that strings of less than 8 characters are automatically centered on the display.
In the beginning, I simply sent the string 8 characters at a time - here goes 1-8, now 9-16, now 17-24, etc. But that wasn't especially nice-looking. I'd like to do something better, but I'm not sure how best to approach it.
These are the constraints I'd like to implement:
Fit as many words into a "frame" as possible
No starting/trailing space(s) in a "frame"
Symbol (ie. hyphen, ampersand, etc) with a space on both sides qualifies as a word
If a word is longer than 8 characters, simulate per-character scrolling
Break words longer than 8 characters at a slash or hyphen
Some hypothetical input strings, and desired output for each...
Electric Light Orchestra - Sweet Talkin' Woman
Electric
Light
Orchestr
rchestra
- Sweet
Talkin'
Woman
Quarterflash - Harden My Heart
Quarterf
uarterfl
arterfla
rterflas
terflash
- Harden
My Heart
Steve Miller Band - Fly Like An Eagle
Steve
Miller
Band -
Fly Like
An Eagle
Hall & Oates - Did It In A Minute
Hall &
Oates -
Did It
In A
Minute
Bachman-Turner Overdrive - You Ain't Seen Nothing Yet
Bachman-
Turner
Overdriv
verdrive
- You
Ain't
Seen
Nothing
Yet
Being a relative Perl newbie, I'm trying to picture how would be best to handle this. Certainly I could split the string into an array of individual words. From there, perhaps I could loop through the array, counting the letters in each subsequent word to build the 8-character "frames". Upon encountering a word longer than 8 characters, I could then repetitively call substr on that word (with offset +1 each time), creating the illusion of scrolling.
Is this a reasonable way to accomplish my goal? Or am I reinventing the wheel here? How would you do it?
The base question is to find all consecutive overlapping N-long substrings in a compact way.
Here it is in one pass with a regex, and see the end for doing it using substr.
my $str = join '', "a".."k"; # 'Quarterflash';
my @eights = $str =~ /(?=(.{8}))/g;
This uses a lookahead which also captures, and in this way the regex crawls up the string character by character, capturing the "next" eight each time.
While we are at it, here is also a basic solution for the problem. Add words to a buffer until the next word would push it past 8 characters, at which point the buffer is added to an array of display-ready strings and cleared.
use warnings;
use strict;
use feature 'say';
my $str = shift // "Quarterflash - Harden My Heart";
my @words = split ' ', $str;
my @to_display;
my $buf = '';
foreach my $w (@words) {
    if (length($w) > 8) {
        # Flush the buffer first (if it holds anything), then scroll the long word
        push @to_display, $buf if length $buf;
        $buf = '';
        push @to_display, $w =~ /(?=(.{8}))/g;
    }
    elsif ( length($buf) and length($buf) + 1 + length($w) > 8 ) {
        # The word does not fit: emit the buffer and start a new one
        push @to_display, $buf;
        $buf = $w;
    }
    elsif (length $buf) { $buf .= ' ' . $w }
    else                { $buf = $w }
}
push @to_display, $buf if length $buf;
say for @to_display;
This is clearly missing some special/edge cases, in particular those involving non-word characters and hyphenated words, but that shouldn't be too difficult to add.†
Here is a way to get all consecutive 8-long substrings using substr
my @to_display = map { substr $str, $_, 8 } 0..length($str)-8;
† Example, break a word with hyphen/slash when it has no spaces around it (per question)
my @parts = split m{\s+|(?<=\S)[-/](?=\S)}, $w;
The hyphen/slash is discarded as this stands; that can be changed by capturing the pattern as well and then filtering out elements with only spaces
my @parts = grep { /\S/ } split m{( \s+ | (?<=\S) [-/] (?=\S) )}x, $w;
These have barely been tested. Either can go in the if (length $w > 8) branch.
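As a rough sketch of how the pieces above could fit together, the following packs words into frames, breaks long words after a hyphen or slash, and scrolls whatever is still too long. The helper names scroll and frames are made up for illustration, and keeping the hyphen attached to the left-hand part is an assumption based on the "Bachman-" frame in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: every overlapping window of $n characters in $word,
# or the word itself when it already fits.
sub scroll {
    my ($word, $n) = @_;
    return $word if length($word) <= $n;
    return map { substr $word, $_, $n } 0 .. length($word) - $n;
}

# Hypothetical helper: pack the words of $str into frames of at most $n characters.
sub frames {
    my ($str, $n) = @_;
    my @out;
    my $buf = '';
    for my $w (split ' ', $str) {
        if (length($w) > $n) {
            # Flush pending words, then break the long word after each
            # hyphen or slash (assumption: the symbol stays on the left part).
            push @out, $buf if length $buf;
            $buf = '';
            push @out, scroll($_, $n) for split m{(?<=[-/])(?=\S)}, $w;
        }
        elsif (length($buf) && length($buf) + 1 + length($w) > $n) {
            # The word does not fit: emit the buffer and start a new one.
            push @out, $buf;
            $buf = $w;
        }
        else {
            $buf = length($buf) ? "$buf $w" : $w;
        }
    }
    push @out, $buf if length $buf;
    return @out;
}

print "$_\n" for frames("Bachman-Turner Overdrive - You Ain't Seen Nothing Yet", 8);
```

Each part of a broken-up long word becomes its own frame here; packing short parts together with neighbouring words would need extra bookkeeping.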
The initial take: the regex was originally written with a two-part pattern. I'm keeping it here for the record and as an example of the pair-handling functions from List::Util.
The regex below matches and captures a character, followed by a lookahead for the next seven, which it also captures. This way the engine captures 1 and 7-long substrings as it moves along char by char. Then the consecutive pairs from the returned list are joined
my $str = join '', "a".."k"; # 'Quarterflash';
use List::Util qw(pairmap);
my @eights = pairmap { $a . $b } $str =~ /(. (?=(.{7})) )/gx;
# or
# use List::Util qw(pairs);
# my @eights = map { join '', @$_ } pairs $str =~ /(.(?=(.{7})))/g;

Finding index of white space in Perl

I'm trying to find the index of white space in a string in Perl.
For example, if I have the string
stuff/more stuffhere
I'd like to select the word "more" with a substring method. I can find the index of "/" but haven't figured out how to find the index of white space. The length of the substring I'm trying to select will vary, so I can't hard code the index. There will only be one white space in the string (other than those after the end of the string).
Also, if anybody has any better ideas of how to do this, I'd appreciate hearing them. I'm fairly new to programming so I'm open to advice. Thanks.
Just use index:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my $string = 'stuff/more stuffhere';
my $index_of_slash = index $string, '/';
my $index_of_space = index $string, ' ';
say "Between $index_of_slash and $index_of_space.";
The output is
Between 5 and 10.
Which is correct:
0         1
01234567890123456789
stuff/more stuffhere
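With both offsets in hand, substr can pull out the word the question asks for. This is a minimal sketch; the helper name word_after_slash and its fallback of taking the rest of the string when there is no space are assumptions, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: the text between the first '/' and the first space after it.
sub word_after_slash {
    my ($string) = @_;
    my $slash = index $string, '/';
    return if $slash < 0;                          # no slash at all
    my $space = index $string, ' ', $slash + 1;    # start searching after the slash
    $space = length $string if $space < 0;         # no space: take the rest
    return substr $string, $slash + 1, $space - $slash - 1;
}

print word_after_slash('stuff/more stuffhere'), "\n";    # more
```

The third argument to index is the position to start searching from, which keeps a space before the slash from being picked up.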
If by "whitespace" you also mean tabs or whatever, you can use a regular expression match and the special variables @- and @+:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my $string = "stuff/more\tstuffhere";
if ($string =~ m{/.*(?=\s)}) {
say "Between $-[0] and $+[0]";
}
The (?=\s) means is followed by a whitespace character, but the character itself is not part of the match, so you don't need to do any maths on the returned values.
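If only the offset of the whitespace itself is wanted, matching \s on its own and reading $-[0] is enough. A small sketch, with a made-up helper name and an assumed return value of -1 for "not found" to mirror index:

```perl
use strict;
use warnings;

# Hypothetical helper: offset of the first whitespace character, or -1 if none.
sub first_whitespace_index {
    my ($string) = @_;
    return $string =~ /\s/ ? $-[0] : -1;    # $-[0] is the start of the match
}

print first_whitespace_index("stuff/more\tstuffhere"), "\n";    # 10
```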
As you stated, you want to select the word between the first /
and the first space following it.
If this is the case, you maybe don't need any index (you need just
the word).
A perfect tool to find something in a text is regex.
Look at the following code:
$txt = 'stuff/more stuffxx here';
if ($txt =~ /\/(.+?) /) {
print "Match: $1.\n";
}
The regex used tries to match:
a slash,
a non-empty sequence of any chars (note ? - reluctant
version), enclosed in a capturing group,
a space.
So after the match $1 contains what was captured by the first
capturing group, i.e. "your" word.
But if for any reason you were interested in starting and ending
offsets to this word, you can read them from $-[1]
and $+[1] (starting / ending indices of the first capturing group).
The arrays @- (@LAST_MATCH_START) and @+ (@LAST_MATCH_END) give the offsets of the start and end of the last successful submatches. See "Regex related variables" in perlvar.
You can capture your real target, and then read off the offset right after it with $+[1]
@+
This array holds the offsets of the ends of the last successful submatches in the currently active dynamic scope. $+[0] is the offset into the string of the end of the entire match. This is the same value as what the pos function returns when called on the variable that was matched against.
Here $+[1] is the end offset of the first capture group, which is what we want, since $+[0] would also include the trailing \s.
Example
my $str = 'target and target with spaces';
while ($str =~ /(target)\s/g)
{
    say "Position after match: $+[1]"
}
prints
Position after match: 6
Position after match: 17
These are the positions right after 'target', i.e. of the spaces that come after it.
Or you can capture \s instead and use $-[1] (the start of that capture, which is the space itself).
You can use
my $str = "stuff/more stuffhere";
if ($str =~ m{/\K\S+}) {
... substr($str, $-[0], $+[0] - $-[0]) ...
}
But why substr? That's very weird there. Maybe if you told us what you actually wanted to do, we could provide better alternatives. Here are three cases:
Data extraction:
my $str = "stuff/more stuffhere";
if ( my ($word) = $str =~ m{/(\S+)} ) {
say $word; # more
}
Data replacement:
my $str = "stuff/more stuffhere";
$str =~ s{/\K\S+}{REPLACED};
say $str; # stuff/REPLACED stuffhere
Data replacement (dynamic):
my $str = "stuff/more stuffhere";
$str =~ s{/\K(\S+)}{ uc($1) }e;
say $str; # stuff/MORE stuffhere

Split Variable on white space [duplicate]

I'm trying to split a string into an array, with the split occurring at the white space. Each block of text is separated by a variable number of spaces.
Here is the string:
NUM8 host01 1,099,849,993 1,099,849,992 1
I have tried the following without success.
my @array1 = split / /, $VAR1;
my @array1 = split / +/, $VAR1;
my @array1 = split /\s/, $VAR1;
my @array1 = split /\s+/, $VAR1;
I'd like to end up with:
$array1[0] = NUM8
$array1[1] = host01
$array1[2] = 1,099,849,993
$array1[3] = 1,099,849,992
$array1[4] = 1
What is the best way to split this?
If the first argument to split is the string ' ' (the space), it is special. It should match whitespace of any size:
my @array1 = split ' ', $VAR1;
(BTW, it is almost equivalent to your last option, but it also removes any leading whitespace.)
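The difference only shows up when the line starts with whitespace. A quick sketch, using a made-up sample line with leading spaces:

```perl
use strict;
use warnings;

# Made-up sample: note the leading spaces.
my $line = "   NUM8   host01  1,099,849,993";

my @awk_style = split ' ',   $line;    # leading whitespace is dropped first
my @regex     = split /\s+/, $line;    # leading whitespace yields an empty first field

print scalar(@awk_style), "\n";    # 3
print scalar(@regex), "\n";        # 4
```

With /\s+/ the first element is an empty string, so the real data starts at index 1 instead of 0.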
Just try using:
my @array1 = split(' ',$VAR1);
From Perldoc:
As another special case, split emulates the default behavior of the
command line tool awk when the PATTERN is either omitted or a literal
string composed of a single space character (such as ' ' or "\x20" ,
but not e.g. / / ). In this case, any leading whitespace in EXPR is
removed before splitting occurs.
\s+ matches one or more whitespace characters, so you can split on that:
my @array1 = split /\s+/, $VAR1;

Not able to split a string in perl - getting unmatched ( in regex; marked by (-- Here in m/ error

The below is my code
$var = ' "jjjjjjjj&Q_30006_47=540IT%20(540%2FOR%2FHPSC%2FD%2F02%2F11&Q_30006_4=&Q_30006_6=12&Q_30006_7=&Q_30006_" &';
($temp1,$temp2) = split($var,"&");
print $temp1;
I need to get
$temp1 = "jjjjjjjj
and
$temp2 as the remaining part of the string after the first &.
I am getting error because of the '(' in the string.
Can anyone please advise on how to split this.
Thanks!!
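I think that you have the parameter order wrong. split treats its first argument as the pattern, so $var is being compiled as a regex and its '(' is read as an unmatched group. The pattern should be first: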
($temp1,$temp2) = split("&", $var);
However, that will split on all & characters. You probably are looking for this (the 2 is the limit):
($temp1,$temp2) = split("&", $var, 2);
($temp1,$temp2) = split '&', $var, 2;
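A short sketch of both points, using a made-up string that contains an unbalanced '(' like the one in the question. The \Q...\E part is an addition for the case where the delimiter itself comes from a variable and must be taken literally.

```perl
use strict;
use warnings;

my $var = 'name=a(b&x=1&y=2';    # made-up sample; note the unbalanced '('

# Pattern first, string second; the limit 2 stops after the first '&'.
my ($head, $rest) = split /&/, $var, 2;
print "$head\n";    # name=a(b
print "$rest\n";    # x=1&y=2

# A delimiter held in a variable is quoted with \Q...\E so that
# regex metacharacters such as '(' are matched literally.
my $delim = '(';
my ($before, $after) = split /\Q$delim\E/, $var, 2;
print "$before\n";    # name=a
print "$after\n";     # b&x=1&y=2
```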

In Perl, how can I parse a string that might contain many email addresses to get a list of addresses?

I want to split a string if it contains ; or ,.
For example:
$str = "a@a.com;b@b.com,c@c.com;d@d.com;";
The expected result is:
result[0]="a@a.com";
result[1]="b@b.com";
result[2]="c@c.com";
result[3]="d@d.com";
Sure, you can use split as shown by others. However, if $str contains full blown email addresses, you will be in a world of hurt.
Instead, use Email::Address:
#!/usr/bin/perl
use strict; use warnings;
use Email::Address;
use YAML;
print Dump [ map [$_->name, $_->address ],
    Email::Address->parse(
        q{a@a.com;"Tester, Test" <test@example.com>,c@c.com;d@d.com}
    )
];
Output:
---
-
  - a
  - a@a.com
-
  - 'Tester, Test'
  - test@example.com
-
  - c
  - c@c.com
-
  - d
  - d@d.com
my $str = 'a@a.com;b@b.com,c@c.com;d@d.com;';
my @result = split /[,;]/, $str;
Note that you can't use double quotes to assign $str, because @ is special inside them (it starts an array interpolation). That's why I replaced the string delimiters with single quotes. You could also escape the @ signs like so:
my $str = "a\@a.com;b\@b.com,c\@c.com;d\@d.com;";
split(/[,;]/, $str)
You could also use Text::CSV and use either ";" or "," for splitting. It helps to look at other things, like printable characters, as well.
To answer the question in the title of the post (which is a little different from its text):
my $str = 'abc@xyz;qwe@rty;';
my @addrs = ($str =~ m/(\w+\@[\w\.]+)/g);
print join("<->", @addrs);
To split by ";" or ","
$test = "abc;def,hij";
@result = split(/[;,]/, $test);
Here the regex is a character class that matches either a ; or a , character (neither needs escaping).
The end result will be that @result = ('abc', 'def', 'hij')
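Real input often has stray spaces or doubled separators. The sketch below is one way to cope with that, assuming the addresses themselves contain no commas or semicolons (if they can, a proper parser is the safer route); the sample string is made up.

```perl
use strict;
use warnings;

# Single quotes keep the '@' signs literal.
my $str = 'a@a.com; b@b.com ,c@c.com;;d@d.com;';

# Split on either separator, allowing optional whitespace around it,
# then drop the empty fields produced by doubled separators.
my @result = grep { length $_ } split /\s*[;,]\s*/, $str;

print join('|', @result), "\n";    # a@a.com|b@b.com|c@c.com|d@d.com
```

split already discards trailing empty fields, so the final ";" needs no special handling; the grep only matters for empty fields in the middle.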