How to combine two regex patterns in Perl?

I want to combine two regex patterns to split a string and get an array of integers.
Here is an example:
$string= "1..1188,1189..14,14..15";
$first_pattern = /\../;
$second_pattern = /\,/;
I want to get an array like this:
[1,1188,1189,14,14,15]

Use | to connect alternatives. Also, use qr// to create regex objects; a plain /.../ matches against $_ and assigns the result of that match to $first_pattern and $second_pattern.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my $string = '1..1188,1189..14,14..15';
my $first_pattern = qr/\.\./;
my $second_pattern = qr/,/;
my @integers = split /$first_pattern|$second_pattern/, $string;
say for @integers;
You probably need \.\. to match two dots, as \.. matches a dot followed by anything but a newline. Also, there's no need to backslash a comma.
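If there are more than two patterns, the same idea extends to a list. Here is a minimal sketch, assuming the patterns are kept in an array; the names @patterns and $combined are made up for illustration and are not from the question:

```perl
use strict;
use warnings;

my $string = '1..1188,1189..14,14..15';

# Hypothetical list of separator patterns; each one is a compiled regex.
my @patterns = (qr/\.\./, qr/,/);

# Joining the compiled regexes with | builds one alternation.
my $combined = join '|', @patterns;

my @integers = split /$combined/, $string;
print "@integers\n";
```

Each qr// object keeps its own flags when it is interpolated, so patterns with different modifiers can be mixed in the same list.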

Related

Perl - Convert integer to text Char(1,2,3,4,5,6)

I am after some help converting the following log to plain text.
This is a URL, so there may be %20 (a space) and other escapes, but the main bit I am trying to convert is the char(1,2,3,4,5,6) to text.
Below is an example of what I am trying to convert.
select%20char(45,120,49,45,81,45),char(45,120,50,45,81,45),char(45,120,51,45,81,45)
What I have tried so far is the following, trying to pass what is inside char(...) to chr($2) for conversion:
perl -pe "s/(char())/chr($2)/ge"
All this has managed to do is remove the char, but I am still trying to convert the numbers to text and remove the commas and brackets.
I may be way off with how I am doing this, as I am fairly new to Perl.
perl -pe "s/word to remove/word to change it to/ge"
"s/(char(what goes in here))/chr($2)/ge"
The output I am trying to achieve is
select -x1-Q-,-x2-Q-,-x3-Q-
Or
select%20-x1-Q-,-x2-Q-,-x3-Q-
Thanks for any help
There's too much to do here for a reasonable one-liner. Also, a script is easier to adjust later.
use warnings;
use strict;
use feature 'say';
use URI::Escape 'uri_unescape';
my $string = q{select%20}
. q{char(45,120,49,45,81,45),char(45,120,50,45,81,45),}
. q{char(45,120,51,45,81,45)};
my $new_string = uri_unescape($string); # convert %20 and such
my @parts = $new_string =~ /(.*?)(char.*)/;
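# split into the individual char(...) calls, then turn each call's numbers into characters
$parts[1] = join ',', map { join '', map { chr($_) } /([0-9]+)/g } split /\),/, $parts[1];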
$new_string = join '', @parts;
say $new_string;
this prints
select -x1-Q-,-x2-Q-,-x3-Q-
Comments
Module URI::Escape is used to convert percent-encoded characters, per RFC 3986
It is unspecified whether anything can follow the part with char(...)s, and what that might be. If there can be more after the last char(...), adjust the splitting into @parts, or clarify the question
In the part with char(...)s only the numbers are needed, and that is what the regex in map extracts
If you are going to use regex you should read up on it. See
perlretut, a tutorial
perlrequick, a quick-start introduction
perlre, the full account of syntax
perlreref, a quick reference (its See Also section is useful on its own)
Alright, this is going to be a messy "one-liner". I'm assuming your text is in a variable called $text.
$text =~ s{char\( ( (?: (?:\d+,)* \d+ )? ) \)}{
my @arr = split /,/, $1;
my $temp = join('', map { chr($_) } @arr);
$temp =~ s/^|$/"/g;
$temp
}xeg;
The regular expression matches char(, followed by a comma-separated list of sequences of digits, followed by ). We capture the whole list of numbers in capture group $1. In the substitution, we split $1 on the comma (since chr converts one number at a time, not a whole list of them). Then we map chr over each number and concatenate the results into a string. The next line simply puts quotation marks at the start and end of the string (presumably you want the output quoted), and the last line returns the new string.
Input:
select%20char(45,120,49,45,81,45),char(45,120,50,45,81,45),char(45,120,51,45,81,45)
Output:
select%20"-x1-Q-","-x2-Q-","-x3-Q-"
If you want to replace the % escape sequences as well, I suggest doing that in a separate line. Trying to integrate both substitutions into one statement is going to get very hairy.
This will do as you ask. It performs the decoding in two stages: first the URI-encoding is decoded using chr hex $1, and then each char() function is translated to the string corresponding to the character equivalents of its decimal parameters.
use strict;
use warnings 'all';
use feature 'say';
my $s = 'select%20char(45,120,49,45,81,45),char(45,120,50,45,81,45),char(45,120,51,45,81,45)';
$s =~ s/%([0-9A-Fa-f]{2})/ chr hex $1 /eg;    # a percent escape is exactly two hex digits
$s =~ s{ char \s* \( ( [^()]+ ) \) }{ join '', map chr, $1 =~ /\d+/g }xge;
say $s;
output
select -x1-Q-,-x2-Q-,-x3-Q-

Why does split return an array with every second element empty?

I'm trying to split a string every 5 characters. The array I'm getting back from split isn't how I'm expecting it: all the even indexes are empty, the parts I'm looking for are on odd indexes.
This version doesn't output anything:
use warnings;
use strict;
my @ar = <DATA>;
foreach (@ar){
my @mkh = split (/(.{5})/,$_);
print $mkh[2];
}
__DATA__
aaaaabbbbbcccccdddddfffff
If I replace the print line with this (odd indexes 1 and 3):
print $mkh[1],"\n", $mkh[3];
The output is the first two parts:
aaaaa
bbbbb
I don't understand this, I expected to be able to print the first two parts with this:
print $mkh[0],"\n", $mkh[1];
Can someone explain what is wrong in my code, and help me fix it?
The first argument in split is the pattern to split on, i.e. it describes what separates your fields. If you put capturing groups in there (as you do), those will be added to the output of the split as specified in the split docs (last paragraph).
This isn't what you want - your separator isn't a group of five characters. You're looking to split a string every X characters. For that, better use:
my @mkh = (/...../g);
# or
my @mkh = (/.{5}/g);
or one of the other options you'll find in: How can I split a string into chunks of two characters each in Perl?
Debug using Data::Dump
To observe exactly what your split operation is doing, use a module like Data::Dump:
use warnings;
use strict;
while (<DATA>) {
my @mkh = split /(.{5})/;
use Data::Dump;
dd @mkh;
}
__DATA__
aaaaabbbbbcccccdddddfffff
Outputs:
("", "aaaaa", "", "bbbbb", "", "ccccc", "", "ddddd", "", "fffff", "\n")
As you can see, your code is splitting on groups of 5 characters, and leaving empty strings between them. This is obviously not what you want.
Use Pattern Matching instead
Instead, you simply want to capture groups of 5 characters. Therefore, you just need a pattern match with the /g Modifier:
use warnings;
use strict;
while (<DATA>) {
my @mkh = /(.{5})/g;
use Data::Dump;
dd @mkh;
}
__DATA__
aaaaabbbbbcccccdddddfffff
Outputs:
("aaaaa", "bbbbb", "ccccc", "ddddd", "fffff")
You can also use a zero-width delimiter, that is, split the string at each position that has 5 characters in front of it. \K makes the regex discard what it has matched so far, so here it acts like a positive lookbehind:
my @mkh = split (/.{5}\K/, $_);
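For fixed-width chunks, unpack is another option that avoids regexes altogether. This is only a sketch under the assumption that the input is plain ASCII, as in the question; the variable names $line and @chunks are made up here:

```perl
use strict;
use warnings;

my $line = "aaaaabbbbbcccccdddddfffff\n";
chomp $line;    # drop the newline so it does not end up in a chunk

# (a5)* repeats the "take five characters" template as often as the input allows.
my @chunks = unpack '(a5)*', $line;
print join('|', @chunks), "\n";
```

The a format is used rather than A because A strips trailing spaces from each chunk, which would silently change the data if the input contained any.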

Split functions

I want to get the split characters. I tried the code below, but I am only able to get the split text. Also, if the same split character occurs more than once, it should be returned only once.
For example, if the string is "asa,agas,asa" then only , should be returned.
So in the case below I should get "| : ;" (joined with a space).
use strict;
use warnings;
my $str = "Welcome|a:g;v";
my @value = split /[,;:.%|]/, $str;
foreach my $final (@value) {
print $final, "\n";
}
split splits a string into elements when given what separates those elements, so split is not what you want. Instead, use:
my @punctuations = $str =~ /([,;:.%|])/g;
So you want to get the opposite of split.
Try:
my @value=split /[^,;:.%|]+/,$str;
It will split on anything but the delimiters you set.
Correction after comments (the first element is an empty string, so drop it):
my @value=split /[^,;:.%|]+/,$str;
shift @value;
This works fine, and gives unique answers (\Q keeps . and | from being treated as regex metacharacters):
@value = ();
foreach(split('',",;:.%|")) { push @value,$_ if $str=~/\Q$_/; }
To extract all the separators only once, you need something more elaborate
my @punctuations = keys %{{ map { $_ => 1 } $str =~ /[,;:.%|]/g }};
Sounds like you call "split characters" what the rest of us call "delimiters" -- if so, the POSIX character class [:punct:] might prove valuable.
OTOH, if you have a defined list of delimiters, and all you want to do is list the ones present in the string, it's much more efficient to use m// rather than split.
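Note that keys returns the delimiters in no particular order. If the order of first appearance matters (as in the "| : ;" example from the question), a %seen hash can be used to filter the matches. A minimal sketch, where %seen and @delims are names chosen for illustration:

```perl
use strict;
use warnings;

my $str = "Welcome|a:g;v";

# Collect every delimiter with m//g, then keep only the first
# occurrence of each one; grep preserves the original order.
my %seen;
my @delims = grep { !$seen{$_}++ } $str =~ /[,;:.%|]/g;

print join(' ', @delims), "\n";
```

For the sample string this keeps |, : and ; in that order, since each appears once.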

Perl string in Quote Word?

Seems like my daily roadblock. Is this possible? A string in qw?
#!/usr/bin/perl
use strict;
use warnings;
print "Enter Your Number\n";
my $usercc = <>;
##split number
$usercc =~ s/(\w)(?=\w)/$1 /g;
print $usercc;
## string in qw, hmm..
my @ccnumber = qw($usercc);
I get Argument "$usercc" isn't numeric in multiplication (*) at
Thanks
No.
From: http://perlmeme.org/howtos/perlfunc/qw_function.html
How it works
qw() extracts words out of your string using embedded whitespace as the delimiter and returns the words as a list. Note that this happens at compile time, which means that the call to qw() is replaced with the list before your code starts executing.
Additionally, no interpolation is possible in the string you pass to qw().
Instead of that, use
my @ccnumber = split /\s+/, $usercc;
Which does what you probably want, to split $usercc on whitespace.
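To see the difference side by side, here is a small sketch. The input string is a stand-in for what the substitution in the question would produce, and @literal is a name made up for the comparison:

```perl
use strict;
use warnings;

my $usercc = "1 2 3 4\n";    # stand-in for the spaced-out user input

# qw() does not interpolate: this is a one-element list holding
# the literal text '$usercc', not the contents of the variable.
my @literal = qw($usercc);

# split ' ' splits on runs of whitespace and ignores leading whitespace.
my @ccnumber = split ' ', $usercc;

print scalar(@literal), ": $literal[0]\n";
print scalar(@ccnumber), ": @ccnumber\n";
```

The literal text '$usercc' is what ends up in the arithmetic and triggers the "isn't numeric" warning from the question.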

Extracting text in between strings in Perl

In Perl, how do I extract the text in a string if I know the parts of the text that come before and after it?
Example:
Input: www.google.com/search?size=1&make=BMW&model=2000
I would like to extract the word 'BMW' which is always in between "&make=" and the next "&"
Don't use a regular expression. Use URI and URI::QueryParam, like so:
use strict;
use warnings;
use URI;
use URI::QueryParam;
my $u = URI->new('http://www.google.com/search?size=1&make=BMW&model=2000');
print $u->query_param('make');
Use a Regular expression:
my ($captured_string) = $link =~ /\&make=(\w+)\&/;
My regex assumes that you want to capture whatever appears in the make field, as long as it consists of word characters. \w matches upper- and lower-case letters, digits, and the underscore. If you want to capture something else you can use a character class. For example, [\w\s]+ would match one or more word characters and whitespace characters. You can list the characters to match between the [ ], in any order.
The ( ) is what actually does the capturing. If you remove it, the regex will just match (and you should use it in an if statement). If you wanted to capture more than one string (say you wanted the model as well), based on your example you could use a second set of parentheses like this:
my ($make, $model) = $link =~ /\&make=(\w+)\&model=([A-Za-z0-9]+)/;
Hope that helps!
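One caveat with the regex approach: the pattern /\&make=(\w+)\&/ requires an & on both sides, so it would not match if make were the first or the last parameter. A hedged sketch of a more tolerant pattern, using the URL from the question (the variable $make is just an illustrative name):

```perl
use strict;
use warnings;

my $link = 'www.google.com/search?size=1&make=BMW&model=2000';

# [?&] anchors the name at the start of a parameter;
# [^&]* takes everything up to the next & or the end of the string.
my ($make) = $link =~ /[?&]make=([^&]*)/;

print defined $make ? "$make\n" : "no make parameter\n";
```

This still does not decode percent escapes in the value, which is one reason the URI module shown above is the more robust choice.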