Perl Split on first occurrence - perl

Suppose string is:
ABC-Digest-M2-2.03-04.01.00.05
I want to split it into two strings, "ABC-Digest-M2" and "2.03-04.01.00.05".
The split point is the first occurrence of a dash followed by a digit, i.e. "-\d".
How can I do this with one line of code?

You can use split with a lookahead assertion to do this without consuming the digit. e.g.
perl -MData::Dumper -e 'print Dumper(
split /-(?=\d)/, "ABC-Digest-M2-2.03-04.01.00.05", 2
);'
$VAR1 = 'ABC-Digest-M2';
$VAR2 = '2.03-04.01.00.05';

Split on a dash followed by a digit, and use the limit argument of split() to cap the number of fields at two:
my $string = "ABC-Digest-M2-2.03-04.01.00.05";
my ($p1, $p2) = split /-(?=\d)/, $string, 2;
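As a minimal sketch of how the lookahead and the limit interact, the snippet below wraps the split in a helper and compares it with the unlimited form. The name split_name_version is hypothetical and used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split at the first dash that is immediately
# followed by a digit.
sub split_name_version {
    my ($string) = @_;
    # (?=\d) checks for a digit without consuming it;
    # the limit 2 stops split after the first match.
    return split /-(?=\d)/, $string, 2;
}

my ($name, $version) = split_name_version("ABC-Digest-M2-2.03-04.01.00.05");
print "$name\n";     # ABC-Digest-M2
print "$version\n";  # 2.03-04.01.00.05

# Without the limit, every dash-before-digit is a split point.
my @all = split /-(?=\d)/, "ABC-Digest-M2-2.03-04.01.00.05";
print scalar(@all), "\n";  # 3
```

The limit matters here because the version part itself contains a dash followed by a digit (between 2.03 and 04).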

Related

Replace single space with multiple spaces in perl

I have a requirement of replacing a single space with multiple spaces so that the second field always starts at a particular position (here 36 is the position of second field always).
I have a perl script written for this:
while(<INP>)
{
my $md=35-index($_," ");
my $str;
$str.=" " for(1..$md);
$_=~s/ +/$str/;
print "$_" ;
}
Is there a better approach using just a regex in s/// so that I can use it directly on the command line instead of in a script?
Assuming that the fields in your data are demarcated by spaces
while (<$fh>) {
my ($first, @rest) = split;
printf "%-35s %s\n", $first, "@rest";
}
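The first field is now padded to 35 characters and left-aligned because of the - in the printf format; with the single space after it, the second field starts at offset 36 (counting from zero). Use %-34s if it has to start at column 36 counting from one. See sprintf for the many details. The rest is printed with single spaces between the original space-separated fields, but can instead be formatted as desired (tab separated, fixed width...).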
Or you can leave the "rest" after the first field untouched by splitting the line into two parts
while (<$fh>) {
my ($first, $rest) = /(\S+)\s+(.*)/;
printf "%-35s %s\n", $first, $rest;
}
(or use split ' ', $_, 2 instead of regex)
Please give more detail if there are other requirements.
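The padding approach can be sketched as a small function, assuming the input is one line with whitespace-separated fields. The name align_second_field and the width parameter are hypothetical, introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: pad the first field to $width characters and
# leave the remainder of the line untouched.
sub align_second_field {
    my ($line, $width) = @_;
    chomp $line;
    my ($first, $rest) = split ' ', $line, 2;
    return $line unless defined $rest;   # fewer than two fields: nothing to align
    # $rest is passed as an argument rather than interpolated into the
    # format, so a literal % in the data is not read as a directive.
    return sprintf "%-*s %s", $width, $first, $rest;
}

my $out = align_second_field("ABCD TEST EFGH\n", 35);
print "$out\n";
print index($out, "TEST"), "\n";   # 36 (zero-based offset of the second field)
```

The `*` in the format takes the width from the argument list, which makes it easy to try 34 or 35 depending on how the target column is counted.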
One approach is to use plain ol' Perl formats:
#!/usr/bin/perl
use warnings;
use strict;
my($first, $second, $remainder);
format STDOUT =
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< #<<<<<< #<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$first, $second,$remainder
.
while (<DATA>) {
($first, $second, $remainder) = split(/\s+/, $_, 3);
write;
}
exit 0;
__DATA__
ABCD TEST EFGH don't touch
FOO BAR FUD don't touch
Test output. I probably miscounted the columns, but you should get the idea:
$ perl dummy.pl
ABCD TEST EFGH don't touch
FOO BAR FUD don't touch
Another option would be Text::Table.

Using Perl, search and replace a number string working backwards on that string

How can I do a global search and replace on a string of numbers starting from the end of the string and reading backwards?
Starting at the front of the string I can do this:
someword: 12345
s/someword: [0-9][0-9]/someword: ==/g;
someword: ==345
But that will only work if the string is five numbers long. Regardless of the length of the number string, I want to keep the last three numbers.
Thank you
I would use an executable substitution.
This code finds multiple digits that are followed by three more digits, and replaces them with the same number of equals = signs
my $s = 'someword: 12345678';
$s =~ s/ (\d+) (?=\d{3}) / '=' x length $1 /xe;
print $s;
output
someword: =====678
Just use a positive lookahead assertion:
my $string = 'someword: 12345';
$string =~ s/\d(?=\d{3})/=/g;
print "$string\n";
Outputs:
someword: ==345
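To make the number of kept digits adjustable, the lookahead version can be wrapped in a function. This is a sketch only; mask_leading_digits and its $keep parameter are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every digit that still has at least
# $keep digits directly after it, so the last $keep digits survive.
sub mask_leading_digits {
    my ($text, $keep) = @_;
    $text =~ s/\d(?=\d{$keep})/=/g;
    return $text;
}

print mask_leading_digits('someword: 12345', 3), "\n";     # someword: ==345
print mask_leading_digits('someword: 12345678', 3), "\n";  # someword: =====678
print mask_leading_digits('someword: 123', 3), "\n";       # someword: 123
```

Because the lookahead does not consume the trailing digits, the same three digits can serve as the "followed by" context for each digit before them.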

issue in matching regexp in perl

I have the following code:
$str = "
OTNPKT0553 04-02-03 21:43:46
M X DENY
PLNA
/*Privilege, Login Not Active*/
;";
$val = $str =~ /[
]*([\n]?[\n]+
[\n]?) ([^;^
]+)/s;
print "$1 and $2";
Getting output as
and PLNA
Why is it getting PLNA as output? I believe it should stop at the first \n. I expected the output to be OTNPKT0553 04-02-03 21:43:46
Your regex is messy and contains a lot of redundancy. The following steps demonstrate how it can be simplified and then it becomes more clear why it is matching PLNA.
1) Translating the literal new lines in your regex:
$val = $str =~ /[\n\n]*([\n]?[\n]+\n[\n]?) ([^;^\n]+)/s;
2) Then simplifying this code to remove the redundancy:
$val = $str =~ /(\n{2}) ([^;^\n]+)/s;
So basically, the regex is looking for two new lines followed by 3 spaces.
There are three spaces before OTNPKT0553, but there is only a single new line, so it won't match.
The next three spaces are before PLNA which IS preceded by two new lines, and so matches.
You have a whole lot of newlines in there - some literal and some encoded as \n. I'm not clear how you were thinking. Did you think \n matched a number maybe? A \d matches a digit, and will also match many Unicode characters that are digits in other languages. However for simple ASCII text it works fine.
What you need is something like this
use strict;
use warnings;
my $str = "
OTNPKT0553 04-02-03 21:43:46
M X DENY
PLNA
/*Privilege, Login Not Active*/
;";
my $val = $str =~ / (\w+) \s+ ( [\d-]+ \s [\d:]+ ) /x;
print "$1 and $2";
output
OTNPKT0553 and 04-02-03 21:43:46
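One thing worth adding is a check that the match succeeded before $1 and $2 are used, since they keep their old values after a failed match. A minimal sketch, assuming the same header layout as above; parse_header is a hypothetical name:

```perl
use strict;
use warnings;

# Hypothetical helper: return (id, timestamp) from the header line,
# or an empty list when nothing matches.
sub parse_header {
    my ($str) = @_;
    # A match in list context returns the captures directly.
    my @fields = $str =~ / (\w+) \s+ ( [\d-]+ \s [\d:]+ ) /x;
    return @fields;
}

my $str = "\n   OTNPKT0553 04-02-03 21:43:46\nM  X DENY\n   PLNA\n";
if (my ($id, $stamp) = parse_header($str)) {
    print "$id and $stamp\n";   # OTNPKT0553 and 04-02-03 21:43:46
} else {
    print "no header found\n";
}
```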
You have an extra line feed, change the regex to:
$str =~ /[
]*([\n]?[\n]+[\n]?) ([^;^
]+)/s;
and simpler:
$str =~ /\n+ ([^;^\n]+)/s;

Split a perl string with a substring and a space

local_addr = sjcapp [value2]
How do you split this string so that I get 2 values in my array i.e.
array[0] = sjcapp and array[1] = value2.
If I do this
@array = split('local_addr =', $input)
then my array[0] has sjcapp [value2]. I want to be able to separate it into two in my split function itself.
I was trying something like this but it didn't work:
split(/local_addr= \s/, $input)
Untested, but maybe something like this?
@array = ($input =~ /local_addr = (\S+)\s\[(\S+)\]/);
Rather than split, this uses a regex match in list context, which gives you an array of the parts captured in parentheses.
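A slightly more tolerant variant of the list-context match is sketched below. It assumes the key is always local_addr and that the bracketed value contains no closing bracket; parse_local_addr is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: return (name, value) from a line such as
# "local_addr = sjcapp [value2]", or an empty list if it does not match.
sub parse_local_addr {
    my ($input) = @_;
    # \s* tolerates variable spacing; [^\]]* stops at the closing bracket.
    my @array = $input =~ /^\s*local_addr\s*=\s*(\S+)\s*\[([^\]]*)\]/;
    return @array;
}

my @array = parse_local_addr("local_addr = sjcapp [value2]");
print "$array[0]\n";  # sjcapp
print "$array[1]\n";  # value2
```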
~/ cat data.txt
local_addr = sjcapp [value2]
other_addr = superman [value1492]
euro_addr = overseas [value0]
If the data really is as regularly structured as that, then you can just split on the whitespace. On the command line (see the perlrun(1) manual page) this is easiest with "autosplit" (-a), which automatically creates an array of fields called @F from the input:
perl -lane 'print "$F[2] $F[3]" ' data.txt
sjcapp [value2]
superman [value1492]
overseas [value0]
In your script you can change the name of the array, and the position of the elements within it, by shift-ing or splice-ing (possibly in a more elegant way than this, but it works):
perl -lane 'my @array = ($F[2],$F[3]) ; print "$array[0], $array[1]" ' data.txt
Or, without using autosplit, as follows :
perl -lne 'my @arr=split(" ");splice(@arr,0,2); print "$arr[0] $arr[1]"' data.txt
Try:
if ( $input =~ /=\s*(\S+)\s*\[(.+)\]/ ) {
@array = ($1, $2);
}
I would use a regexp rather than a split, since this is clearly a standard format config file line. How you construct your regexp will likely depend on the full line syntax and how flexible you want to be.
if( $input =~ /(\S+)\s*=\s*(\S+)\s*\[\s*(\S+)\s*\]/ ) {
@array = ($2,$3);
}

Split Variable on white space [duplicate]

This question already has answers here:
Using perl to split a line that may contain whitespace
(5 answers)
Closed 9 years ago.
I'm trying to split a string into an array, with the split occurring at the white space. Each block of text is separated by a variable number of spaces.
Here is the string:
NUM8 host01 1,099,849,993 1,099,849,992 1
I have tried the following without success.
my @array1 = split / /, $VAR1;
my @array1 = split / +/, $VAR1;
my @array1 = split /\s/, $VAR1;
my @array1 = split /\s+/, $VAR1;
I'd like to end up with:
$array1[0] = NUM8
$array1[1] = host01
$array1[2] = 1,099,849,993
$array1[3] = 1,099,849,992
$array1[4] = 1
What is the best way to split this?
If the first argument to split is the string ' ' (a single space), it is special: it splits on runs of whitespace of any length:
my @array1 = split ' ', $VAR1;
(BTW, it is almost equivalent to your last option, but it also removes any leading whitespace.)
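The difference shows up only when the string starts with whitespace. A small sketch with a made-up input line; the helper names fields_awk_style and fields_regex are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helpers that differ only in the split pattern.
sub fields_awk_style { my ($line) = @_; return split ' ',   $line; }
sub fields_regex     { my ($line) = @_; return split /\s+/, $line; }

my $line  = "  NUM8   host01    1,099,849,993   1,099,849,992   1\n";
my @awk   = fields_awk_style($line);
my @regex = fields_regex($line);

print scalar(@awk), "\n";    # 5
print scalar(@regex), "\n";  # 6, because $regex[0] is an empty string
```

In both cases the trailing newline does not produce an extra field, since split drops trailing empty fields by default.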
Just try using:
my @array1 = split(' ',$VAR1);
From Perldoc:
As another special case, split emulates the default behavior of the
command line tool awk when the PATTERN is either omitted or a literal
string composed of a single space character (such as ' ' or "\x20" ,
but not e.g. / / ). In this case, any leading whitespace in EXPR is
removed before splitting occurs
\s+ matches one or more whitespace characters, so split on that:
my @array1 = split /\s+/, $VAR1;