Perl colon split show one character - perl

Perl colon split show one character when using it in variables.
Code
#!/usr/bin/perl
my $serialnumber= "0123456789";
my $macaddr = "a1:b2:c3:d4:e5:f6";
my $macaddress = split /:/, $macaddr;
print $macaddress;
print "\n";
The result is only "6" or last character of the text.
See image...

Sean explained why you are getting 6. However, instead of using split, just substitute to get rid of colons:
my $macaddr = "a1:b2:c3:d4:e5:f6";
$macaddr =~ s/://g;
print $macaddr;
# a1b2c3d4e5f6

The first paragraph of perldoc -f split says:
Splits the string EXPR into a list of strings and returns the
list in list context, or the size of the list in scalar context.
Be assigning the result of split to the scalar variable $macaddress, you're providing scalar context, and so you're getting the size of the list back. It's just a coincidence that this happens to be the same as the last character of your input string.

You could use split in list context:
my #macaddress = split /:/, $macaddr;
print join "", #macaddress;

Related

Perl - Convert integer to text Char(1,2,3,4,5,6)

I am after some help trying to convert the following log I have to plain text.
This is a URL so there maybe %20 = 'space' and other but the main bit I am trying convert is the char(1,2,3,4,5,6) to text.
Below is an example of what I am trying to convert.
select%20char(45,120,49,45,81,45),char(45,120,50,45,81,45),char(45,120,51,45,81,45)
What I have tried so far is the following while trying to added into the char(in here) to convert with the chr($2)
perl -pe "s/(char())/chr($2)/ge"
All this has manage to do is remove the char but now I am trying to convert the number to text and remove the commas and brackets.
I maybe way off with how I am doing as I am fairly new to to perl.
perl -pe "s/word to remove/word to change it to/ge"
"s/(char(what goes in here))/chr($2)/ge"
Output try to achieve is
select -x1-Q-,-x2-Q-,-x3-Q-
Or
select%20-x1-Q-,-x2-Q-,-x3-Q-
Thanks for any help
There's too much to do here for a reasonable one-liner. Also, a script is easier to adjust later
use warnings;
use strict;
use feature 'say';
use URI::Escape 'uri_unescape';
my $string = q{select%20}
. q{char(45,120,49,45,81,45),char(45,120,50,45,81,45),}
. q{char(45,120,51,45,81,45)};
my $new_string = uri_unescape($string); # convert %20 and such
my #parts = $new_string =~ /(.*?)(char.*)/;
$parts[1] = join ',', map { chr( (/([0-9]+)/)[0] ) } split /,/, $parts[1];
$new_string = join '', #parts;
say $new_string;
this prints
select -x1-Q-,-x2-Q-,-x3-Q-
Comments
Module URI::Escape is used to convert percent-encoded characters, per RFC 3986
It is unspecified whether anything can follow the part with char(...)s, and what that might be. If there can be more after last char(...) adjust the splitting into #parts, or clarify
In the part with char(...)s only the numbers are needed, what regex in map uses
If you are going to use regex you should read up on it. See
perlretut, a tutorial
perlrequick, a quick-start introduction
perlre, the full account of syntax
perlreref, a quick reference (its See Also section is useful on its own)
Alright, this is going to be a messy "one-liner". Assuming your text is in a variable called $text.
$text =~ s{char\( ( (?: (?:\d+,)* \d+ )? ) \)}{
my #arr = split /,/, $1;
my $temp = join('', map { chr($_) } #arr);
$temp =~ s/^|$/"/g;
$temp
}xeg;
The regular expression matches char(, followed by a comma-separated list of sequences of digits, followed by ). We capture the digits in capture group $1. In the substitution, we split $1 on the comma (since chr only works on one character, not a whole list of them). Then we map chr over each number and concatenate the result into a string. The next line simply puts quotation marks at the start and end of the string (presumably you want the output quoted) and then returns the new string.
Input:
select%20char(45,120,49,45,81,45),char(45,120,50,45,81,45),char(45,120,51,45,81,45)
Output:
select%20"-x1-Q-","-x2-Q-","-x3-Q-"
If you want to replace the % escape sequences as well, I suggest doing that in a separate line. Trying to integrate both substitutions into one statement is going to get very hairy.
This will do as you ask. It performs the decoding in two stages: first the URI-encoding is decoded using chr hex $1, and then each char() function is translated to the string corresponding to the character equivalents of its decimal parameters
use strict;
use warnings 'all';
use feature 'say';
my $s = 'select%20char(45,120,49,45,81,45),char(45,120,50,45,81,45),char(45,120,51,45,81,45)';
$s =~ s/%(\d+)/ chr hex $1 /eg;
$s =~ s{ char \s* \( ( [^()]+ ) \) }{ join '', map chr, $1 =~ /\d+/g }xge;
say $s;
output
select -x1-Q-,-x2-Q-,-x3-Q-

Why are my double quotes are missing in the output?

I tried to convert strings from lower case to upper case. While I achieved that, I found my double quotes were removed from the output. I want to be clear why the substitution fails here.
Perl code:
#!/usr/bin/perl
use strict;
use warnings;
my #ar = <DATA>;
my $str = join '', #ar;
#print $str;
if ( $str =~ s/\"([^"]*)\"/uc($1)/eg ) {
print $str;
}
__DATA__
output = " as (10) print "it if achieved. print" # This is comment.NUMBERS"
Obtained output:
output = AS (10) PRINT it if achieved. print # THIS IS COMMENT.NUMBERS
Expected output:
output = " AS (10) PRINT "it if achieved. print" # THIS IS COMMENT.NUMBERS"
The quotes are all disappearing because they're outside the capture in your regex. Changing it to s/("[^"]*")/uc($1)/eg gets us your desired output:
output = " AS (10) PRINT "it if achieved. print" # THIS IS COMMENT.NUMBERS"
Note that the inner section isn't mapped to all caps. This is a hint to how the regex is actually matching things in your string - it finds two matches: " as (10) print " and " # This is comment.NUMBERS". The text inside the inner quotes isn't matched by the regex at all. This is important, because it means that this solution is fragile and it will break on data which doesn't contain two nested sets of quotes and you want to capitalize only the text which is inside the outer quotes, but outside the inner quotes.
s/\"([^"]*)\"/uc($1)/eg - $1 represents match inside brackets (). You have include double quotes in match to be replaced but outside brackets.
Possible fix:
s/(\"[^"]*\")/uc($1)/eg

Perl: Find a match, remove the same lines, and to get the last field

Being a Perl newbie, please pardon me for asking this basic question.
I have a text file #server1 that shows a bunch of sentences (white space is the field separator) on many lines in the file.
I needed to match lines with my keyword, remove the same lines, and extract only the last field, so I have tried with:
my #allmatchedlines;
open(output1, "ssh user1#server1 cat /tmp/myfile.txt |");
while(<output1>) {
chomp;
#allmatchedlines = $_ if /mysearch/;
}
close(output1);
my #uniqmatchedline = split(/ /, #allmatchedlines);
my $lastfield = $uniqmatchedline[-1]\n";
print "$lastfield\n";
and it gives me the output showing:
1
I don't know why it's giving me just "1".
Could someone please explain why I'm getting "1" and how I can get the last field of the matched line correctly?
Thank you!
my #uniqmatchedline = split(/ /, #allmatchedlines);
You're getting "1" because split takes a scalar, not an array. An array in scalar context returns the number of elements.
You need to split on each individual line. Something like this:
my #uniqmatchedline = map { split(/ /, $_) } #allmatchedlines;
There are two issues with your code:
split is expecting a scalar value (string) to split on; if you are passing an array, it will convert the array to scalar (which is just the array length)
You did not have a way to remove same lines
To address these, the following code should work (not tested as no data):
my #allmatchedlines;
open(output1, "ssh user1#server1 cat /tmp/myfile.txt |");
while(<output1>) {
chomp;
#allmatchedlines = $_ if /mysearch/;
}
close(output1);
my %existing;
my #uniqmatchedline = grep !$existing{$_}++, #allmatchedlines; #this will return the unique lines
my #lastfields = map { ((split / /, $_)[-1]) . "\n" } #uniqmatchedline ; #this maps the last field in each line into an array
print for #lastfields;
Apart from two errors in the code, I find the statement "remove the same lines and extract only the last field" unclear. Once duplicate matching lines are removed, there may still be multiple distinct sentences with the pattern.
Until a clarification comes, here is code that picks the last field from the last such sentence.
use warnings 'all';
use strict;
use List::MoreUtils qw(uniq)
my $file = '/tmp/myfile.txt';
my $cmd = "ssh user1\#server1 cat $file";
open my $fh, '-|', $cmd // die "Error opening $cmd: $!"; # /
while (<$fh>) {
chomp;
push #allmatchedlines, $_ if /mysearch/;
}
close(output1);
my #unique_matched_lines = uniq #allmatchedlines;
my $lastfield = ( split ' ', $unique_matched_lines[-1] )[-1];
print $lastfield, "\n";
I changed to the three-argument open, with error checking. Recall that open for a process involves a fork and returns pid, so an "error" doesn't at all relate to what happened with the command itself. See open. (The # / merely turns off wrong syntax highlighting.) Also note that # under "..." indicates an array and thus need be escaped.
The (default) pattern ' ' used in split splits on any amount of whitespace. The regex / / turns off this behavior and splits on a single space. You most likely want to use ' '.
For more comments please see the original post below.
The statement #allmatchedlines = $_ if /mysearch/; on every iteration assigns to the array, overwriting whatever has been in it. So you end up with only the last line that matched mysearch. You want push #allmatchedlines, $_ ... to get all those lines.
Also, as shown in the answer by Justin Schell, split needs a scalar so it is taking the length of #allmatchedlines – which is 1 as explained above. You should have
my #words_in_matched_lines = map { split } #allmatchedlines;
When all this is straightened out, you'll have words in the array #uniqmatchedline and if that is the intention then its name is misleading.
To get unique elements of the array you can use the module List::MoreUtils
use List::MoreUtils qw(uniq);
my #unique_elems = uniq #whole_array;

In Perl, how can I tell split not to strip empty trailing fields?

Was trying to count the number of lines in a string of text (including empty lines). A little surprised by the behavior of split. Had expected the following to output 2 but it printed 1 on my perl 5.14.2.
$str = "hello\
world\n\n";
#a = split(/\n/, $str);
print $#a, "\n";
Seems that split() is insensitive to consecutive \n (add more \n's at the end of the string will not increase the printout). The only I can get it sort of close to giving the number of lines is
$str = "hello\
world\n\n";
#a = split(/(\n)/, $str);
printf "%d\n", ($#a + 1)/2, "\n";
But it looks more like a workaround than a straight solution. Any ideas?
perldoc -f split:
If LIMIT is negative, it is treated as if it were instead
arbitrarily large; as many fields as possible are produced.
If LIMIT is omitted (or, equivalently, zero), then it is usually
treated as if it were instead negative but with the exception that
trailing empty fields are stripped (empty leading fields are
always preserved); if all fields are empty, then all fields are
considered to be trailing (and are thus stripped in this case).
$ perl -E 'my $x = "1\n2\n\n"; my #x = split /\n/, $x, -1; say $#x'
3
Perhaps the problem is that you are using $#a when scalar #a is what you are actually looking for?
I apologize if you are already aware of this or if this is not the issue, but $#a returns the index of the last element of #a and (scalar #a) returns the number of elements that #a contains. Since array indexing starts at 0, $#a is one less than scalar #a.

Split functions

I want to get the split characters. I tried the below coding, but I can able to get the splitted text only. However if the split characters are same then it should be returned as that single characters
For example if the string is "asa,agas,asa" then only , should be returned.
So in the below case I should get as "| : ;" (joined with space)
use strict;
use warnings;
my $str = "Welcome|a:g;v";
my #value = split /[,;:.%|]/, $str;
foreach my $final (#value) {
print $final, "\n";
}
split splits a string into elements when given what separates those elements, so split is not what you want. Instead, use:
my #punctuations = $str =~ /([,;:.%|])/g;
So you want to get the opposite of split
try:
my #value=split /[^,;:.%|]+/,$str;
It will split on anything but the delimiters you set.
Correction after commnets:
my #value=split /[^,;:.%|]+/,$str;
shift #value;
this works fine, and gives unique answers
#value = ();
foreach(split('',",;:.%|")) { push #value,$_ if $str=~/$_/; }
To extract all the separators only once, you need something more elaborate
my #punctuations = keys %{{ map { $_ => 1 } $str =~ /[,;:.%|]/g }};
Sounds like you call "split characters" what the rest of us call "delimiters" -- if so, the POSIX character class [:punct:] might prove valuable.
OTOH, if you have a defined list of delimiters, and all you want to do is list the ones present in the string, it's much more efficient to use m// rather than split.