I need to work out the algorithm used in this little bit of Perl code, but I know nothing about Perl. Usually that's not a problem, since I can research the language, but this regular expression stuff is way over my head!
Could anybody pseudo-code this? I just need to know what's going on so I can implement it in something else, preferably PHP or even C++, but I'll worry about that part. I just need to somehow decipher what this is doing:
$a = $ARGV[0];
$a =~ s/[^A-F0-9]+//simg;
@b = reverse split /(\S{2})/,$a;
$c = join "", @b;
$c .= "0000";
$d = hex($c) % 999999929;
print "$d\n";
What's poorly written about it? It could use better variable names, but I don't know if that's possible (the intermediate steps don't appear to have any nameable quality), which leaves only the improper use of split. The pseudocode is almost a word-for-word translation.
$a = $ARGV[0];
$a =~ s/[^A-F0-9]+//simg;
@b = reverse split /(\S{2})/,$a;
$c = join "", @b;
$c .= "0000";
$d = hex($c) % 999999929;
print "$d\n";
should be
$a = $ARGV[0]; # Get a hex str from cmd line E3:C9:D4
$a =~ s/[^A-F0-9]+//simg; # Remove any non-hex digits E3C9D4
@b = reverse $a =~ /(..)/sg; # Extract "bytes"; reverse D4, C9, E3
$c = join "", @b; # Join them. D4C9E3
$c .= "0000"; # Append two NULs D4C9E30000
$d = hex($c) % 999999929; # Convert from hex to number and modulus
print "$d\n"; # Print the result (in decimal).
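To see why the split call counts as improper use, it may help to compare what the two expressions return. This is a minimal sketch, assuming the sample input E3C9D4 from the comments above; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $hex = 'E3C9D4';

# split treats each captured pair as a *separator*, so the (empty)
# fields between the separators end up in the list as well.
my @via_split = split /(\S{2})/, $hex;

# A plain match in list context returns just the captured pairs.
my @via_match = $hex =~ /(..)/sg;

print join(',', map { "'$_'" } @via_split), "\n";
print join(',', map { "'$_'" } @via_match), "\n";

# After reverse and join the empty strings vanish, which is why
# both versions end up with the same string.
print join('', reverse @via_split), "\n";   # D4C9E3
print join('', reverse @via_match), "\n";   # D4C9E3
```

The original code therefore works, but only because joining discards the empty fields that split produced by accident.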
Slightly clearer:
$a = $ARGV[0];
$a =~ s/[^0-9A-Fa-f]+//g;
$a = join '', reverse $a =~ /(..)/sg;
$a .= "0000";
$a = hex($a);
$a %= 999999929;
print "$a\n";
There might be a bug in these snippets. On a Perl with 32-bit ints, hex will overflow if the input has more than four hex digits. A Perl with 64-bit ints will handle 12 hex digits.
You seem to have taken the code from here. It's meant to take a MAC address as input, meaning the code requires 64-bit integers or Math::BigInt to work. There's no way around it, since you need to take the modulus of a 64-bit value.
Here's a concise way to do it that only works on Perls with 64-bit integers:
my $mac = $ARGV[0];
$mac =~ s/[^0-9A-Fa-f]+//g;
die "Expected 12 hex digits\n" if length($mac) != 12;
# "123456789ABC" => 0xBC9A785634120000
my $hash = unpack('Q<', pack('H*', "0000$mac"));
$hash %= 999999929;
print "$hash\n";
For portability, you're better off integrating Math::BigInt into the earlier version.
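One way that integration might look is sketched below. It assumes the core Math::BigInt module is available; the helper name mac_hash is made up for this example and is not part of the original script.

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper that mirrors the steps of the original script,
# but does the hex conversion and modulus with arbitrary precision.
sub mac_hash {
    my ($mac) = @_;
    $mac =~ s/[^0-9A-Fa-f]+//g;                     # keep hex digits only
    my $hex = join '', reverse $mac =~ /(..)/sg;    # reverse the byte order
    $hex .= '0000';                                 # append two zero bytes
    return Math::BigInt->from_hex("0x$hex")->bmod(999999929)->bstr;
}

print mac_hash('A0B0'), "\n";    # 963275918
```

Because the arithmetic never touches native integers, this should behave the same on 32-bit and 64-bit builds, at the cost of being slower than the pack/unpack version.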
It's looking for a bunch of octets in hex, concatenated together, as the first argument of the program, and applying a modulus.
So, if the program is invoked as:
$ myprog.pl A0B0
then the value in $c will be B0A00000. Therefore, the value of $d should be 0x396A6C8E (963275918 in decimal, which is what gets printed).
It is a particularly bad piece of code written by someone who is scared of pack and unpack.
$a = $ARGV[0]; # assign first command line arg to $a
$a =~ s/[^A-F0-9]+//simg; # delete non-hex from $a
@b = reverse split /(\S{2})/,$a; # split $a by 2 non-whitespace (saving them too) to array @b and reverse it
$c = join "", @b; # join array @b to scalar $c
$c .= "0000"; # append 4 zeros to $c
$d = hex($c) % 999999929; # get modulo
print "$d\n"; # print it
$a = $ARGV[0]; #Read in the first argument on the command line
$a =~ s/[^A-F0-9]+//simg; #Substitute non hex numbers with nothing *
@b = reverse split /(\S{2})/,$a; #What is left in $a, split by 2 non-space characters
$c = join "", @b; # put the array b into $c
$c .= "0000";
$d = hex($c) % 999999929; #Convert $c to an integer and % with 999999929
print "$d\n";
simg = i: case insensitive; g: global; m: multi-line; s: single-line;
In short, we strip out everything that is not a hex digit, reverse the order of the bytes (two hex digits at a time), append four zeros, and take the result modulo 999999929.
I am trying to write a big script but I am stuck on one part. I want to split an array based on "..".
From the script I got this:
print @coordinates;
gene complement(872..1288)
my desired output:
complement 872 1288
I tried:
1) my @answer = split(.., @coordinates)
print("@answer\n");
2) my @answer = split /../, @coordinates;
3) print +(split /\../)[-1],[-2],[-3] while <@coordinates>
4) foreach my $anwser ( @coordinates )
{$anwser =~ s/../"\t"/;
print $anwser;}
5) my @answer = split(/../, "complement(872..1288)"); #to see if the printed array is problematic.
which prints:
) ) ) ) ) ) ) ) )
6) my @answer = split /"gene "/, @coordinates; # I tried to "catch" the entire output's spaces and tabs
which prints
0000000000000000000000000000000001000000000100000000
But none of them works. Does anyone have any idea how to get past this issue?
Ps, unfortunately, I can't run my script right now on Linux so I used this website to run my script. I hope this is not the reason why I didn't get my desired output.
my $RE_COMPLEMENT = qr{(complement)\((\d+)\.\.(\d+)\)}msx;
for my $item (@coordinates) {
my ($head, $i, $j) = $item =~ $RE_COMPLEMENT;
if (defined($head) && defined($i) && defined($j)) {
print("$head\t$i\t$j\n");
}
}
split operates on a scalar, not on an array.
my $string = 'gene complement(872..1288)';
my @parts = split /\.\./, $string;
print $parts[0]; # gene complement(872
print $parts[1]; # 1288)
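Since split works on one string at a time, an array has to be processed element by element. The sketch below assumes each element looks like the sample line from the question; the helper name parse_coordinate and the sample array contents are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line on runs of whitespace,
# parentheses and dots, and drop the leading "gene" field.
sub parse_coordinate {
    my ($line) = @_;
    my (undef, $strand, $start, $end) = split /[\s().]+/, $line;
    return ($strand, $start, $end);
}

my @coordinates = ('gene complement(872..1288)');   # assumed sample data

for my $line (@coordinates) {
    print join("\t", parse_coordinate($line)), "\n";
}
```

Inside a character class the dot and the parentheses are literal, so no escaping is needed there, unlike in the bare /../ attempts from the question.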
To get the desired output, you can use a substitution:
my $string = 'gene complement(872..1288)';
$string =~ s/gene +|\)//g;
$string =~ s/\.\./ /;
$string =~ s/\(/ /;
The desired effect can be achieved by:
- using the tr operator to replace '(', '.' and ')' with ' '
- splitting the data string into elements on whitespace
- keeping only the required part of the array
- printing the array elements joined with tabs
use strict;
use warnings;
use feature 'say';
my $data = <DATA>;
chomp $data;
$data =~ tr/(.)/ /;
my @elements = (split ' ', $data)[1..3];
say join "\t", @elements;
__DATA__
gene complement(872..1288)
Or as an alternative solution with only substitutions (without splitting data string into array)
use strict;
use warnings;
use feature 'say';
my $data = <DATA>;
chomp $data;
$data =~ s/gene\s+//;
$data =~ s/\)//;
$data =~ s/[(.]+/\t/g;
say $data;
__DATA__
gene complement(872..1288)
Output
complement 872 1288
The output for the command is ent3, and from that output I want 3 to be stored in a variable
Perl code
sub {
if ( $exit == 1 )
{
$cmdStr = "lsdev | grep en | grep VLAN | awk '{ print \$1 }'\r";
$result =_run_cmd($cmdStr);
my @PdAt_val = split("\r?\n", $result);
my $num = $result =~ /([0-9]+)/;
print "The char is $num\n";
$exit = 0;
exp_continue;
Tidied code
sub {
if ( $exit == 1 ) {
$cmdStr = "lsdev | grep en | grep VLAN | awk '{ print \$1 }'\r";
$result = _run_cmd($cmdStr);
my @PdAt_val = split("\r?\n", $result);
my $num = $result =~ /([0-9]+)/;
print "The char is $num\n";
$exit = 0;
exp_continue;
Your code that is doing the work here is:
my $num = $result =~ /([0-9]+)/;
Let's put that into a simple program so we can see what's going on.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my $result = 'ent3';
my $num = $result =~ /([0-9]+)/;
say $num;
And that prints 1. Which isn't what we want. What's going on?
Well, if you read the documentation for the match operator (in the section Regexp Quote-Like Operators in "perlop"), you'll see what the operator returns under different circumstances. It says:
Searches a string for a pattern match, and in scalar context returns true if it succeeds, false if it fails.
So that explains the behaviour we're seeing. That "1" is just a true value saying that the match succeeded. But how do we get the value that we have captured in our parentheses? There are a couple of ways. Firstly, it's written into the $1 variable.
my $num;
if ($result =~ /([0-9]+)/) {
$num = $1;
}
say $num;
But I think the other approach is what you were looking for. If you read on, you'll see what the operator returns in list context:
m// in list context returns a list consisting of the subexpressions matched by the parentheses in the pattern, that is, ($1, $2, $3 ...)
So if we put the match operator in list context, then we'll get the contents of $1 returned. How do we put a match into list context? By making the expression a list assignment - which we can do by putting parentheses around the left-hand side of the assignment.
my ($num) = $result =~ /([0-9]+)/;
say $num;
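The two contexts can be compared side by side. This is a small sketch using the ent3 value from the question; the second pattern with two capture groups is an added illustration, not something from the original code.

```perl
use strict;
use warnings;

my $result = 'ent3';

# Scalar context: the match only reports success (a true value).
my $matched = $result =~ /([0-9]+)/;

# List context: the parentheses on the left make the match return
# the captured text instead.
my ($num) = $result =~ /([0-9]+)/;

# With several capture groups, list context returns all of them.
my ($name, $unit) = $result =~ /^([a-z]+)([0-9]+)$/;

print "matched=$matched num=$num name=$name unit=$unit\n";
# matched=1 num=3 name=ent unit=3
```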
Using regex, something like this should work:
if($result =~ /([0-9]+)/) {
$num = $1;
}
print $num;
Why is Perl giving a different result below?
$a = "bar";
$a =~ tr/abc/ABC/d;
print "[$a]\n"; # prints "BAr" as expected.
$x = "bar";
$y = "abc";
$z = "ABC";
$x =~ tr/\\Q$y\\E/\\Q$z\\E/d;
print "[$x]\n"; # prints "bar" to my surprise.
Similarly one more test case as below:
$p = "--aaa--";
$q = "abc-";
$r = "ABC";
$p =~ tr/\\Q$q\\E/\\Q$r\\E/d;
print "[$p]\n"; # prints "--aaa--" surprisingly.
$s = "--aaa--";
$s =~ tr/abc-/ABC/d;
print "[$s]\n"; # prints "AAA" as expected.
Can somebody please explain this behaviour?
Best Regards,
Mohammad S Anwar
tr/// does not interpolate.
tr/\\Q$y\\E/\\Q$z\\E/d translates
\ to \
Q to Q
$ to $
y to z
E to E
$x does not contain any of those characters, so it remains unchanged.
tr/\\Q$q\\E/\\Q$r\\E/d translates
\ to \
Q to Q
$ to $
q to r
E to E
$p does not contain any of those characters, so it remains unchanged.
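A quick way to convince yourself of this is to run the same tr on a string that does contain those literal characters. The sample string below is made up for the demonstration.

```perl
use strict;
use warnings;

# Single quotes: the string really contains Q, $, y and E.
my $text = 'Q$yE';

# $y and $z are never looked up as variables here (which is also why
# "use strict" does not complain); tr only sees literal characters.
(my $out = $text) =~ tr/\\Q$y\\E/\\Q$z\\E/d;

print "$out\n";   # Q$zE
```

Only the literal y changes, to a literal z, which matches the character-by-character table above.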
To get the desired behaviour, you could use
my %tr;
@tr{ split(//, $fr) } = split(//, $to);
$_ //= '' for values %tr;
my $re = '['.( join '', map quotemeta, keys %tr ).']';
$s =~ s/($re)/$tr{$1}/g;
or
eval "\$s =~ tr/\Q$fr\E/\Q$to\E/d";
die $@ if $@;
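Wrapping the first (hash-based) approach in a small function makes it easy to check against the examples from the question. The function name translate is made up for this sketch, and it only mimics tr with the /d flag for plain character lists (no ranges such as a-z).

```perl
use strict;
use warnings;

# Hypothetical helper: runtime equivalent of tr/$fr/$to/d for
# plain character lists.
sub translate {
    my ($s, $fr, $to) = @_;
    my %tr;
    @tr{ split(//, $fr) } = split(//, $to);   # map each source char to its target
    $_ //= '' for values %tr;                 # chars without a target are deleted
    my $re = '[' . join('', map { quotemeta } keys %tr) . ']';
    $s =~ s/($re)/$tr{$1}/g;
    return $s;
}

print translate('bar',     'abc',  'ABC'), "\n";   # BAr
print translate('--aaa--', 'abc-', 'ABC'), "\n";   # AAA
```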
What does ".=" mean in Perl (dot-equals)? Example code below (in the while clause):
if( my $file = shift @ARGV ) {
$parser->parse( Source => {SystemId => $file} );
} else {
my $input = "";
while( <STDIN> ) { $input .= $_; }
$parser->parse( Source => {String => $input} );
}
exit;
Thanks for any insight.
The period . is the concatenation operator. The equal sign to the right means that this is an assignment operator, like in C.
For example:
$input .= $_;
Does the same as
$input = $input . $_;
However, there's also some perl magic in this, for example this removes the need to initialize a variable to avoid "uninitialized" warnings. Try the difference:
perl -we 'my $x; $x = $x + 1' # Use of uninitialized value in addition ...
perl -we 'my $x; $x += 1' # no warning
This means that the line in your code:
my $input = "";
Is quite redundant. Albeit some people might find it comforting.
For pretty much any binary operator X, $a X= $b is equivalent to $a = $a X $b. The dot . is a string concatenation operator; thus, $a .= $b means "stick $b at the end of $a".
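A few of these compound assignments side by side, as a small sketch with made-up variable names and values:

```perl
use strict;
use warnings;

my $str = 'foo';
$str .= 'bar';          # same as $str = $str . 'bar'

my $n = 5;
$n += 2;                # same as $n = $n + 2
$n *= 3;                # same as $n = $n * 3

my $rule = '-';
$rule x= 3;             # same as $rule = $rule x 3

my $name;
$name //= 'anonymous';  # assign only if $name is undefined

print "$str $n $rule $name\n";   # foobar 21 --- anonymous
```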
In your code, you start with an empty $input, then repeatedly read a line and append it to $input until there's no lines left. You should end up with the entire file as the contents of $input, one line at a time.
It should be equivalent to the loopless
local $/;
$input = <STDIN>;
(define line separator as a non-defined character, then read until the "end of line" that never comes).
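The equivalence can be checked without typing anything on STDIN by reading from an in-memory filehandle instead. This is only a sketch; the sample text and variable names are assumptions made for the demonstration.

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";   # stand-in for what would arrive on STDIN

# Line-by-line accumulation with .=
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $by_loop = '';
while (my $line = <$fh>) { $by_loop .= $line; }
close $fh;

# Slurp: with $/ undefined, a single read returns everything
open $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $by_slurp = do { local $/; <$fh> };
close $fh;

print $by_loop eq $by_slurp ? "same\n" : "different\n";   # same
```

Putting local $/ inside a do block keeps the change to the line separator confined to that one read.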
EDIT: Changed according to TLP's comment.
You have found the string concatenation operator.
Let's try it :
my $string = "foo";
$string .= "bar";
print $string;
foobar
This performs concatenation onto the $input variable. Whatever comes in via STDIN is appended to $input.
In perl, I have to determine whether user input is a palindrome or not and it must display like this:
Enter in 7 characters: ghghghg #one line here #
Palindrome! #second line answer#
But instead this is what it does:
Enter in 7 characters: g #one line#
h #second line#
g #third line#
h #fourth line#
g #fifth line#
h #sixth line#
g Palindrom
e! #seventh line#
My problem seems to be in the chomp lines with all the variables, but I just can't figure out what to do and I've been at it for hours. I need a simple solution, but I have not progressed to arrays yet, so I need something simple to fix this. Thanks
And here is what I have so far. The formula seems to work, but it keeps printing a new line for each character:
use strict;
use warnings;
my ($a, $b, $c, $d, $e, $f, $g);
print "Enter in 7 characters:";
chomp ($a = <>); chomp ($b = <>); chomp ($c = <>); chomp ($d = <>); chomp ($e = <>); chomp ($f = <>); chomp ($g = <>);
if (($a eq $g) && ($b eq $f) && ($c eq $e) && ($d eq $d) && ($e eq $c) && ($f eq $b) && ($g eq $a))
{print "Palindrome! \n";}
else
{print "Not Palindrome! \n";}
If you're going to determine if a word is the same backwards, may I suggest using reverse and lc?
chomp(my $word = <>);
my $reverse = reverse $word;
if (lc($word) eq lc($reverse)) {
print "Palindrome!";
} else {
print "Not palindrome!";
}
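The original program waits for seven lines because every <> reads one whole line, up to and including Enter. Reading the line once and picking out single characters with substr avoids both that problem and arrays. A minimal sketch follows; the helper name is_palindrome is made up here, and the test words stand in for user input.

```perl
use strict;
use warnings;

# Hypothetical helper: compare characters from both ends using substr.
sub is_palindrome {
    my ($word) = @_;
    $word = lc $word;
    return 1 if length($word) < 2;      # empty or single character
    my $last = length($word) - 1;
    for my $i (0 .. int($last / 2)) {
        return 0 if substr($word, $i, 1) ne substr($word, $last - $i, 1);
    }
    return 1;
}

print is_palindrome('ghghghg') ? "Palindrome!\n" : "Not Palindrome!\n";
print is_palindrome('ghghgha') ? "Palindrome!\n" : "Not Palindrome!\n";
```

In a real program the argument would come from a single chomp(my $word = <>) rather than from a literal.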
Perl is famous for its TIMTOWTDI. Here are two more ways of doing it:
print "Enter 7 characters: ";
chomp(my $i= <STDIN>);
say "reverse: ", pal_reverse($i) ? "yes" : "no";
say "regex: ", pal_regex($i) ? "yes" : "no";
sub pal_reverse {
my $i = (@_ ? shift : $_);
return $i eq reverse $i;
}
sub pal_regex {
return (@_ ? shift() : $_) =~ /^(.?|(.)(?1)\2)$/ + 0;
}
use strict;
use warnings;
use feature 'say';
print "Please enter 7 characters : ";
my $input = <>; # Read in input
chomp $input; # To remove trailing "\n"
# Season with input validation
warn 'Expected 7 characters, got ', length $input, ' instead'
unless length $input == 7;
# Determine if it's palindromic or not
say $input eq reverse($input) # Parentheses stop reverse from swallowing the ?: branches
? 'Palindrome'
: 'Not palindrome' ;
TIMTOWTDI for the recursion-prone:
sub is_palindrome {
my ($str) = @_; # Work on a copy so the caller's argument is not modified
return 1 if length $str < 2; # Whole string is palindromic
return unless substr($str, 0, 1, '') eq substr($str, -1, 1, ''); # Outer chars differ
@_ = ($str); # Pass the shortened string on
goto &is_palindrome; # Tail call: check next chars
}
say is_palindrome( 'ghghghg' ) ? 'Palindromic' : 'Not palindromic' ;
And perldoc perlretut for those who aren't :)
Recursive patterns
This feature (introduced in Perl 5.10) significantly extends the power
of Perl's pattern matching. By referring to some other capture group
anywhere in the pattern with the construct (?group-ref), the pattern
within the referenced group is used as an independent subpattern in
place of the group reference itself. Because the group reference may
be contained within the group it refers to, it is now possible to
apply pattern matching to tasks that hitherto required a recursive
parser.
To illustrate this feature, we'll design a pattern that matches if a
string contains a palindrome. (This is a word or a sentence that,
while ignoring spaces, interpunctuation and case, reads the same
backwards as forwards.) We begin by observing that the empty string or
a string containing just one word character is a palindrome. Otherwise
it must have a word character up front and the same at its end, with
another palindrome in between.
/(?: (\w) (?...Here be a palindrome...) \g{-1} | \w? )/x
Adding \W* at either end to eliminate what is to be ignored, we
already have the full pattern:
my $pp = qr/^(\W* (?: (\w) (?1) \g{-1} | \w? ) \W*)$/ix;
for $s ( "saippuakauppias", "A man, a plan, a canal: Panama!" ){
print "'$s' is a palindrome\n" if $s =~ /$pp/;
}