I am trying to convert a line a, f_1(b, c, f_2(d, e)) to a line with lisp style function call
a (f_1 b c (f_2 d e)) using Text::Balanced subroutines:
A function call is in the form f(arglist), arglist can have one or many function calls within it with heirarchical calls too;
The way i tried -
my $text = q|a, f_1(a, b, f_2(c, d))|;
my ($match, $remainder) = extract_bracketed($text); # defaults to '()'
# $match is not containing the text i want which is : a, b, f_2(c,d) because "(" is preceded by a string;
my ($d_match, $d_remainder) = extract_delimited($remainder,",");
# $d_match doesnt contain the the first string
# planning to use remainder texts from the bracketed and delimited operations in a loop to translate.
Tried even the sub extract_tagged with start tag as /^[\w_0-9]+\(/ and end tag as /\)/, but doesn't work there too.
Parse::RecDescent is difficult to understand and put to use in a short time.
All that seems to be necessary to transform to the LISP style is to remove the commas and move each opening parenthesis to before the function names that precedes it.
This program works by tokenizing the string into identifiers /\w+/ or parentheses /[()]/ and storing the list in array #tokens. This array is then scanned, and wherever an identifier is followed by an opening parenthesis the two are switched over.
use strict;
use warnings;
my $str = 'a, f_1(b, c, f_2(d, e))';
my #tokens = $str =~ /\w+|[()]/g;
for my $i (0 .. $#tokens-1) {
#tokens[$i,$i+1] = #tokens[$i+1,$i] if "#tokens[$i,$i+1]" =~ /\w \(/;
}
print "#tokens\n";
output
a ( f_1 b c ( f_2 d e ) )
Related
I've encountered a piece of code in a book that looks like this:
#for (some_condition) {
#do something not particularly related to the question
$var = $anotherVar+1, next if #some other condition with $var
#}
I've got no clue what does the comma (",") between $anotherVar+1 and before next do. How is this syntax construction called and is it even correct?
The comma operator is described in perlop. You can use it to separate commands, it evaluates its left operand first, then it evaluates the second operand. In this case, the second operand is next which changes the flow of the program.
Basically, this is a shorter way of writing
if ($var eq "...") {
$var = $anotherVar + 1;
next
}
The comma can be used in a similar way in C, where you can find it often in for loops:
for (i = 0, j = 10; i < 10; i++, j--)
The comma is an operator, in any context. In list context where it's usually seen, it is one way to concatenate values into a list. Here it is being used in scalar context, where it runs the preceding expression, the following expression, and then returns the following expression. This is a holdover from how it works in C and similar languages, when it's not an argument separator; see https://learn.microsoft.com/en-us/cpp/cpp/comma-operator?view=vs-2019.
my #stuff = ('a', 'b'); # comma in list context, forms a list and assigns it
my $stuff = ('a', 'b'); # comma in scalar context, assigns "b"
my $stuff = 'a', 'b'; # assignment has higher precedence
# assignment done first then comma operator evaluated in greater context
Consider the following code:
$x=1;
$y=1;
$x++ , $y++ if 0; # note the comma! both x and y are one statement
print "With comma: $x $y\n";
$x=1;
$y=1;
$x++ ; $y++ if 0; # note the semicolon! so two separate statements
print "With semicolon: $x $y\n";
The output is as follows:
With comma: 1 1
With semicolon: 2 1
A comma is similar to a semicolon, except that both sides of the command are treated as a single statement. This means that in a situation where only one statement is expected, both sides of the comma are evaluated.
So I have this bit of code that does not work:
print $userInput."\n" x $userInput2; #$userInput = string & $userInput2 is a integer
It prints it out once fine if the number is over 0 of course, but it doesn't print out the rest if the number is greater than 1. I come from a java background and I assume that it does the concatenation first, then the result will be what will multiply itself with the x operator. But of course that does not happen. Now it works when I do the following:
$userInput .= "\n";
print $userInput x $userInput2;
I am new to Perl so I'd like to understand exactly what goes on with chaining, and if I can even do so.
You're asking about operator precedence. ("Chaining" usually refers to chaining of method calls, e.g. $obj->foo->bar->baz.)
The Perl documentation page perlop starts off with a list of all the operators in order of precedence level. x has the same precedence as other multiplication operators, and . has the same precedence as other addition operators, so of course x is evaluated first. (i.e., it "has higher precedence" or "binds more tightly".)
As in Java you can resolve this with parentheses:
print(($userInput . "\n") x $userInput2);
Note that you need two pairs of parentheses here. If you'd only used the inner parentheses, Perl would treat them as indicating the arguments to print, like this:
# THIS DOESN'T WORK
print($userInput . "\n") x $userInput2;
This would print the string once, then duplicate print's return value some number of times. Putting space before the ( doesn't help since whitespace is generally optional and ignored. In a way, this is another form of operator precedence: function calls bind more tightly than anything else.
If you really hate having more parentheses than strictly necessary, you can defeat Perl with the unary + operator:
print +($userInput . "\n") x $userInput2;
This separates the print from the (, so Perl knows the rest of the line is a single expression. Unary + has no effect whatsoever; its primary use is exactly this sort of situation.
This is due to precedence of . (concatenation) operator being less than the x operator. So it ends up with:
use strict;
use warnings;
my $userInput = "line";
my $userInput2 = 2;
print $userInput.("\n" x $userInput2);
And outputs:
line[newline]
[newline]
This is what you want:
print (($userInput."\n") x $userInput2);
This prints out:
line
line
As has already been mentioned, this is a precedence issue, in that the repetition operator x has higher precedence than the concatenation operator .. However, that is not all that's going on here, and also, the issue itself comes from a bad solution.
First off, when you say
print (($foo . "\n") x $count);
What you are doing is changing the context of the repetition operator to list context.
(LIST) x $count
The above statement really means this (if $count == 3):
print ( $foo . "\n", $foo . "\n", $foo . "\n" ); # list with 3 elements
From perldoc perlop:
Binary "x" is the repetition operator. In scalar context or if the left operand is not enclosed in parentheses, it returns a string consisting of the left operand repeated the number of times specified by the right operand. In list context, if the left operand is enclosed in parentheses or is a list formed by qw/STRING/, it repeats the list. If the right operand is zero or negative, it returns an empty string or an empty list, depending on the context.
The solution works as intended because print takes list arguments. However, if you had something else that takes scalar arguments, such as a subroutine:
foo(("text" . "\n") x 3);
sub foo {
# #_ is now the list ("text\n", "text\n", "text\n");
my ($string) = #_; # error enters here
# $string is now "text\n"
}
This is a subtle difference which might not always give the desired result.
A better solution for this particular case is to not use the concatenation operator at all, because it is redundant:
print "$foo\n" x $count;
Or even use more mundane methods:
for (0 .. $count) {
print "$foo\n";
}
Or
use feature 'say'
...
say $foo for 0 .. $count;
I am trying to replace a string with another string, but the greedy nature doesn't seem to be working for me. Below is my code where "PERFORM GET-APLCY" is identified and replaced properly, but string "PERFORM GET-APLCY-SOI-CVG-WVR" and many other such strings are being replaced by the the replacement string for "PERFORM GET-APLCY".
s/PERFORM $func[$i]\.*/# PERFORM $func[$i]\.\n $hash{$func[$i]}/g;
where the full stop is optional during string match and replacement. I have also tried giving the pattern to be matched as $func[$i]\b
Please help me understand what the issue could be.
Thanks in advance,
Faez
Why GET-APLCY- should not match GET-APLCY., if the dot is optional?
Easy solution: sort your array by length in descending order.
#func = sort { length $b <=> length $a } #func
Testing script:
#!/usr/bin/perl
use warnings;
use strict;
use feature 'say';
my %hash = ('GET-APLCY' => 'REP1',
'GET-APLCY-SOI-CVG-WVR' => 'REP2',
'GET-APLCY-SOI-MNG-CVRW' => 'REP3',
);
my #func = sort { length $b <=> length $a } keys %hash;
while (<DATA>) {
chomp;
print;
print "\t -> \t";
for my $i (0 .. $#func) {
s/$func[$i]/$hash{$func[$i]}/;
}
say;
}
__DATA__
GET-APLCY param
GET-APLCY- param
GET-APLCY. param
GET-APLCY-SOI. param
GET-APLCY-SOI-CVG-WVR param
GET-APLCY-SOI-MNG-CVRW param
You appear to be looping over function names, and calling s/// for each one. An alternative is to use the e option, and do them all in one go (without a loop):
my %hash = (
'GET-APLCY' => 'replacement 1',
'GET-APLCY-SOI-CVG-WVR' => 'replacement 2',
);
s{
PERFORM \s+ # 'PERFORM' keyword
([A-Z-]+) # the original function name
\.? # an optional period
}{
"# PERFORM $1.\n" . $hash{$1};
}xmsge;
The e causes the replacement part to be evaluated as an expression. Basically, the first part finds all PERFORM calls (I'm assuming that the function names are all upper case with '-' between them – adjust otherwise). The second part replaces that line with the text you want to appear.
I've also used the x, m, and s options, which is what allows the comments in the regular expression, among other things. You can find more about these under perldoc perlop.
A plain version of the s-line should be:
s/PERFORM ([A-Z-]+)\.?/"# PERFORM $1.\n" . $hash{$1}/eg;
I guess that $func[$i] contains "GET-APLCY". If so, this is because the star only applies to the dot, an actual dot, not "any character". Try
s/PERFORM $func[$i].*/# PERFORM $func[$i]\.\n $hash{$func[$i]}/g;
I'm pretty sure you trying to do some kind of loop for $i. And in that case most likely
GET-APLCY is located in #func array before GET-APLCY-SOI-CVG-WVR. So I recommend to reverse sort #func before entering loop.
I need to apply a regexp filtration to affect only pieces of text within quotes and I'm baffled.
$in = 'ab c "d e f" g h "i j" k l';
#...?
$inquotes =~ s/\s+/_/g; #arbitrary regexp working only on the pieces inside quote marks
#...?
$out = 'ab c "d_e_f" g h "i_j" k l';
(the final effect can strip/remove the quotes if that makes it easier, 'ab c d_e_f g...)
You could figure out some cute trick that looks like line noise.
Or you could keep it simple and readable, and just use split and join. Using the quote mark as a field separator, operate on every other field:
my #pieces = split /\"/, $in, -1;
foreach my $i (0 ... $#pieces) {
next unless $i % 2;
$pieces[$i] =~ s/\s+/_/g;
}
my $out = join '"', #pieces;
If you want you use just a regex, the following should work:
my $in = q(ab c "d e f" g h "i j" k l);
$in =~ s{"(.+?)"}{$1 =~ s/\s+/_/gr}eg;
print "$in\n";
(You said the "s may be dropped :) )
HTH,
Paul
Something like
s/\"([\a\w]*)\"/
should match the quoted chunks. My perl regex syntax is a little rusty, but shouldn't just placing quote literals around what you're capturing do the job? You've then got your quoted string d e f inside the first capture group, so you can do whatever you want to it... What kind of 'arbitrary operation' are you trying to do to the quoted strings?
Hmm.
You might be better off matching the quoted strings, then passing them to another regex, rather than doing it all in one.
I'm trying to use an array and a loop to print out the following (basically for each letter of the alphabet, print each letter of the alphabet after it and then move on to the next letter). I'm new to perl, anyone have any quick words of :
aa
ab
ac
ad
...
ba
bb
bc
bd
...
ca
cb
...
Currently I have this, but it only prints a single character alphabet...
#arr = ("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z");
$i = #arr;
while ($i)
{
print $arr[$i];
$i--;
}
Using the range operator and the ranges you want to target:
use strict;
use warnings;
my #elements = ("aa" .. "zz");
for my $combo (#elements)
{
print "$combo\n";
}
You can utilize the initial 2 letters till the ending 2 letters you want as ending and the for will take care of everything.
This really isn't multi-dimensional array work, if it were you'd be working with stuff like:
my #foo = (
[1,2,3],
[4,7,8,1,2,3],
[2,3],
);
This is really a very basic how do I make a nested loop that iterates over the same array. I'll bet this is homework.
So, I'll let you figure out the nesting bits, but give some help with Perl's loop operators.
!! for/foreach
for (the each is optional) is the real heavy hitter for looping in perl. Use it like so:
for my $var ( #array ) {
#do stuff with $var
}
Each element in #array will be aliased to the $var variable, and the block of code will be executed. The fact that we are aliasing, rather than copying means that if alter the value of $var, #array will be changed as well. The stuff between the parenthesis may be any expression. The expression will be evaluated in list context. So if you put a file handle in the parens, the entire file will be read into memory and processed.
You can also leave off naming the loop variable, and $_ will be used instead. In general, DO NOT DO THIS.
!! C-Style for
Every once in a while you need to keep track of indexes as you loop over an array. This is when a C style for loop comes in handy.
for( my $i=0; $i<#array; $i++ ) {
# do stuff with $array[$i]
}
!! While/Until
While and until operate with boolean loop conditions. That means that the loop will repeat as long as the appropriate boolean value if found for the condition ( TRUE for while, and FALSE for until). In addition to the obvious cases where you are looking for a particular condition, while is great for processing a file one line at a time.
while ( my $line = <$fh> ) {
# Do stuff with $line.
}
!! map
map is an amazingly useful bit of functional programming kung-fu. It is used to turn one list into another. You pass an anonymous code reference that is used to enact the transformation.
# Multiply all elements of #old by two and store them in #new.
my #new = map { $_ * 2 } #old;
So how do you solve your particular problem? There are many ways. Which is best depends on how you want to use the results. If you want to create a new array of the letter pairs, use map. If you are interested primarily in a side effect (say printing a variable) use for. If you need to work with really big lists that come from sort of interator (like lines from a filehandle) use while.
Here's a solution. I wouldn't turn it in to your professor until you understand how it works.
print map { my $letter=$_; map "$letter$_\n", "a".."z" } "a".."z";
Look at perldoc articles, perlsyn for info on the looping constructs, perlfunc for info on map and look at perlop for info on the range operator (..).
Good luck.
Use the range operator (..) for your initialization. The range operator basically grabs a range of values such as numbers or characters.
Then use a nested loop to go through the array one time per character for a total of 26^2 iterations.
Rather than a while loop I've used a foreach loop to go through each item in the array. You could also put 'a' .. 'z' instead of declared #arr as the argument to the foreach loop. The foreach loops below set $char or $char2 to each value in #arr in turn.
my #arr = ('a' .. 'z');
for my $char (#arr) {
for my $char2 (#arr) {
print "$char$char2\n";
}
}
If all you really want to do is print the 676 strings you describe, then:
#!/usr/bin/perl
use warnings;
use strict;
my $str = 'aa';
while (length $str < 3) {
print $str++, "\n";
}
But I smell an "XY problem"...