Perl6 one liner execution. How is the topic updated? - command-line

Executing the one liner to process CSV a line at a time from stdin:
perl6 -ne 'my @a; $_.split(",").kv.map: {@a[$^k]+=$^v}; say @a; ENTER {say "ENTER"}; BEGIN {say "BEGIN"}; LEAVE {say "LEAVE"}; END {say "END"}';
Typing in:
1,1
1,1
^D
Gives the following output:
BEGIN
ENTER
1,1
[1 1]
1,1
[2 2]
LEAVE
END
Here we can see that the one liner is not a block executed multiple times, as the ENTER and LEAVE phasers are only executed once.
This makes sense as the variable @a is accumulating. If the one liner was a block the value of @a would be reset each time.
My question is how does the topic variable $_ get updated? The topic variable is a Str (at least that's what $_.^name says). How does its value update without re-entering the block?
What am I missing?

When you add -n it adds a for loop around your code.
You think it adds one like this:
for lines() {
# Your code here
}
In fact, the compiler just adds the abstract syntax tree nodes for looping without actually adding a block, which is closer to this:
(
# Your code here
) for lines()
(It could potentially be construed as a bug.)
To get it to work like the first one:
( # -n adds this
-> $_ { # <-- add this
# Your code here
}( $_ ) # <-- add this
) for lines() # -n adds this
I tried just adding a bare block, but the way the compiler adds the loop causes that to not work.
In general ENTER and LEAVE are scoped to a block {}, but they are also scoped to the “file” if there isn't a block.
ENTER say 'ENTER file';
LEAVE say 'LEAVE file';
{
ENTER say ' ENTER block';
LEAVE say ' LEAVE block';
}
ENTER file
ENTER block
LEAVE block
LEAVE file
Since there is no block in your code, everything is scoped to the “file”.

The -n command line argument puts a loop around your program,
for $*ARGFILES.lines {
# Program block given on command line
}
whereas the program execution phasers you used (BEGIN and END), are run once either at compile time or after the program block has finished, so they will not be part of the loop at run time.
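The ENTER block phaser will run at every block entry time, whereas the LEAVE block phaser will run at every block exit time. So these phasers would run for each line read if the for loop had a real block as its body; the single ENTER and LEAVE in the question's output show that -n does not create one.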

Update -- Rakudo 2020.10
Running your original accumulator code (using the -ne linewise flag) gives the following result. Note how the word "final" appears in every line:
~$ perl6 -ne 'my @a; $_.split(",").kv.map: {@a[$^k]+=$^v}; say @a, " final"; ENTER {say "ENTER"}; BEGIN {say "BEGIN"}; LEAVE {say "LEAVE"}; END {say "END"};' drclaw.txt
BEGIN
ENTER
[1 1] final
[2 3] final
[3 6] final
LEAVE
END
Below, running essentially duplicate scripts back-to-back with the -ne flag gives an interesting result. BEGIN, ENTER, LEAVE, and END show up in the exact same location, duplicated on the order of once-per-call:
~$ perl6 -ne 'my @a; .split(",").kv.map: {@a[$^k]+=$^v}; say @a, " final_a"; ENTER {say "ENTER"}; BEGIN {say "BEGIN"}; LEAVE {say "LEAVE"}; END {say "END"}; my @b; .split(",").kv.map: {@b[$^k]+=$^v}; say @b, " final_b"; ENTER {say "ENTER"}; BEGIN {say "BEGIN"}; LEAVE {say "LEAVE"}; END {say "END"};' drclaw.txt
BEGIN
BEGIN
ENTER
ENTER
[1 1] final_a
[1 1] final_b
[2 3] final_a
[2 3] final_b
[3 6] final_a
[3 6] final_b
LEAVE
LEAVE
END
END
However, removing the -ne flag below lets you run a for lines() {...} loop within the Raku code itself (single script, not duplicated back-to-back). This result seems more in line with what you were expecting:
~$ perl6 -e 'my @a; for lines() {.split(",").kv.map: {@a[$^k]+=$^v};}; say @a, " final"; ENTER {say "ENTER"}; BEGIN {say "BEGIN"}; LEAVE {say "LEAVE"}; END {say "END"};' drclaw.txt
BEGIN
ENTER
[3 6] final
LEAVE
END
I think the short answer to your questions is that Phasers respect Block/Loop semantics, but are limited script-wise as to how many times they will report back to the implementer (apparently only once per call). But the ultimate difference is that the return to the user is linewise for the -ne command line flag, as compared to an internal for lines() {...} loop sans the -ne command line flag.
Finally, you can always force the reloading of the $_ topic variable with the andthen infix operator. Maybe this is what you were looking for all along:
~$ perl6 -e 'my @a; for lines() {.split(",").kv.map: {@a[$^k]+=$^v} andthen $_.say }; say @a, " final"; ENTER {say "ENTER"}; BEGIN {say "BEGIN"}; LEAVE {say "LEAVE"}; END {say "END"};' drclaw.txt
BEGIN
ENTER
(1 1)
(2 3)
(3 6)
[3 6] final
LEAVE
END
[Test file under analysis, below].
~$ cat drclaw.txt
1,1
1,2
1,3
https://docs.raku.org/language/operators#index-entry-andthen

Related

In Perl can i apply 'grep' command on the data i captured using flip-flop operator directly?

I need to find the 'number' of occurrences of particular words (C7STH, C7ST2C) that come in the output of a command. The command starts and ends with a 'fixed' text - START & END like below. This command is repeated many times against different nodes in the log file.
...
START
SLC ACL PARMG ST SDL SLI
0 A1 17 C7STH-1&&-31 MSC19-0/RTLTB2-385
1 A1 17 C7STH-65&&-95 MSC19-0/RTLTB2-1697
SLC ACL PARMG ST SDL SLI
0 A2 0 C7ST2C-4 ETRC18-0/RTLTB2-417
1 A2 0 C7ST2C-5 ETRC18-0/RTLTB2-449
2 A2 0 C7ST2C-6 ETRC18-0/RTLTB2-961
...
END
....
I am using the flip-flop operator (if (/^START$/ .. /^END$/)) to get each command's output. Now, is there a way to do a 'grep' on this data without going line by line? For example, can I get all the text between 'START' and 'END' into an array and do a 'grep' on that?
Also is it 'ok' to have multiple levels of if blocks with flip-flop operator from performance point of view?
This would be a simple solution:
my $number = grep {/particular word/} grep {/START/../END/} <>;
(Since you don't provide a code sample I've used the diamond operator and assumed the log file is passed as an argument to the script. Replace with file handle if needed.)
grep {/START/../END/} <> creates a list of elements within and including the delimiters, and grep {/particular word/} works on that list.
From a performance point of view you'd be better off with
for (<>) {
$number++ if /START/../END/ and /a/;
}
Note that you have to use and instead of && or wrap your flip-flop expression in parentheses because of operator precedence.
Combining both:
my $number = grep {/START/../END/ and /particular word/} <>;
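A minimal sketch of the combined form, wrapped in a hypothetical helper (count_in_blocks is an illustrative name, not something from the question) so it can be tried on a plain list of lines instead of <>. Anchoring the delimiters as the question does and quoting the word with \Q...\E are added assumptions:

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines that lie inside a START..END block
# and also mention $word.
sub count_in_blocks {
    my ($word, @lines) = @_;
    # 'and' binds more loosely than '..', so this parses as
    # (flip-flop) and (word match).
    # The flip-flop keeps its state between calls, so the input should
    # end with a closing END line.
    return scalar grep { /^START$/ .. /^END$/ and /\Q$word\E/ } @lines;
}

my @log = (
    "C7STH-9 outside any block\n",
    "START\n",
    "0 A1 17 C7STH-1&&-31 MSC19-0/RTLTB2-385\n",
    "1 A1 17 C7STH-65&&-95 MSC19-0/RTLTB2-1697\n",
    "0 A2 0 C7ST2C-4 ETRC18-0/RTLTB2-417\n",
    "END\n",
);
print "C7STH: ",  count_in_blocks('C7STH',  @log), "\n";
print "C7ST2C: ", count_in_blocks('C7ST2C', @log), "\n";
```

The line mentioning C7STH before START is not counted, because the flip-flop is still false there.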
Maybe you're looking for something along these lines:
#!/usr/bin/env perl
use strict;
use warnings;
my $word = q(stuff);
my @data;
while (<DATA>) {
    if ( /^START/../^END/ ) {
        chomp;
        push @data, $_ unless /^(?:START|END)/;
    }
    if ( /^END/ ) {
        my $str = "@data";
        print +(scalar grep {/$word/} (split / /,$str)),
            " occurances of '$word'\n";
        @data = ();
    }
}
__DATA__
this is a line
START of my stuff
more my stuff
and still more stuff
and lastly, yet more stuff
END of my stuff
this is another line
START again
stuff stuff stuff stuff
yet more stuff
END again
...which would output:
3 occurances of 'stuff'
5 occurances of 'stuff'
Like can i get all the text between 'START' and 'END' into an array and do 'grep' on this etc?
(push @ar,$_) if /START/ .. /END/;
grep {/word/} @ar;
Also is it 'ok' to have multiple levels of if blocks with flip-flop operator from performance point of view?
As long as you are not working for NASA.

remove elements from file using Perl

Input.txt
CASE
REPEAT 1 TIMES
ENDREPEAT
ENDCASE
REPEAT
ENDREPEAT
CASE
REPEAT 2 TIMES
ENDREPEAT
ENDCASE
code.pl
open (FH, "input.txt");
my @arr = <FH>;
foreach (@arr) {
    if ($_ =~ s/ENDCASE.*?CASE//gsi) {
        $_ = s/ENDCASE.*?CASE//gsi;
    }
}
print @arr;
Output : perl code.pl
It prints the Array without modifying........
CASE
REPEAT 1 TIMES
ENDREPEAT
ENDCASE
REPEAT ===> To be Removed
ENDREPEAT ===> To be Removed
CASE
REPEAT 2 TIMES
ENDREPEAT
ENDCASE
Output Needed is, ***||||||||||||****
CASE
REPEAT 1 TIMES
ENDREPEAT
ENDCASE
************Content Removed*****************
CASE
REPEAT 2 TIMES
ENDREPEAT
ENDCASE
Please Guide me to get this output.
Thanks in advance.........
This can be done through the command line as well à la flip-flop operator.
To just output the result to screen
$ perl -ne 'print if /^CASE/ .. /^ENDCASE/' Input.txt
To direct the output to another file
$ perl -ne 'print if /^CASE/ .. /^ENDCASE/' Input.txt > output.txt
To modify the file in-place
$ perl -ni.bak -e 'print if /^CASE/ .. /^ENDCASE/' Input.txt
Replace ' (single quotes) with " (double quotes) if on Windows.
You've got a couple of suggestions of ways to address your problem, but you might be interested to hear why your solution didn't work. There are a couple of reasons.
Firstly, when you read your file into @arr you get one line of the file in each element of the array. And when you process the array an element at a time, no element contains both ENDCASE and CASE, so your regex never matches and nothing is changed.
For your approach to work, you need to rewrite the program to process the whole file in one go. (I've also cleaned up your code a little.)
#!/usr/bin/perl
use strict;
use warnings;
open (my $fh, '<', 'input.txt') or die $!;
my $file = do { local $/; <$fh> };
$file =~ s/ENDCASE.*?CASE//gsi;
print $file;
But this doesn't fix the problem. It gives the output:
CASE
REPEAT 1 TIMES
ENDREPEAT
REPEAT 2 TIMES
ENDREPEAT
ENDCASE
That's because the ENDCASE and CASE are included in your regex so they get removed. You'll need to look at lookahead and lookbehind assertions in perlre to fix this issue. I'll leave that as an exercise for the reader.
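One possible way to finish that exercise, offered as a sketch and not necessarily what the answerer had in mind: anchor both delimiters to line starts, keep the ENDCASE line with \K (Perl 5.10 or later), and use a lookahead so the following CASE line is not consumed. The helper name strip_between_cases is made up for illustration, and the pattern assumes each delimiter sits alone on its own line:

```perl
use strict;
use warnings;

# Hypothetical helper: delete whatever sits between an ENDCASE line and
# the next CASE line, keeping both delimiter lines.
sub strip_between_cases {
    my ($text) = @_;
    # /m lets ^ match at every line start, /s lets . match newlines.
    # \K drops "ENDCASE\n" from the matched text so it survives;
    # (?=^CASE\n) only peeks at the next CASE line.
    $text =~ s/^ENDCASE\n\K.*?(?=^CASE\n)//gms;
    return $text;
}

my $file = join '', map { "$_\n" }
    'CASE', 'REPEAT 1 TIMES', 'ENDREPEAT', 'ENDCASE',
    'REPEAT', 'ENDREPEAT',
    'CASE', 'REPEAT 2 TIMES', 'ENDREPEAT', 'ENDCASE';
print strip_between_cases($file);
```

Because ^ must match at the start of a line, the CASE inside ENDCASE can never satisfy the lookahead.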
Tie your file using Tie::File:
tie @array, 'Tie::File', filename or die ...;
Manipulate the lines, in any way you see fit, and then untie the array:
untie @array;
Thus, your modifications will be reflected in the original file.
Here's a weird idea that just might work.
use English qw<$INPLACE_EDIT $RS>;
$INPLACE_EDIT = '.bak';
local $RS = "CASE\n";
while ( <$input> ) {
print(( !/^(END)?CASE\n\z/ms or $1 ) ? $_ : $RS );
}
The idea is that you break up your records not by newlines, but by CASE + \n and thus you get to treat all the lines between an ENDCASE and a CASE as one record that you can simply replace with "CASE\n".
Note that we simply print the record unless we see a line start before 'ENDCASE' or 'CASE' followed by a newline. So even though we make a pretty brittle assumption when breaking up the records, we check our assumption before modifying the record. Also if it matches "ENDCASE\n" then $1 is 'END' and we print that record unmodified.
This can break, though. If for some reason you were capable of having a comment here:
ENDCASE
REPEAT ===> This prints because it ends with CASE
ENDREPEAT
CASE
Then the first line would be printed. So we could do this:
my $match = 0;
my $old_1;
while ( <$input> ) {
if ( m/^(END)?CASE\n\z/ms and not $1 ) {
print $RS;
}
else {
next if $old_1;
print;
}
$old_1 = $1;
}

Perl: extract rows from 1 to n (Windows)

I want to extract rows 1 to n from my .csv file. Using this
perl -ne 'if ($. == 3) {print;exit}' infile.txt
I can extract only one row. How to put a range of rows into this script?
If you have only a single range and a single, possibly concatenated input stream, you can use:
#!/usr/bin/perl -n
if (my $seqno = 1 .. 3) {
print;
exit if $seqno =~ /E/;
}
But if you want it to apply to each input file, you need to catch the end of each file:
#!/usr/bin/perl -n
print if my $seqno = 1 .. 3;
close ARGV if eof || $seqno =~ /E/;
And if you want to be kind to people who forget args, add a nice warning in a BEGIN or INIT clause:
#!/usr/bin/perl -n
BEGIN { warn "$0: reading from stdin\n" if @ARGV == 0 && -t }
print if my $seqno = 1 .. 3;
close ARGV if eof || $seqno =~ /E/;
Notable points include:
You can use -n or -p on the #! line. You could also put some (but not all) other command line switches there, like ‑l or ‑a.
Numeric literals as operands to the scalar flip‐flop operator are each compared against readline counter, so a scalar 1 .. 3 is really ($. == 1) .. ($. == 3).
Calling eof with neither an argument nor empty parens means the last file read in the magic ARGV list of files. This contrasts with eof(), which is the end of the entire <ARGV> iteration.
A flip‐flop operator’s final sequence number is returned with an "E0" appended to it.
The -t operator, which calls libc’s isatty(3), defaults to the STDIN handle, unlike any of the other filetest operators.
A BEGIN{} block happens during compilation, so if you try to decompile this script with ‑MO=Deparse to see what it really does, that check will execute. With an INIT{}, it will not.
Doing just that will reveal that the implicit input loop has a label called LINE that you perhaps might in other circumstances use to your advantage.
HTH
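The sequence numbers mentioned in the points above can be observed directly. This is only a sketch: seqnos is a hypothetical helper, and it reads from an in-memory filehandle so that $. counts lines without needing a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: return what the scalar flip-flop 2 .. 3 yields
# for each line of $text.
sub seqnos {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my @seen;
    while (<$fh>) {
        # Constant operands are compared against $., so this behaves
        # like ($. == 2) .. ($. == 3).
        my $seqno = 2 .. 3;
        push @seen, $seqno;
    }
    close $fh;
    return @seen;
}

# Line 1 gives the empty string, line 2 gives 1, line 3 gives "2E0",
# and line 4 gives the empty string again.
print join('|', map { "<$_>" } seqnos("a\nb\nc\nd\n")), "\n";
```

The trailing "E0" on the last value is what the exit if $seqno =~ /E/ test in the scripts above looks for.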
What's wrong with:
head -3 infile.txt
If you really must use Perl then this works:
perl -ne 'if ($. <= 3) {print} else {exit}' infile.txt
You can use the range operator:
perl -ne 'if (1 .. 3) { print } else { last }' infile.txt

How does this obfuscated Perl code work?

How does this code work at all?
#!/usr/bin/perl
$i=4;$|=@f=map{("!"x$i++)."K$_^\x{0e}"}
"BQI!\\","BQI\\","BQI","BQ","B","";push
@f,reverse@f[1..5];@f=map{join"",undef,
map{chr(ord()-1)}split""}@f;{;$f=shift@
f;print$f;push@f,$f;select undef,undef,
undef,.25;redo;last;exit;print or die;}
Let's first put this through perltidy:
$i = 5;
$| = @f = map { ("!" x $i++) . "9$_*\x{0e}" } ">>>E!)", ">>>E)", ">>>E", ">>>", ">>", ">", "";
push @f, reverse @f[ 1..5 ];
@f = map {
    join "",
    map { chr(ord() - 1) }
    split //
} @f;
{
    $f = shift @f;
    print $f;
    push @f, $f;
    select undef, undef, undef, .25;
    redo;
    last;
    exit;
    print or die;
}
The first line is obvious.
The second line makes a list ">>>E!)", ">>>E)", ">>>E", ">>>", ">>", ">", "", and spaces them all to be equally long and appends an asterisk and a 'Shift Out' (the character after a carriage return).
The third line appends items 5 to 1 (in that order) to that list, so it will be ">>>E!)", ">>>E)", ">>>E", ">>>", ">>", ">", "", ">", ">>", ">>>", ">>>E", ">>>E)".
The map decrements all the characters by one, thus creating elements like 8===D ().
The second loop simply prints the elements in the list in a loop every 0.25 seconds. The carriage return causes them to overwrite each other, so that an animation is seen. The last couple of lines are never reached and thus bogus.
Data from the file is loaded into a program called a Perl interpreter. The interpreter parses the code and converts it to a series of "opcodes" -- a bytecode language that is sort of halfway between Perl code and the machine language that the code is running on. If there were no errors in the conversion process (called "compiling"), then the code is executed by another part of the Perl interpreter. During execution, the program may change various states of the machine, such as allocating, deallocating, reading, and writing to memory, or using the input/output and other features of the system.
(CW - More hardcore hackers than I are welcome to correct any errors or misconceptions and to add more information)
There's no magic going on here, just obfuscation. Let's take a high-level view. The first thing to notice is that later on, every character in strings is interpreted as if it were the previous character:
[1] map{chr(ord()-1)} ...
Thus, a string like "6qD" will result in "5pC" (the characters before '6', 'q', and 'D', respectively). The main point of interest is the array of strings near the beginning:
[2] ">>>E!)",">>>E)",">>>E",">>>",">>",">",""
This defines a sequence of "masks" that we will substitute later on, into this string:
[3] "9$_*\x{0e}"
They'll get inserted at the $_ point. The string \x{0e} represents a hex control character; notice that \x{0d}, the character just before it, is a carriage return. That's what'll get substituted into [3] when we do [1].
Before the [3] string is assembled, we prepend a number of ! equal to $i to each element in [2]. Each successive element gets one more ! than the element before it. Notice that the character whose value is just before ! is a space.
The rest of the script iterates over each of the assembled array elements, which now look more like this:
[4] "!!!!!9>>>E!)\x{0e}", ---> "     8===D ("
    "!!!!!!9>>>E)\x{0e}", ---> "      8===D("
    "!!!!!!!9>>>E\x{0e}", ---> "       8===D"
    "!!!!!!!!9>>>\x{0e}", ---> "        8==="
    "!!!!!!!!!9>>\x{0e}", ---> "         8=="
    "!!!!!!!!!!9>\x{0e}", ---> "          8="
    "!!!!!!!!!!!9\x{0e}", ---> "           8"
Then the reverse operation appends the same elements in reverse, creating a loop.
At this point you should be able to see the pattern emerge that produces the animation. Now it's just a matter of moving through each step in the animation and back again, which is accomplished by the rest of the script. The timestep delay of each step is governed by the select statement:
[5] select undef, undef, undef, 0.25
which tells us to wait 250 milliseconds between each iteration. You can change this if you want to see it speed up or slow down.
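The decoding step in [1] can be tried in isolation. This is just a sketch; shift_down is a made-up name for the map expression from the script, applied to one string at a time:

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character with the one whose
# code point is one lower, as the map in [1] does.
sub shift_down {
    my ($encoded) = @_;
    return join '', map { chr(ord($_) - 1) } split //, $encoded;
}

print shift_down("6qD"), "\n";    # prints 5pC

# One assembled frame: "!" turns into a space, "*" into ")" and
# \x{0e} into a carriage return (\x{0d}).
my $frame = shift_down("!!!!!9>>>E!)*\x{0e}");
print "frame ends in a carriage return\n" if $frame =~ /\r\z/;
```

Printing frames like this one after another is what makes each one overwrite the previous one on the same terminal line.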

In Perl, how do I process input as soon as it arrives, instead of waiting for newline?

I'd like to run a subcommand from Perl (or pipe it into a Perl script) and have the script process the command's output immediately, rather than waiting for a timeout, a newline, or a certain number of blocks. For example, let's say I want to surround each chunk of input with square brackets. When I run the script like this:
$ ( echo -n foo ; sleep 5 ; echo -n bar ; sleep 5; echo baz) | my_script.pl
I'd like the output to be this, with each line appearing five seconds after the previous one:
[foo]
[bar]
[baz]
How do I do that?
This works, but is really ugly:
#! /usr/bin/perl -w
use strict;
use Fcntl;
my $flags = '';
fcntl(STDIN, F_GETFL, $flags);
$flags |= O_NONBLOCK;
fcntl(STDIN, F_SETFL, $flags);
my $rin = '';
vec($rin,fileno(STDIN),1) = 1;
my $rout;
while (1) {
select($rout=$rin, undef, undef, undef);
last if eof();
my $buffer = '';
while (my $c = getc()) {
$buffer .= $c;
}
print "[$buffer]\n";
}
Is there a more elegant way to do it?
From perlfaq5: How can I read a single character from a file? From the keyboard?. You probably also want to read How can I tell whether there's a character waiting on a filehandle?. Poll the filehandle. If there is a character there, read it and reset a timer. If there is no character there, try again. If you've retried and passed a certain time, process the input.
After you read the characters, it's up to you to decide what to do with them. With all the flexibility of reading single characters comes the extra work of handling them.
Term::ReadKey can do this for you. In particular setting the ReadKey() mode to do the polling for you.
use Term::ReadKey;
$| = 1;
while( my $key = ReadKey(10) ) {
print $key;
}
If there's time in between each character, you might be able to detect the pauses.
Perl also does line input - if you don't use getc you should be able to add newlines to the end of foo, bar, etc and perl will give you each line.
If you can't add newlines, and you can't depend on a pause, then what exactly do you expect the system to do to tell perl that it's started a new command? As far as perl is concerned, there's a stdin pipe, it's eating data from it, and there's nothing in the stdin pipe to tell you when you are executing a new command.
You might consider the following instead:
$ echo "( echo -n foo ; sleep 5 ; echo -n bar ; sleep 5; echo baz)" | my_script.pl
or
$ my_script.pl "echo -n foo ; sleep 5 ; echo -n bar ; sleep 5; echo baz"
And modify your perl program to parse the input "command line" and execute each task, eating the stdout as needed.
-Adam
See How to change Open2 input buffering. (Basically, you have to make the other program think it's talking to a tty.)
You didn't mention how you are reading input in your Perl script, but you might want to look at the getc function:
$|++; # set autoflush on output
while (defined(my $c = getc(STDIN))) { # defined() so that a "0" character does not end the loop
    print $c;
}
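Another option, not taken from any of the answers above, is an unbuffered read with sysread, which on a pipe returns as soon as some data is available and does not wait for a newline or a full buffer. A minimal sketch, assuming a hypothetical helper named read_chunks and an arbitrary 4096-byte limit per read:

```perl
use strict;
use warnings;

# Hypothetical helper: call $on_chunk with each piece of data as soon
# as the operating system hands it over, until end of file.
sub read_chunks {
    my ($fh, $on_chunk) = @_;
    while (1) {
        # sysread bypasses Perl's buffering and returns the number of
        # bytes read, 0 at end of file, or undef on error.
        my $n = sysread($fh, my $buf, 4096);
        die "sysread failed: $!" unless defined $n;
        last if $n == 0;
        $on_chunk->($buf);
    }
    return;
}

# Small demonstration on a pipe created by the script itself.
pipe(my $reader, my $writer) or die "pipe failed: $!";
syswrite($writer, "foo");
close $writer;
read_chunks($reader, sub { print "[$_[0]]\n" });
```

In the script from the question this would become read_chunks(\*STDIN, sub { ... }) with $| = 1 set so the output is flushed immediately. A chunk can still contain a trailing newline (the final "baz\n" in the example pipeline), so the callback may want to chomp it. sysread should not be mixed with buffered reads such as <STDIN> or getc on the same handle.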