Get value of autosplit delimiter? - perl

If I run a script with perl -Fsomething, is that something value saved anywhere in the Perl environment where the script can find it? I'd like to write a script that by default reuses the input delimiter (if it's a string and not a regular expression) as the output delimiter.

Looking at the source, I don't think the delimiter is saved anywhere. When you run
perl -F, -an
the lexer actually generates the code
LINE: while (<>) {our @F=split(q\0,\0);
and parses it. At this point, any information about the delimiter is lost.
Your best option is to split by hand:
perl -ne'BEGIN { $F="," } @F=split(/$F/); print join($F, @F)' foo.csv
or to pass the delimiter as an argument to your script:
F=,; perl -F$F -sane'print join($F, @F)' -- -F=$F foo.csv
or to pass the delimiter as an environment variable:
export F=,; perl -F$F -ane'print join($ENV{F}, @F)' foo.csv
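If the delimiter is meant to be a literal string rather than a pattern, the split-by-hand option can also be written as a small helper. This is only a sketch: reverse_fields is a hypothetical name, the delimiter is assumed to be supplied by the caller (for example from a command-line argument or %ENV as above), and reversing the fields is just a stand-in for whatever processing you need.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on a literal delimiter, reverse the
# fields, and join them back together with that same delimiter.
sub reverse_fields {
    my ($delim, $line) = @_;
    chomp $line;
    # \Q...\E quotes regex metacharacters, so "|" or "." match literally.
    # The -1 limit keeps trailing empty fields.
    my @fields = split /\Q$delim\E/, $line, -1;
    return join($delim, reverse @fields) . "\n";
}

print reverse_fields('|', "a|b|c\n");   # prints "c|b|a"
```

Quoting the delimiter with \Q...\E is what makes it safe to reuse the same value for both split and join.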

As @ThisSuitIsBlackNot says, it looks like the delimiter is not saved anywhere.
This is how perl.c stores the -F parameter:
case 'F':
PL_minus_a = TRUE;
PL_minus_F = TRUE;
PL_minus_n = TRUE;
PL_splitstr = ++s;
while (*s && !isSPACE(*s)) ++s;
PL_splitstr = savepvn(PL_splitstr, s - PL_splitstr);
return s;
And then the lexer generates the code
LINE: while (<>) {our @F=split(q\0,\0);
However, this is of course compiled, and if you run it with B::Deparse you can see what is stored.
$ perl -MO=Deparse -F/e/ -e ''
LINE: while (defined($_ = <ARGV>)) {
our(@F) = split(/e/, $_, 0);
}
-e syntax OK
This being Perl, there is always a way, however ugly (and this is some of the ugliest code I have written in a while):
use B::Deparse;
use Capture::Tiny qw/capture_stdout/;
# The script body runs inside the implicit -n loop, so keep the result in a
# package variable that survives from one input line to the next.
our $f_var;
unless ($f_var) {
    # B::Deparse prints its result, so capture stdout instead of setting up
    # compile to return the text.
    my $stdout = capture_stdout {
        my $sub = B::Deparse::compile();
        &{$sub};
    };
    # The second line of the deparsed program is the generated split.
    my (undef, $split_line, undef) = split(/\n/, $stdout, 3);
    ($f_var) = $split_line =~ /our\(\@F\) = split\((.*)\, \$\_\, 0\);/;
    print $f_var,"\n";
}
Output:
$ perl -Fe/\\\(\\[\\\<\\{\"e testy.pl
m#e/\(\[\<\{"e#
You could possibly traverse the bytecode instead, since the start will probably be identical every time until you reach the pattern.

Related

Perl CLI code cannot do a string line appended

I'm trying to use a perl -npe one-liner to surround each line with =.
$ for i in {1..4}; { echo $i ;} |perl -npe '...'
=1=
=2=
=3=
=4=
The following is my first attempt. Note that the line feeds are in the incorrect position.
$ for i in {1..4}; { echo $i ;} |perl -npe '$_= "=".$_."=" '
=1
==2
==3
==4
=
I tried using chop to remove the line feeds and then re-add them in the correct position, but it didn't work.
$ for i in {1..4} ;{ echo $i ;} |perl -npe '$_= "=".chop($_)."=\n" '
=
=
=
=
=
=
=
=
Please help me solve this. Thanks very much.
chop returns the removed character, not the remaining string. It modifies the variable in place. So the following is the correct usage:
perl -npe'chop( $_ ); $_ = "=$_=\n"'
But we can improve this.
It's safer to use chomp instead of chop to remove trailing line feeds.
-n is implied by -p, and it's customary to leave it out when -p is used.
chomp and chop modify $_ by default, so we don't need to explicitly pass $_.
perl -pe'chomp; $_ = "=$_=\n"'
Finally, we can get the same exact behaviour out of -l.
perl -ple'$_ = "=$_="'
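To make the difference concrete, here is a small sketch of what chop returns versus what it leaves behind, next to the chomp-based version. The helper names surround and chop_result are hypothetical, used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a line in "=" using chomp, as in the one-liner.
sub surround {
    my ($line) = @_;
    chomp $line;                  # removes a trailing newline, if there is one
    return "=$line=\n";
}

# Hypothetical helper: return both what chop removed and what is left.
sub chop_result {
    my ($line) = @_;
    my $removed = chop $line;     # $line itself is modified in place
    return ($removed, $line);
}

print surround("1\n");            # prints "=1="
```

In the question's second attempt, chop($_) evaluated to the removed "\n", so each line became "=" followed by "\n" followed by "=\n", which is why only = signs were printed.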

Search and replace in Perl for particular word

I have a huge file which consists of lines similar to the ones below, with different clocks:
cmd -quiet [get_ports p1] ref_clocks "cudtclk_sp cudtclk"
cmd -quiet [get_ports p2] clock "cu2xdtclk_sp cu2xdtclk"
And I need to replace cudtclk with some other name like cdtclk whenever I have ref_clocks in my file, globally.
I have written the following code, but it doesn't seem to be working.
#!/usr/bin/perl
use strict;
use warnings;
sub clock_change
{       # Get the subroutine's argument.
my $arg = shift;
# Hash of stuff we want to replace.
my %replace = (
"cudtclk" => "cdtclk",
);
# See if there's a replacement for the given text.
my $text = $replace{$arg};
if(defined($text)) {
return $text;
}
return $arg;
}
open PAR, "<file name>";
while(<PAR>) {
$_ =~ s/\S+\s\S+\s\S+\s\S+\sref_clocks\s+(\S+\s+\S+)/clock_change($1)/eig;
print $_;   ##print it to some file later.
}
"And I need to replace cudtclk with some other name like cdtclk"
perl -pe 's/\bcudtclk\b/cdtclk/' thefile > newfile
"whenever I have ref_clocks"
perl -pe 's/\bcudtclk\b/cdtclk/ if /\bref_clocks\b/' thefile > newfile
Alternatively:
# saves original file as file.bak
perl -i.bak -pe 's/\bcudtclk\b/cdtclk/ if /\bref_clocks\b/' file
Tighten to suit your data, as necessary.
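If you would rather keep this inside a script than a one-liner, the same conditional substitution could look roughly like the sketch below. The name rename_clock is hypothetical, and the /g flag is an addition on my part so that every standalone cudtclk on a matching line is replaced.

```perl
use strict;
use warnings;

# Hypothetical helper: rename cudtclk to cdtclk, but only on lines that
# mention ref_clocks.
sub rename_clock {
    my ($line) = @_;
    # \b keeps cudtclk_sp untouched, because "_" counts as a word character.
    $line =~ s/\bcudtclk\b/cdtclk/g if $line =~ /\bref_clocks\b/;
    return $line;
}

print rename_clock(qq{cmd -quiet [get_ports p1] ref_clocks "cudtclk_sp cudtclk"\n});
```

Note that with the word boundaries in place only the standalone name changes; if cudtclk_sp should be renamed as well, the pattern needs to be loosened.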
Although the substitution seems unnecessarily complex, you can fix it with something similar to:
$_ =~ s/(ref_clocks\s+")([^_]+)_sp(\s+)\2/
$1.clock_change($2)."_sp$3".clock_change($2)/eig;

Multiple text parsing and writing using the while statement, the diamond operator <> and $ARGV variable in Perl

I have some text files inside a directory, and I want to parse their content and write it to a file. So far the code I am using is this:
#!/usr/bin/perl
#The while loop repeats the execution of a block as long as a certain condition is evaluated true
use strict; # Always!
use warnings; # Always!
my $header = 1; # Flag to tell us to print the header
while (<*.txt>) { # read a line from a file
if ($header) {
# This is the first line, print the name of the file
print "========= $ARGV ========\n";
# reset the flag to a false value
$header = undef;
}
# Print out what we just read in
print;
}
continue { # This happens before the next iteration of the loop
# Check if we finished the previous file
$header = 1 if eof;
}
When I run this script I only get the headers of the files, plus a compiled.txt entry.
I also receive the following message in cmd: Use of uninitialized value $ARGV in concatenation (.) or string at concat.pl line 12
So I guess I am doing something wrong and $ARGV isn't used at all. Also, instead of $header I should probably use something else in order to retrieve the text.
Need some assistance!
<*.txt> does not read a line from a file, even if you say so in a comment. It runs
glob '*.txt'
i.e. the while loop iterates over the file names, not over their contents. Use empty <> to iterate over all the files.
BTW, instead of $header = undef, you can use undef $header.
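A quick way to convince yourself of this is to collect what such a loop actually yields. The sketch below uses a hypothetical helper, glob_items, that loops over glob in scalar context, which is what while (<*.txt>) does for the pattern '*.txt'.

```perl
use strict;
use warnings;

# Hypothetical helper: collect what a while loop over a glob pattern yields.
sub glob_items {
    my ($pattern) = @_;
    my @items;
    # In scalar context glob returns one match per call, then undef.
    while (defined(my $item = glob($pattern))) {
        push @items, $item;
    }
    return @items;
}

# Lists the names of the .txt files in the current directory, not their lines.
print "$_\n" for glob_items('*.txt');
```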
As I understand it, you want to print a header with the file name just before the first line of each file, and concatenate them all into a new file. A one-liner could be enough for the task.
It detects the first line using the variable $. and closes the ARGV filehandle at the end of each file to reset $. between input files:
perl -pe 'printf qq|=== %s ===\n|, $ARGV if $. == 1; close ARGV if eof' *.txt
An example on my machine yields:
=== file1.txt ===
one
=== file2.txt ===
one
two
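For use inside a script rather than on the command line, the same idea could be sketched as follows. The sub name concat_with_headers is hypothetical, and it returns the combined text instead of printing it so that the caller can decide where to write it.

```perl
use strict;
use warnings;

# Hypothetical helper: concatenate files, preceding each with a header line.
sub concat_with_headers {
    my @files = @_;
    return '' unless @files;          # with an empty list, <> would read STDIN
    local @ARGV = @files;             # <> reads the files named in @ARGV
    my $out = '';
    while (my $line = <>) {
        # $. is 1 on the first line of each file because ARGV is closed below.
        $out .= "=== $ARGV ===\n" if $. == 1;
        $out .= $line;
    }
    continue {
        close ARGV if eof;            # resets $. before the next file
    }
    return $out;
}
```

A call such as print concat_with_headers(glob '*.txt') would then behave much like the one-liner.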

what is "-n" in the script?

I saw the script (see below) but could not find more information about "-n".
my $numeric =0;
my $input = shift;
if ($input eq "-n") {
$numeric =1;
$input = shift;
}
my $output = shift;
open INPUT, $input or die $!;
open OUTPUT, ">$output" or die $!;
my @file = <INPUT>;
if ($numeric) {
@file = sort { $a <=> $b } @file;
} else {
@file = sort @file;
}
print OUTPUT @file;
The text explaining the script says the following "If the first thing we see on the command line after our program's name is the string -n, then we are doing a numeric sort."
Google search does not seem to recognize most "non-alphanumeric" symbols, so a search for "-n" yields nothing. The only other place I saw "-n" is in Learning Perl, where it says: "the converted sed script can operate either with or without -n option". I'm not even sure this is the same "-n" as in the script. Any idea where I can find more information about -n (although it may simply mean a numeric string and nothing more)?
The -n used by this script is entirely unrelated to the -n flag used by perl. In other words, this:
perl -n script.pl
Is completely different from this:
perl script.pl -n
What you have is the second case. Take a look at the documentation for shift:
Shifts the first value of the array off and returns it, shortening the
array by 1 and moving everything down. If there are no elements in the
array, returns the undefined value. If ARRAY is omitted, shifts the @_
array within the lexical scope of subroutines and formats, and the
@ARGV array outside a subroutine and also within the lexical scopes
established by the eval STRING , BEGIN {} , INIT {} , CHECK {} ,
UNITCHECK {} , and END {} constructs.
That's a mouthful, but what it's saying is that if we're not in a subroutine, and shift appears by itself, it's going to grab the first element of @ARGV. What's @ARGV? Let's look in perlvar, where all those weird variables are documented:
The array @ARGV contains the command-line arguments intended for the
script.
Note that those are the arguments for the script, not for perl. So if somebody executes your script with perl script.pl -n, then we can expect $ARGV[0] to be the string -n.
Looking at your code now, it's obvious what's going on:
my $input = shift;
if ($input eq "-n") {
$numeric =1;
$input = shift;
}
They use shift without an argument and outside a subroutine to grab the first element of @ARGV. If that's -n, the variable $numeric is set to 1. That variable controls how the script behaves. (The script then goes on to get the names of the input and output files out of @ARGV as well.)
It's a command line argument for this script itself. If the user invokes it with the name of the script followed by "-n", then that will tell the script how to behave.
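To see the whole mechanism in isolation, here is a sketch that mirrors the script's argument handling and its two sort modes. The helpers parse_args and sort_lines are hypothetical names; parse_args takes the argument list as a parameter (standing in for @ARGV) because inside a subroutine a bare shift would read @_ instead.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the script's handling of its own arguments.
sub parse_args {
    my @args = @_;                # stand-in for @ARGV
    my $numeric = 0;
    my $input = shift @args;
    if (defined $input && $input eq '-n') {
        $numeric = 1;             # -n was given: sort numerically
        $input = shift @args;
    }
    my $output = shift @args;
    return ($numeric, $input, $output);
}

# Hypothetical helper: the two sort modes that $numeric selects between.
sub sort_lines {
    my ($numeric, @lines) = @_;
    my @sorted;
    if ($numeric) {
        @sorted = sort { $a <=> $b } @lines;   # compares as numbers
    } else {
        @sorted = sort @lines;                 # compares as strings
    }
    return @sorted;
}
```

With the values 10, 9 and 100, the numeric mode gives 9, 10, 100, while the default string mode gives 10, 100, 9.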

How to put 'perl -pne' functionality in a perl script

So at the command line I can conveniently do something like this:
perl -pne 's/from/to/' in > out
And if I need to repeat this and/or I have several other perl -pne transformations, I can put them in, say, a .bat file in Windows. That's a rather roundabout way of doing it, of course. I should just write one perl script that has all those regex transformations.
So how do you write it? If I have a shell script containing these lines:
perl -pne 's/from1/to1/' in > temp
perl -pne 's/from2/to2/' -i temp
perl -pne 's/from3/to3/' -i temp
perl -pne 's/from4/to4/' -i temp
perl -pne 's/from5/to5/' temp > out
How can I just put these all into one perl script?
-e accepts an arbitrarily complex program, so just join your substitution operations.
perl -pe 's/from1/to1/; s/from2/to2/; s/from3/to3/; s/from4/to4/; s/from5/to5/' in > out
If you really want a Perl program that handles input and looping explicitly, deparse the one-liner to see the generated code and work from there.
> perl -MO=Deparse -pe 's/from1/to1/; s/from2/to2/; s/from3/to3/; s/from4/to4/; s/from5/to5/'
LINE: while (defined($_ = <ARGV>)) {
s/from1/to1/;
s/from2/to2/;
s/from3/to3/;
s/from4/to4/;
s/from5/to5/;
}
continue {
print $_;
}
-e syntax OK
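Working from that deparsed output, a standalone script might be organised along these lines. This is a sketch only: transform_line and filter_handle are hypothetical names, and the filehandles are passed in explicitly instead of relying on the magic ARGV handle.

```perl
use strict;
use warnings;

# Hypothetical helper: apply every substitution to one line.
sub transform_line {
    my ($line) = @_;
    $line =~ s/from1/to1/;
    $line =~ s/from2/to2/;
    $line =~ s/from3/to3/;
    $line =~ s/from4/to4/;
    $line =~ s/from5/to5/;
    return $line;
}

# Hypothetical helper: the explicit read/print loop that -p would generate.
sub filter_handle {
    my ($in, $out) = @_;
    while (defined(my $line = <$in>)) {
        print {$out} transform_line($line);
    }
    return;
}
```

Calling filter_handle(\*STDIN, \*STDOUT) at the end of the script would let you run it as perl script.pl < in > out.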
A related answer to the question you didn't quite ask: the Perl special variable $^I, used together with @ARGV, gives the in-place editing behavior of -i. As with the -p option, Deparse will show the generated code:
perl -MO=Deparse -pi.bak -le 's/foo/bar/'
BEGIN { $^I = ".bak"; }
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {
chomp $_;
s/foo/bar/;
}
continue {
print $_;
}
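Put together, in-place editing from inside a script could be sketched as below. The sub name edit_in_place is hypothetical, and the s/foo/bar/ substitution is simply carried over from the Deparse example above.

```perl
use strict;
use warnings;

# Hypothetical helper: edit the given files in place, keeping a backup copy
# with the given extension, much like perl -pi.bak -e 's/foo/bar/' would.
sub edit_in_place {
    my ($backup_ext, @files) = @_;
    return unless @files;             # with no files, <> would read STDIN
    local ($^I, @ARGV) = ($backup_ext, @files);
    local $_;
    while (<>) {
        s/foo/bar/;
        print;                        # goes to the file being rewritten
    }
    return;
}
```

Because $^I and @ARGV are localized, the in-place behavior applies only for the duration of the call, for example edit_in_place('.bak', 'temp').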