Perl's autosplit function with in place editing - perl

I just had a task where I needed to replace the third value on each line of a tab-separated file with a fixed value. I guess it can be done in Perl from a Unix shell like so
$ perl -a -n -i -F'/\t/' -e '$F[2]="THE FIXED VALUE";print join "\t", @F' bla.txt
I just wanted to know if this is a "correct" way, or if there is a better (for a currently lacking definition of better) way to do it?

I think your one-liner is reasonable and readable. There are many more ways to do it. I would stack the perlrun options and save a few keystrokes:
perl -F'\t' -i -ape'$F[2]="THE FIXED VALUE"; $_ = join "\t", @F' bla.txt
A shame that $, does not get populated with the argument of -F, so there's still a piece of repetition.

Related

Perl `uc` function oneliner -p/-n difference?

It works as intended:
perl -ne "print uc" /etc/passwd
But the following doesn't (it just prints the text in its original case):
perl -pe uc /etc/passwd
I don't understand what's wrong with it.
thanks.
You're doing different things. So it's not surprising that you get different results.
In the first example, you take the value of $_, pass it to uc and print the results (which is an upper case version of the original text).
In the second example, you take the value of $_, pass it to uc and print the value in $_. But you've done nothing to update $_ so you get the unaltered value. The fix (as you've already noted in a comment) is to update $_ with the value that is returned by uc.
perl -pe '$_ = uc' /etc/passwd

Function name inside parentheses in Perl one liner

I'm working on a Perl one liner tutorial and there are one liners like this:
ls -lAF | perl -e 'while (<>) {next if /^[dt]/; print +(split)[8] . " size: " . +(split)[4] . "\n"}'
You see the function name split has been inside parentheses. Documentation about this use of functions is hard to find on Google so I couldn't find any information on it. Could somebody explain it? Thank you.
It probably doesn't help that the use of split is defaulting everything - it's splitting $_ by spaces and returning a list of values.
The (...)[8] is called a list slice, and it filters out all but the 9th value returned by split. The preceding plus is there to prevent Perl from misparsing the brackets as being part of a function call. Which also means you don't need it on the second instance.
So print +(split)[8]; is basically a very succinct way of writing
my @results = split ' ', $_;
print $results[8];
The example you've included is performing the split twice so it might be more efficient to do the more verbose version as you can get $results[4] from the above without any extra effort.
Or because you can put a list of indexes inside the [], you could do the split once and use printf to format the output like this
printf "%s size: %s\n", (split)[8,4];
In my opinion you should be avoiding this author's advice, both for the reasons laid out in my comments on your question, and because they don't appear to know their topic at all well.
The original "one-liner" was this
ls -lAF | perl -e 'while (<>) {next if /^[dt]/; print +(split)[8] . " size: " . +(split)[4] . "\n"}'
This could be written much more succinctly by using the -n and -a options, giving this
ls -lAF | perl -wane 'print "$F[8] size: $F[4]\n" unless /^[dt]/'
Even without the "luxury" of these options you could write
ls -lAF | perl -e '/^[dt]/ or printf "%s size: %s\n", (split)[8,4] while <>'
I recommend that you go and read the Camel Book several times over the next few years. That is the best way to learn the language that I have found.
Most installations of Perl include a full set of documentation, accessible using the perldoc command.
You need to read the Slices section of perldoc perldata which makes very clear this use of slicing.

Prevent perl from printing a newline

I have this simple command:
printf TEST | perl -nle 'print lc'
Which prints:
test
​
I want:
test
...without the newline. I tried perl's printf but that removes all newlines, and I'd like to keep existing ones in place. Plus, that wouldn't work for my second example that doesn't even use print in it:
printf "BOB'S BIG BOY" | perl -ple 's/([^\s.,-]+)/\u\L$1/g'
Which prints:
Bob's Big Boy
​
...with that annoying newline as well. I'm hoping for a magical switch like --no-newline but I'm guessing it's something more involved.
EDIT: I've changed my use of echo in the examples to printf to clarify the problem. A few commenters were correct in stating that my problem wouldn't actually be fixed as it was written.
You simply have to remove the -l switch; see perldoc perlrun:
-l[octnum]
enables automatic line-ending processing. It has two separate
effects. First, it automatically chomps $/ (the input record
separator) when used with -n or -p. Second, it assigns $\ (the output
record separator) to have the value of octnum so that any print
statements will have that separator added back on. If octnum is
omitted, sets $\ to the current value of $/.

How do I best pass arguments to a Perl one-liner?

I have a file, someFile, like this:
$ cat someFile
hdisk1 active
hdisk2 active
I use this shell script to check:
$ cat a.sh
#!/usr/bin/ksh
for d in 1 2
do
grep -q "hdisk$d" someFile && echo "$d : ok"
done
I am trying to convert it to Perl:
$ cat b.sh
#!/usr/bin/ksh
export d
for d in 1 2
do
cat someFile | perl -lane 'BEGIN{$d=$ENV{'d'};} print "$d: OK" if /hdisk$d\s+/'
done
I export the variable d in the shell script and get the value using %ENV in Perl. Is there a better way of passing this value to the Perl one-liner?
You can enable rudimentary command-line argument parsing with the -s switch. A variable gets defined for each argument starting with a dash. The -- marks where your command-line arguments start.
for d in 1 2 ; do
cat someFile | perl -slane ' print "$someParameter: OK" if /hdisk$someParameter\s+/' -- -someParameter=$d;
done
See: perlrun
Sometimes breaking the Perl enclosure is a good trick for these one-liners:
for d in 1 2 ; do cat kk2 | perl -lne ' print "'"${d}"': OK" if /hdisk'"${d}"'\s+/';done
Pass it on the command line, and it will be available in @ARGV:
for d in 1 2
do
perl -lne 'BEGIN {$d=shift} print "$d: OK" if /hdisk$d\s+/' $d someFile
done
Note that the shift operator in this context removes the first element of @ARGV, which is $d in this case.
Combining some of the earlier suggestions and adding my own sugar to it, I'd do it this way:
perl -se '/hdisk([$d])/ && print "$1: ok\n" for <>' -- -d='[value]' [file]
[value] can be a number (e.g. 2), a range (e.g. 2-4), a list of different numbers (e.g. 2|3|4) (or almost anything else that's valid inside a character class) or even a bash variable containing one of those, for example:
d='2-3'
perl -se '/hdisk([$d])/ && print "$1: ok\n" for <>' -- -d=$d someFile
and [file] is your filename (that is, someFile).
If you are having trouble writing a one-liner, maybe the task is a bit much for one line (just my opinion). I would agree with @FM's suggestion and do the whole thing in Perl. Read the whole file in and then test it:
use strict;
local $/ ; # Slurp mode: read in the whole file (an empty string would mean paragraph mode)
my $file = <> ;
for my $d ( 1 .. 2 )
{
print "$d: OK\n" if $file =~ /hdisk$d\s+/
}
You could do it looping, but that would be longer. Of course it somewhat depends on the size of the file.
Note that all the Perl examples so far will print a message for each match - can you be sure there are no duplicates?
My solution is a little different. I came to your question via a Google search for its title, but I'm trying to do something different. Here it is in case it helps someone:
FYI, I was using tcsh on Solaris.
I had the following one-liner:
perl -e 'use POSIX qw(strftime); print strftime("%Y-%m-%d", localtime(time()-3600*24*2));'
which outputs the value:
2013-05-06
I was trying to place this into a shell script so I could create a file with a date in the filename, a given number of days in the past. I tried:
set dateVariable=`perl -e 'use POSIX qw(strftime); print strftime("%Y-%m-%d", localtime(time()-3600*24*$numberOfDaysPrior));'`
But this didn't work due to variable substitution. I had to mess around with the quoting, to get it to interpret it properly. I tried enclosing the whole lot in double quotes, but this made the Perl command not syntactically correct, as it messed with the double quotes around date format. I finished up with:
set dateVariable=`perl -e "use POSIX qw(strftime); print strftime('%Y-%m-%d', localtime(time()-3600*24*$numberOfDaysPrior));"`
Which worked great for me, without having to resort to any fancy variable exporting.
I realise this doesn't exactly answer your specific question, but it answered the title and might help someone else!
That looks good, but I'd use:
for d in $(seq 1 2); do perl -nle 'print "hdisk$ENV{d} OK" if $_ =~ /hdisk$ENV{d}/' someFile; done
This is already covered above in one long paragraph, but I'm repeating it for lazy developers who skip those lines.
Double quotes and single quotes mean very different things to bash: the shell expands variables inside double quotes but not inside single quotes.
So take care:
Doesn't WORK perl '$VAR' $FILEPATH
WORKS perl "$VAR" $FILEPATH

Remove line if field is duplicate

Looking for an awk (or sed) one-liner to remove lines from the output if the first field is a duplicate.
An example for removing duplicate lines I've seen is:
awk 'a !~ $0; {a=$0}'
Tried using it for a basis with no luck (I thought changing the $0's to $1's would do the trick, but didn't seem to work).
awk '{ if (a[$1]++ == 0) print $0; }' "$@"
This is a standard (very simple) use for associative arrays.
This is how to remove duplicates:
awk '!_[$1]++' file
If you're open to using Perl:
perl -ane 'print if ! $a{$F[0]}++' file
-a autosplits the line into the @F array, which is indexed starting at 0
The %a hash remembers if the first field has already been seen
This related solution assumes your field separator is a comma, rather than whitespace
perl -F, -ane 'print if ! $a{$F[0]}++' file
This prints lines whose first field is unique, plus the first occurrence of each duplicated one:
awk '!a[$1]++' file_name