Delete the last line of a file using Perl

sed '$d' $file;
Using this command doesn't seem to work, as $ is a reserved symbol in Perl.

I don't know why you are calling sed from Perl. Perl itself has a standard module that can delete the last line of a file.
Use the standard (as of v5.8) Tie::File module and delete the last element from the tied array:
use Tie::File;
tie @lines, 'Tie::File', $file or die "can't update $file: $!";
delete $lines[-1];
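As a minimal sketch of the Tie::File approach, the script below removes the last line of a throwaway temp file and prints what was removed. The helper name drop_last_line and the sample data are made up for illustration, and pop is used in place of delete because perlfunc discourages calling delete on array elements.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Remove the final line of the file at $path in place and return its text.
# (drop_last_line is a hypothetical helper name, not part of Tie::File.)
sub drop_last_line {
    my ($path) = @_;
    tie my @lines, 'Tie::File', $path or die "can't tie $path: $!";
    my $removed = pop @lines;    # removes the last record from the file itself
    untie @lines;
    return $removed;
}

# Demo on a temporary file with made-up contents.
my ($fh, $tmp) = tempfile(UNLINK => 1);
print {$fh} "one\ntwo\nthree\n";
close $fh or die "close failed: $!";

my $gone = drop_last_line($tmp);
print "removed: $gone\n";
```

Tie::File does not load the whole file into memory, so this should also be reasonable for large files.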

Last line only
The closest syntax seems to be:
perl -ne 'print unless eof()'
This acts like sed: it does not need to read the whole file into memory, and it can work on a FIFO such as STDIN.
See:
perl -ne 'print unless eof()' < <(seq 1 3)
1
2
or maybe:
perl -pe '$_=undef if eof()' < <(seq 1 3)
1
2
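The same idea in script form, as a rough sketch: eof on a filehandle peeks ahead, so it becomes true as soon as the line just read was the last one. The sub name copy_all_but_last and the in-memory sample data are assumptions for the demo only. In the one-liners above, eof() with empty parentheses means the end of the very last input file, while a bare eof means the end of the current file.

```perl
use strict;
use warnings;

# Copy every line except the last from $in to $out, one line at a time.
# (copy_all_but_last is a hypothetical helper name.)
sub copy_all_but_last {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        # eof($in) is true once the line just read was the final one.
        print {$out} $line unless eof($in);
    }
    return;
}

# Demo with in-memory filehandles and made-up data.
my $input  = "1\n2\n3\n";
my $output = '';
open my $in,  '<', \$input  or die "cannot open in-memory input: $!";
open my $out, '>', \$output or die "cannot open in-memory output: $!";
copy_all_but_last($in, $out);
close $out or die "close failed: $!";
print $output;
```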
First and last lines
perl -pe '
BEGIN {
chomp(my $first= <>);
print "Something special with $first\n";
};
do {
chomp;
print "Other speciality with $_\n";
undef $_;
} if eof();
' < <(seq 1 5)
will render:
Something special with 1
2
3
4
Other speciality with 5
Shortest: first and last line:
perl -pe 's/^/Something... / if$.==1||eof' < <(seq 1 5)
will render:
Something... 1
2
3
4
Something... 5
Try this:
perl -pe 'BEGIN{$s=join"|",qw|1 3 7 21|;};
if ($.=~/^($s)$/||eof){s/^/---/}else{s/$/.../}' < <(seq 1 22)
... which is something like this sed command:
sed '1ba;3ba;7ba;21ba;$ba;s/$/.../;bb;:a;s/^/---/;:b' < <(seq 1 22)
In a script file:
#!/usr/bin/perl -w
use strict;
sub something {
chomp;
print "Something special with $_.\n";
}
$_=<>;
something;
while (<>) {
if (eof) { something; }
else { print; };
}
will give:
/tmp/script.pl < <(seq 1 5)
Something special with 1.
2
3
4
Something special with 5.

I assume you are trying to run the sed command from inside a Perl script. I would not recommend that approach, because it only works on non-Windows systems. Below is a Perl-ish approach that skips the first and last lines while processing, rather than spending effort deleting them from the file. Thank you.
Assuming "myfile.txt" as the input file:
open (FH, "<", "myfile.txt") or die "Unable to open \"myfile.txt\": $! \n";
$i = 1;
while(<FH>){
next if ($i++ == 1 or eof);
# Process other lines
print "\nProcessing Line: $_";
}
close (FH);
print "\n";
1;
myfile.txt -
# First line
This is the beginning of comment free file.
Hope this is also the line to get processed!
# last line
Result -
Processing Line: This is the beginning of comment free file.
Processing Line: Hope this is also the line to get processed!

Related

Detect first or second file in a one-liner

In AWK, it is common to see this kind of structure for a script that runs on two files:
awk 'NR==FNR { print "first file"; next } { print "second file" }' file1 file2
This relies on two built-in variables: FNR, the line number within the current file, and NR, the global line count (equivalent to Perl's $.).
Is there something similar to this in Perl? I suppose that I could maybe use eof and a counter variable:
perl -nE 'if (! $fn) { say "first file" } else { say "second file" } ++$fn if eof' file1 file2
This works but it feels like I might be missing something.
To provide some context, I wrote this answer in which I manually define a hash but instead, I would like to populate the hash from the values in the first file, then do the substitutions on the second file. I suspect that there is a neat, idiomatic way of doing this in Perl.
Unfortunately, perl doesn't have a similar NR==FNR construct to differentiate between two files. What you can do is use the BEGIN block to process one file and main body to process the other.
For example, to process a file with the following:
map.txt
a=apple
b=ball
c=cat
d=dog
alpha.txt
f
a
b
d
You can do:
perl -lne'
BEGIN {
$x = pop;
%h = map { chomp; ($k,$v) = split /=/; $k => $v } <>;
@ARGV = $x
}
print join ":", $_, $h{$_} //= "Not Found"
' map.txt alpha.txt
f:Not Found
a:apple
b:ball
d:dog
Update:
I gave a pretty simple example, and looking at it again I can only say TIMTOWTDI, since you can also do:
perl -F'=' -lane'
if (@F == 2) { $h{$F[0]} = $F[1]; next }
print join ":", $_, $h{$_} //= "Not Found"
' map.txt alpha.txt
f:Not Found
a:apple
b:ball
d:dog
However, I can say for sure, there is no NR==FNR construct for perl and you can probably process them in various different ways based on the files.
It looks like what you're aiming for is to use the same loop for reading both files, with a conditional inside the loop that chooses what to do with the data. I would avoid that idea, because you are hiding what are two distinct processes in the same stretch of code, making it less than clear what is going on.
But, in the case of just two files, you could compare the current file name in $ARGV with the first file name. Save that name in a BEGIN block, because <> shifts each name off @ARGV as it opens the file, like this
perl -nE 'BEGIN { $first = $ARGV[0] } if ($ARGV eq $first) { say "first file" } else { say "second file" }' file1 file2
Forgetting about one-line programs, which I hate with a passion, I would just explicitly open $ARGV[0] and $ARGV[1]. Perhaps naming them like this
use strict;
use warnings;
use 5.010;
use autodie;
my ($definitions, $data) = @ARGV;
open my $fh, '<', $definitions;
while (<$fh>) {
# Build hash
}
open $fh, '<', $data;
while (<$fh>) {
# Process file
}
But if you want to avail yourself of the automatic opening facilities then you can mess with @ARGV like this
use strict;
use warnings;
my ($definitions, $data) = @ARGV;
@ARGV = ($definitions);
while (<>) {
# Build hash
}
@ARGV = ($data);
while (<>) {
# Process file
}
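To make the two-pass idea concrete, here is a sketch that fills in the "Build hash" and "Process file" placeholders, using the map.txt and alpha.txt sample data from the earlier answer. The sub names load_map and annotate are invented for this example, and in-memory filehandles stand in for the real files so that it runs on its own.

```perl
use strict;
use warnings;

# Pass 1: read "key=value" lines into a hash.
sub load_map {
    my ($fh) = @_;
    my %map;
    while (my $line = <$fh>) {
        chomp $line;
        my ($key, $value) = split /=/, $line, 2;
        $map{$key} = $value if defined $value;
    }
    return %map;
}

# Pass 2: look up each line of the data in the hash.
sub annotate {
    my ($fh, %map) = @_;
    my @out;
    while (my $line = <$fh>) {
        chomp $line;
        push @out, join ':', $line, $map{$line} // 'Not Found';
    }
    return @out;
}

# In-memory stand-ins for map.txt and alpha.txt.
my $definitions = "a=apple\nb=ball\nc=cat\nd=dog\n";
my $data        = "f\na\nb\nd\n";
open my $def_fh,  '<', \$definitions or die "cannot open definitions: $!";
open my $data_fh, '<', \$data        or die "cannot open data: $!";

my %map    = load_map($def_fh);
my @result = annotate($data_fh, %map);
print "$_\n" for @result;
```

With real files you would open $definitions and $data by name, as in the skeleton above.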
You can also create your own $fnr and compare it to $. (the overall line number).
Given:
var='first line
second line'
echo "$var" >f1
echo "$var" >f2
echo "$var" >f3
You can create a pseudo FNR by setting a variable in the BEGIN block and resetting at each eof:
perl -lnE 'BEGIN{$fnr=1;}
if ($fnr==$.) {
say "first file: $ARGV, $fnr, $. $_";
}
else {
say "$ARGV, $fnr, $. $_";
}
eof ? $fnr=1 : $fnr++;' f{1..3}
Prints:
first file: f1, 1, 1 first line
first file: f1, 2, 2 second line
f2, 1, 3 first line
f2, 2, 4 second line
f3, 1, 5 first line
f3, 2, 6 second line
Definitely not as elegant as awk but it works.
Note that Ruby has support for FNR==NR type logic.
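Another way to get an FNR-like counter, sketched below, is the idiom from the eof entry in perlfunc: closing ARGV at the end of each file resets $. so that it counts lines within the current file. The overall count then has to be kept by hand. The variable names and the two temp files are assumptions made up for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two small throwaway files standing in for file1 and file2.
my @paths;
for my $content ("a\nb\n", "c\nd\ne\n") {
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "close failed: $!";
    push @paths, $path;
}

my @seen;        # one "NR:FNR:text" entry per line read
my $nr = 0;      # overall line count, like awk's NR
@ARGV = @paths;  # let <> open the files for us
while (my $line = <>) {
    chomp $line;
    $nr++;
    push @seen, "$nr:$.:$line";   # $. behaves like FNR thanks to the close below
} continue {
    close ARGV if eof;            # bare eof, not eof(): end of the current file
}
print "$_\n" for @seen;
```

With this in place, $nr == $. is true only while the first file is being read, which is the NR==FNR test.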

Extracting parts of files by separators

I have a file that is in the following format:
Preamble
---------------------
Section 1
...
---------------------
---------------------
Section 2
...
---------------------
---------------------
Section 3
...
---------------------
Afterwords
And I want to extract each section by the separator so that I'll have a result in:
file0:
Section 1
...
file1:
Section 2
...
file2:
Section 3
...
...
Is there a simple way to do this? Thanks.
[Update] Using chomp and $_ makes this even shorter.
If your input record separator is a line of 21 dashes, this is easy with perl -ne:
perl -ne 'BEGIN{ $/=("-"x21)."\n"; $i=0; }
do { open F, ">file".($i++);
chomp;
print F;
close F;
} if /^Section/' yourfile.txt
should work, and create files file0.. fileN.
Explanation
Easier to explain as a stand-alone Perl-script perhaps?
$/=("-"x21)."\n"; # Set the input-record-separator to "-" x 21 times
my $i = 0; # output file number
open IN, "<yourfile.txt" or die "$!";
while (<IN>) { # Each "record" will be available as $_
do { open F, ">file".($i++);
chomp; # remove the trailing "---..."
print F; # write the record to the file
close F; #
} if /^Section/ # do all this only if this is a Section
}
Perl's awk lineage was useful here, so let's show an awk version for comparison:
awk 'BEGIN { RS = "\n(-+\n)+"; i = 0 }
/^Section/ { print > ("file_" (i++) ".txt") }' yourfile.txt
Not too bad compared to the Perl version; it's actually shorter. The $/ in Perl is the RS variable in awk. Awk has the upper hand here: in gawk, RS may be a regular expression, so one or more separator lines in a row can be treated as a single record separator.
You can do it in shell too:
#!/bin/bash
i=0
toprint=false
while IFS= read -r line; do
    # If the line contains "Section " followed by a digit,
    # the lines from here on have to be printed
    if echo "$line" | grep -Eq "Section [0-9]+"; then
        toprint=true
        i=$((i + 1))
        touch "file$i"
    fi
    # If the line contains 20 or more dashes,
    # the lines from here on must not be printed
    if echo "$line" | grep -Eq "[-]{20}"; then
        toprint=false
    fi
    # Print the line if needed
    if $toprint; then
        echo "$line" >> "file$i"
    fi
done < sections.txt
Here's what you're looking for:
awk '/^-{21}$/ { f++; next } f%2!=0 { print > ("file" (f-1)/2 ".txt") }' file
Results:
Contents of file0.txt:
Section 1
...
Contents of file1.txt:
Section 2
...
Contents of file2.txt:
Section 3
...
As you can see the above filenames are 'zero' indexed. If you'd like filenames 'one' indexed, simply change (f-1)/2 to (f+1)/2. HTH.
Given your file's format, here's one option:
use strict;
use warnings;
my $fh;
my $sep = '-' x 21;
while (<>) {
if (/^Section\s+(\d+)/) {
open $fh, '>', 'file' . ( $1 - 1 ) . '.txt' or die $!;
}
print $fh $_ if defined $fh and !/^$sep/;
}
On your data, creates file0.txt .. file2.txt with file0.txt containing:
Section 1
...
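For comparison, here is a small sketch of the record-separator technique that collects the sections into an array instead of writing files, which makes it easy to check what each record contains. The sub name split_sections and the shortened sample text are assumptions for the demo; the separator is assumed to be exactly 21 dashes, as in the question.

```perl
use strict;
use warnings;

# Return the records that start with "Section", using a line of
# 21 dashes as the input record separator.
sub split_sections {
    my ($fh) = @_;
    local $/ = ('-' x 21) . "\n";    # one separator-terminated record per read
    my @sections;
    while (my $record = <$fh>) {
        chomp $record;               # chomp strips the current $/, i.e. the dashes line
        push @sections, $record if $record =~ /^Section/;
    }
    return @sections;
}

# Shortened, made-up sample in the question's format.
my $sep  = ('-' x 21) . "\n";
my $text = "Preamble\n" . $sep
         . "Section 1\n...\n" . $sep . $sep
         . "Section 2\n...\n" . $sep
         . "Afterwords\n";
open my $fh, '<', \$text or die "cannot open in-memory text: $!";

my @sections = split_sections($fh);
printf "file%d:\n%s", $_, $sections[$_] for 0 .. $#sections;
```

The empty record between two adjacent separator lines is simply skipped, because it does not start with "Section".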

How can I print specific lines from a file in Unix?

I want to print certain lines from a text file in Unix. The line numbers to be printed are listed in another text file (one on each line).
Is there a quick way to do this with Perl or a shell script?
Assuming the line numbers to be printed are sorted:
open my $fh, '<', 'line_numbers' or die $!;
chomp(my @ln = <$fh>);
open my $tx, '<', 'text_file' or die $!;
foreach my $ln (@ln) {
    my $line;
    do {
        $line = <$tx>;
    } until !defined $line or $. == $ln;   # stop at the wanted line or at end of file
    last unless defined $line;
    print $line;
}
$ cat numbers
1
4
6
$ cat file
one
two
three
four
five
six
seven
$ awk 'FNR==NR{num[$1];next}(FNR in num)' numbers file
one
four
six
You can avoid the limitations of some of the other answers (the requirement for sorted line numbers) simply by using eof within a basic while (<>) block. That tells you when you've stopped reading line numbers and started reading data. Note that you need to reset $. when the switch occurs.
# Usage: perl script.pl LINE_NUMS_FILE DATA_FILE
use strict;
use warnings;
my %keep;
my $reading_line_nums = 1;
while (<>){
if ($reading_line_nums){
chomp;
$keep{$_} = 1;
$reading_line_nums = $. = 0 if eof;
}
else {
print if exists $keep{$.};
}
}
cat -n foo | join foo2 - | cut -d" " -f2-
where foo is your file with lines to print and foo2 is your file of line numbers
Here is a way to do this in Perl without slurping anything so that the memory footprint of the program is independent of the sizes of both files (it does assume that the line numbers to be printed are sorted):
#!/usr/bin/perl
use strict; use warnings;
use autodie;
@ARGV == 2
or die "Supply src_file and filter_file as arguments\n";
my ($src_file, $filter_file) = @ARGV;
open my $src_h, '<', $src_file;
open my $filter_h, '<', $filter_file;
my $to_print = <$filter_h>;
while ( my $src_line = <$src_h> ) {
last unless defined $to_print;
if ( $. == $to_print ) {
print $src_line;
$to_print = <$filter_h>;
}
}
close $filter_h;
close $src_h;
Generate the source file:
C:\> perl -le "print for aa .. zz" > src
Generate the filter file:
C:\> perl -le "print for grep { rand > 0.75 } 1 .. 52" > filter
C:\> cat filter
4
6
10
12
13
19
23
24
28
44
49
50
Output:
C:\> f src filter
ad
af
aj
al
am
as
aw
ax
bb
br
bw
bx
To deal with an unsorted filter file, you can modify the while loop:
while ( my $src_line = <$src_h> ) {
last unless defined $to_print;
if ( $. > $to_print ) {
seek $src_h, 0, 0;
$. = 0;
}
if ( $. == $to_print ) {
print $src_line;
$to_print = <$filter_h>;
}
}
This would waste a lot of time if the contents of the filter file are fairly random because it would keep rewinding to the beginning of the source file. In that case, I would recommend using Tie::File.
I wouldn't do it this way with large files, but (untested):
open(my $fh1, "<", "line_number_file.txt") or die "Err: $!";
chomp(my @line_numbers = <$fh1>);
$_-- for @line_numbers;
close $fh1;
open(my $fh2, "<", "text_file.txt") or die "Err: $!";
my @lines = <$fh2>;
print @lines[@line_numbers];
close $fh2;
I'd do it like this:
#!/bin/bash
numbersfile=numbers
datafile=data
while read -r lineno; do
    sed -n "${lineno}p" "$datafile"
done < "$numbersfile"
The downside of my approach is that it spawns one sed process per line number, so it will be slower than the other options. It's far more readable, though.
This is a short solution using bash and sed
sed -n -e "$(cat num |sed 's/$/p/')" file
Where num is the file of numbers and file is the input file (tested on Mac OS X Snow Leopard).
$ cat num
1
3
5
$ cat file
Line One
Line Two
Line Three
Line Four
Line Five
$ sed -n -e "$(cat num |sed 's/$/p/')" file
Line One
Line Three
Line Five
$ cat input
every
good
bird
does
fly
$ cat lines
2
4
$ perl -ne 'BEGIN{($a,$b) = `cat lines`} print if $.==$a .. $.==$b' input
good
bird
does
If that's too much for a one-liner, use
#! /usr/bin/perl
use warnings;
use strict;
sub start_stop {
my($path) = @_;
open my $fh, "<", $path
or die "$0: open $path: $!";
local $/;
return ($1,$2) if <$fh> =~ /(\d+)\s+(\d+)/;
die "$0: $path: could not find start and stop line numbers";
}
my($start,$stop) = start_stop "lines";
while (<>) {
print if $. == $start .. $. == $stop;
}
Perl's magic open allows for creative possibilities such as
$ ./lines-between 'tac lines-between|'
print if $. == $start .. $. == $stop;
while (<>) {
Here is a way to do this using Tie::File:
#!/usr/bin/perl
use strict; use warnings;
use autodie;
use Tie::File;
@ARGV == 2
or die "Supply src_file and filter_file as arguments\n";
my ($src_file, $filter_file) = @ARGV;
tie my @source, 'Tie::File', $src_file, autochomp => 0
or die "Cannot tie source '$src_file': $!";
open my $filter_h, '<', $filter_file;
while ( my $to_print = <$filter_h> ) {
print $source[$to_print - 1];
}
close $filter_h;
untie @source;

How can I print only certain fields in a space separated file?

I have a file containing 1000 lines in the following format:
abc def ghi gkl
How can I write a Perl script to print only the first and the third fields?
abc ghi
perl -lane 'print "@F[0,2]"' file
If no answer is good for you yet, I'll try to get the bounty ;-)
#!/usr/bin/perl
# Lines beginning with a hash (#) denote optional comments,
# except the first line, which is required,
# see http://en.wikipedia.org/wiki/Shebang_(Unix)
use strict; # http://perldoc.perl.org/strict.html
use warnings; # http://perldoc.perl.org/warnings.html
# http://perldoc.perl.org/perlsyn.html#Compound-Statements
# http://perldoc.perl.org/functions/defined.html
# http://perldoc.perl.org/functions/my.html
# http://perldoc.perl.org/perldata.html
# http://perldoc.perl.org/perlop.html#I%2fO-Operators
while (defined(my $line = <>)) {
# http://perldoc.perl.org/functions/split.html
my @chunks = split ' ', $line;
# http://perldoc.perl.org/functions/print.html
# http://perldoc.perl.org/perlop.html#Quote-Like-Operators
print "$chunks[0] $chunks[2]\n";
}
To run this script, given that its name is script.pl, invoke it as
perl script.pl FILE
where FILE is the file that you want to parse. See also http://perldoc.perl.org/perlrun.html. Good luck! ;-)
That's really kind of a waste for something as powerful as Perl, since you can do the same thing in one trivial line of awk.
awk '{ print $1, $3 }'
while ( <> ) {
    my @fields = split;
    print "@fields[0,2]\n";
}
and just for variety, on Windows:
C:\Temp> perl -pale "$_=qq{@F[0,2]}"
and on Unix
$ perl -pale '$_="@F[0,2]"'
As perl one-liner:
perl -ane 'print "@F[0,2]\n"' file
Or as executable script:
#!/usr/bin/perl
use strict;
use warnings;
open my $fh, '<', 'file' or die "Can't open file: $!\n";
while (<$fh>) {
my @fields = split;
print "@fields[0,2]\n";
}
Execute the script like this:
perl script.pl
or
chmod 755 script.pl
./script.pl
I'm sure I shouldn't get the bounty since the question asks for the result to be given in perl, but anyway:
In bash/ksh/ash/etc:
cut -d " " -f 1,3 "file"
In Windows/DOS:
for /f "tokens=1-4 delims= " %i in (file) do (echo %i %k)
Advantages: as others said, there is no need to learn Perl or Awk, only to know some standard tools. The result of both calls can be saved to disk using the ">" and ">>" operators.
while(<>){
chomp;
@s = split;
print "$s[0] $s[2]\n";
}
Please start going through the documentation as well.
#!/usr/bin/env perl
open my$F, "<", "file" or die;
print join(" ",(split)[0,2])."\n" while(<$F>);
close $F
One easy way is:
(split)[0,2]
Example:
$_ = 'abc def ghi gkl';
print( (split)[0,2] , "\n");
print( join(" ", (split)[0,2] ),"\n");
Command line:
perl -e '$_="abc def ghi gkl";print(join(" ",(split)[0,2]),"\n")'
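The list-slice idea can be wrapped in a small reusable sub, sketched here under the assumption that fields are separated by whitespace and numbered from 0. The name pick_fields is made up for this example.

```perl
use strict;
use warnings;

# Return the requested whitespace-separated fields (0-based) of a line,
# joined by single spaces. (pick_fields is a hypothetical helper name.)
sub pick_fields {
    my ($line, @indexes) = @_;
    my @fields = split ' ', $line;    # ' ' splits on runs of whitespace, like awk
    # Slice out the wanted fields and skip any that a short line lacks.
    return join ' ', grep { defined } @fields[@indexes];
}

print pick_fields('abc def ghi gkl', 0, 2), "\n";   # prints "abc ghi"
```

The grep only matters for lines with fewer fields than expected; without it, such lines would trigger "uninitialized value" warnings.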

How can I read all of the lines between two lines in a file, using Perl?

I have one file where the contents looks like:
pch
rch
channel
cap
nch
kappa
.
.
.
kary
ban
....
Now I want to read my file from nch to kary and copy only those lines into another file. How can I do this in Perl?
If I understand your question correctly, this is pretty simple.
#!perl -w
use strict;
use autodie;
open my $in,'<',"File1.txt";
open my $out,'>',"File2.txt";
while(<$in>){
print $out $_ if /^nch/ .. /^kary/;
}
From perlfaq6's answer to How can I pull out lines between two patterns that are themselves on different lines?
You can use Perl's somewhat exotic .. operator (documented in perlop):
perl -ne 'print if /START/ .. /END/' file1 file2 ...
If you wanted text and not lines, you would use
perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...
But if you want nested occurrences of START through END, you'll run up against the problem described in the question in this section on matching balanced text.
Here's another example of using ..:
while (<>) {
$in_header = 1 .. /^$/;
$in_body = /^$/ .. eof;
# now choose between them
} continue {
$. = 0 if eof; # fix $.
}
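As a hedged sketch of how the .. operator behaves on the question's data: in scalar context it is false outside the range and returns a sequence number inside it, with "E0" appended on the line that closes the range (see perlop). That makes it possible to drop the marker lines themselves. The sub name between_markers and its $exclusive flag are inventions for this example.

```perl
use strict;
use warnings;

# Return the lines of each block running from a /^nch$/ line to the next
# /^kary$/ line. With $exclusive true, the two marker lines are dropped.
sub between_markers {
    my ($exclusive, @lines) = @_;
    my @out;
    for (@lines) {
        # Flip-flop: false outside the range, 1, 2, 3... inside it,
        # and e.g. "3E0" on the line that ends the range.
        my $seq = /^nch$/ .. /^kary$/;
        next unless $seq;
        next if $exclusive and ($seq == 1 or $seq =~ /E0\z/);
        push @out, $_;
    }
    return @out;
}

# The question's sample data, already chomped.
my @lines     = qw(pch rch channel cap nch kappa kary ban);
my @inclusive = between_markers(0, @lines);
my @exclusive = between_markers(1, @lines);
print "inclusive: @inclusive\n";
print "exclusive: @exclusive\n";
```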
You could use this in 'sed':
sed -n /nch/,/kary/p $file
You could use 's2p' to convert this to Perl.
You could also write pure Perl:
while (<>)
{
next unless /nch/;
print;
while (<>)
{
print;
last if /kary/;
}
}
Strictly, both these solutions will print each set of lines from 'nch' to 'kary'; if 'nch' appears more than once, it will print more than one chunk of code. It is easy to fix that, especially in the pure Perl ('sed' solution left as an exercise for the reader).
OUTER:
while (<>)
{
next unless /nch/;
print;
while (<>)
{
print;
last OUTER if /kary/;
}
}
Also, the solutions look for 'nch' and 'kary' as part of the line - not for the whole line. If you need them to match the whole line, use '/^nch$/' etc as the regex.
Something like:
$filter = 0;
while (<>) {
chomp;
$filter = 1 if (! $filter && /^nch$/);
$filter = 0 if ($filter && /^ban$/);
print($_, "\n") if ($filter);
}
should work.
If you only want to read one block, in gawk:
gawk '/kary/&&f{print;exit}/nch/{f=1}f' file
in Perl
perl -lne '$f && /kary/ && print && exit;$f=1 if/nch/; $f && print' file
or
while (<>) {
chomp;
if ($f && /kary/) {
print $_."\n";
last;
}
if (/nch/) { $f = 1; }
print $_ ."\n" if $f;
}