Print each line of a file - perl

I have a file test.txt that reads as follows:
one
two
three
Now, I want to print each line of this file as follows:
.one (one)
.two (two)
.three (three)
I try this in Perl:
@ARGV = ("test.txt");
while (<>) {
print (".$_ \($_\)");
}
This doesn't seem to work and this is what I get:
.one
(one
).two
(two
).three
(three
)
Can someone help me figure out what's going wrong?
Update :
Thanks to Aureliano Guedes for the suggestion.
This 1-liner seems to work :
perl -pe 's/([^\s]+)/.$1 ($1)/'

$_ includes the trailing newline, e.g. one\n, so print ".$_ \($_\)" effectively becomes print ".one\n (one\n)".
Use chomp to remove it, or use s/\s+\z// to remove all trailing whitespace.
while (<>) {
chomp;
print ".$_ ($_)\n";
}
(But add a \n to print the newline that you do want.)

Besides the correct answer already given, you can do this in a oneliner:
perl -pe 's/(.+)/.$1 ($1)/'
Or if you prefer a while loop:
while (<>) {
s/(.+)/.$1 ($1)/;
print;
}
This simply rewrites the current line into your desired output and then prints it.

Another Perl one-liner without using regex.
perl -ple ' $_=".$_ ($_)" '
with the given inputs
$ cat test.txt
one
two
three
$ perl -ple ' $_=".$_ ($_)" ' test.txt
.one (one)
.two (two)
.three (three)
$

Related

Why does the following oneliner skip the first line?

Here is the example:
$ cat test.tsv
AAAATTTTCCCCGGGG foo
GGGGCCCCTTTTAAAA bar
$ perl -wne 'while(<STDIN>){ print $_;}' <test.tsv
GGGGCCCCTTTTAAAA bar
This should work like cat, not like tail -n +2. What is happening here, and what is the correct way to do this?
The -n option wraps your code in an implicit loop (see man perlrun). With no file arguments that loop reads from STDIN, so your program effectively becomes:
while (<>) {
while(<STDIN>){ print $_;} #< your code
}
There are now two loops reading from the same input, one line at a time.
When test.tsv is at least two lines long, the outer while (<>) consumes the first line, and the inner while(<STDIN>) then reads the second line, so the first line your print statement ever sees is the second one.
If test.tsv had more than two lines, the inner loop would print every line from the second onwards.
The correct way to make this work is simply to rely on the -n option you pass to perl:
perl -wne 'print $_;' < test.tsv
The -n switch implicitly puts your code inside a loop that goes through the input line by line. Either remove the 'n' from the list of switches or (even better) remove your own loop from the code and leave only the print command there.
nbokor@nbokor:~/tmp$ perl -wne 'print $_;' <test.csv
AAAATTTTCCCCGGGG foo
GGGGCCCCTTTTAAAA bar
Remove the -n command-line option. It duplicates your while(<STDIN>){ ... } loop, as Deparse shows:
$ perl -MO=Deparse -wne 'while(<STDIN>){ print $_;}'
BEGIN { $^W = 1; }
LINE: while (defined($_ = <ARGV>)) {
while (defined($_ = <STDIN>)) {
print $_;
}
}
-e syntax OK

How to add new number into each line?

I have this line about 500 times in my file backup.xml:
my-company-review/</link>
Is there a way, through the command line, Perl, etc., to add a number to the line after the word review? For example, something like this:
my-company-review1/</link>
my-company-review2/</link>
my-company-review3/</link>
Thanks in advance for the help!
Why not use Perl, as I suggested with your last problem. Once again, this is a sort of hack solution, that only works if there's a maximum of one replacement per line... But it's a quick throw-away program.
perl -e '$count=1; foreach (<>) { s/(my-company-review)(\/<\/link>)/$1$count$2/ && $count++; print; }'
An extra loop will do multiple substitutions on a line:
perl -e '$count=1; foreach (<>) { while(s/(my-company-review)(\/<\/link>)/$1$count$2/) {$count++;} print; }'
That awk solution looks way nicer =)
Here's one way:
perl -i -wpe ' BEGIN { $count = 1; }
++$count
if s{(my-company-review)(/</link>)}{$1$count$2};
' backup.xml
(Disclaimer: not tested.)
You can use awk:
awk 'gsub("/</link>", NR "/</link>")' infile
or perl:
perl -ne 's:/</link>:$./</link>:; print' infile

How to compress 4 consecutive blank lines into one single line in Perl

I'm writing a Perl script that reads a log and rewrites it to a new log, removing empty lines wherever there are 4 or more consecutive blank lines. In other words, I have to compress any run of 4 or more consecutive blank lines into a single line, but runs of 1, 2 or 3 blank lines must be left as they are. I have looked for a solution online, but the only thing I can find is
perl -00 -pe ''
or
perl -00pe0
Also, I have seen a vim example that deletes blocks of 4 empty lines, :%s/^\n\{4}//, which matches what I'm looking for, but that is vim, not Perl. Can anyone help with this? Thanks.
To collapse 4+ consecutive Unix-style EOLs to a single newline:
$ perl -0777 -pi.bak -e 's|\n{4,}|\n|g' file.txt
An alternative flavor using look-behind:
$ perl -0777 -pi.bak -e 's|(?<=\n)\n{3,}||g' file.txt
use strict;
use warnings;
my $cnt = 0;
sub flush_ws {
$cnt = 1 if ($cnt >= 4);
while ($cnt > 0) {print "\n"; $cnt--; }
}
while (<>) {
if (/^$/) {
$cnt++;
} else {
flush_ws();
print $_;
}
}
flush_ws();
Your -0 hint is a good one, since you can use -0777 to slurp the whole file in -p mode. Read more about these switches in perlrun. This one-liner should do the trick:
$ perl -0777 -pe 's/\n{5,}/\n\n/g'
If there are up to four new lines in a row, nothing happens. Five newlines or more (four empty lines or more) are replaced by two newlines (one empty line). Note the /g switch here to replace not only the first match.
Deparsed code:
BEGIN { $/ = undef; $\ = undef; }
LINE: while (defined($_ = <ARGV>)) {
s/\n{5,}/\n\n/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
HTH! :)
One way using GNU awk, setting the record separator to NUL:
awk 'BEGIN { RS="\0" } { gsub(/\n{5,}/,"\n")}1' file.txt
This assumes that your definition of an empty line excludes lines containing whitespace.
This will do what you need
perl -ne 'if (/\S/) {$n = 1 if $n >= 4; print "\n" x $n, $_; $n = 0} else {$n++}' myfile

How do I ignore multiple newlines in perl?

Suppose I have a file with these inputs:
line 1
line 2
line3
My program should only store "line1", "line2" and "line3" not the newlines. How do I achieve that?
My program already removes leading and trailing whitespace, but that doesn't help to remove the newlines.
I am setting $/ as \n because each input is separated by a \n.
while (<>) {
chomp;
next unless /\S/;
print "$_\n";
}
Set
$/ = q(); # that's an empty string, like "" or ''
while (<>) {
chomp;
...
}
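Setting $/ to the (defined) empty string puts the input operator in paragraph mode: one or more blank lines (that is, two or more consecutive newlines) are treated as the record terminator, and chomp then removes all of the trailing newlines. That way each record always starts with real data. Lines separated by only a single newline stay together in one record.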
perl -n is the equivalent of wrapping while(<>) { } around your script. Assuming that all you need to do is eliminate blank lines, you can do it like this:
#! /usr/bin/perl -n
print unless ( /^$/ );
... On the other hand, if that's all you need to do, you might as well ditch perl and use
grep -v '^$'
Edit: your post says that you want to store the values of the lines that are not blank. In that case, assuming that you don't have too much work to do in the rest of your script, you might do something like this:
#! /usr/bin/perl -n
our @values; # package variable: a "my" here would be re-created on every pass of the -n loop
push @values, $_ unless ( /^$/ );
END {
# do whatever work you want to do here
}
... but this quickly reaches a point of diminishing returns if you have very much code inside the END{} block.

sed, replace globally a delimiter with the first part of the line

Let's say I have the following lines:
1:a:b:c
2:d:e:f
3:a:b
4:a:b:c:d:e:f
How can I edit this with sed (or perl) so that it reads:
1a1b1c
2d2e2f
3a3b
4a4b4c4d4e4f
I have done it with awk like this:
awk -F':' '{gsub(/:/, $1, $0); print $0}'
but it takes ages to complete! So I'm looking for something faster.
'Tis a tad tricky, but it can be done with sed (assuming the file data contains the sample input):
$ sed '/^\(.\):/{
s//\1/
: retry
s/^\(.\)\([^:]*\):/\1\2\1/
t retry
}' data
1a1b1c
2d2e2f
3a3b
4a4b4c4d4e4f
$
You may be able to flatten the script to one line with semicolons; sed on MacOS X is a bit cranky at times and objected to some parts, so it is split out into 6 lines. The first line matches lines starting with a single character and a colon, and starts a sequence of operations for when that is recognized. The first substitute replaces, for example, '1:' by just '1'. The : retry is a label for branching to, which is a key part of this. The next substitution copies the first character on the line over the first remaining colon. The t retry goes back to the label if the substitute changed anything. The last line closes the sequence of operations for the initially matched line.
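As a rough sketch of the flattening mentioned above, the same commands can also be passed as separate -e fragments rather than joined with semicolons, since sed treats each fragment as its own script line. I have only reasoned this through for the sample data, so take the portability as an assumption.

```shell
# The sample input from the question.
input=$(printf '1:a:b:c\n2:d:e:f\n3:a:b\n4:a:b:c:d:e:f\n')

out=$(printf '%s\n' "$input" | sed \
    -e '/^\(.\):/{' \
    -e 's//\1/' \
    -e ':retry' \
    -e 's/^\(.\)\([^:]*\):/\1\2\1/' \
    -e 't retry' \
    -e '}')

printf '%s\n' "$out"
```

Like the multi-line version, this only handles a single-character first field.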
#!/usr/bin/perl
use warnings;
use strict;
while (<DATA>) {
if ( s/^([^:]+)// ) {
my $delim = $1;
s/:/$delim/g;
}
print;
}
__DATA__
1:a:b:c
2:d:e:f
3:a:b
4:a:b:c:d:e:f
use feature qw/ say /;
use strict;
use warnings;
while( <DATA> ) {
chomp;
my @elements = split /:/;
my $interject = shift @elements;
local $" = $interject;
say $interject, "@elements";
}
__DATA__
1:a:b:c
2:d:e:f
3:a:b
4:a:b:c:d:e:f
Or on the Linux shell command line:
perl -aF/:/ -pe '$i=shift @F;$_=$i.join $i,@F;' infile.txt