Perl output overwrites itself

I have code which loops through lines in a file, and tries to print out each line with something added at the start and end.
However, I get output like this: "nominalte cows".
Basically, the bit after the line (nominal) overwrites the start of it. I know that removing the chomp and regex lines stops this effect, but I need it to be on one line without spaces. Where am I going wrong?
while ($line = <INPUT>) {
chomp $line;
$line =~ s/ //g;
printf "\@attribute %s nominal\n", $line;
}

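Your input file probably comes from MS Windows, where end of line is encoded as CR-LF. chomp removes only the LF, so the leftover CR sends the cursor back to the start of the line and whatever is printed after it overwrites the beginning. You can use s/\r// to remove the CR.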

You might have \r in your variable. Try using \s:
$line =~ s/\s//g;
See perlre for the meaning of \s.
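To see the effect in isolation, here is a minimal sketch that strips a CR-LF or LF ending before removing the spaces. The clean_line helper and the sample text are made up for illustration; they are not from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the line ending (LF or CR-LF) and all spaces.
sub clean_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;   # remove a trailing LF or CR-LF
    $line =~ s/ //g;        # remove spaces, as in the question
    return $line;
}

# Simulated line from a Windows file ("white cows" is a made-up value).
print "[", clean_line("white cows\r\n"), "]\n";
```

With the CR gone, the text printed after the line no longer lands on top of its start.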

Related

How to remove newline from the end of a file using Perl

I have a file that reads like this:
dog cat mouse
apple orange pear
red yellow green
There is a tab \t separating the words on each row, and a newline \n separating each of the rows. Below the last line, red yellow green there is a blank line due to a newline \n after green.
I would like to use Perl to remove the newline.
I have seen a few articles, such as "How can I delete a newline if it is the last character in a file?", that give command-line solutions for Perl, but I would like to do this in code so that I can incorporate it into my Perl script.
I don't know if this might be possible using chomp, or if chomp works on each line separately (I would like to keep the newline between lines).
Also I have seen previously comments that suggest maintaining a newline at the end of a file because Unix commands work better when a file ends with a newline. However, I have created a script which relies on input files not ending with a newline, therefore I really feel removing the newlines is necessary for my work.
You can try this:
perl -pe 'chomp if eof' file.txt
Here is another simple way, if you need it in a script:
open my $fh, '<', 'file.txt' or die "Cannot open file.txt: $!";
my @lines = <$fh>; # read all lines and store in array
close $fh;
chomp $lines[-1]; # remove newline from last line
print @lines;
Or something like this (in script), as suggested by jnhc for the command line:
open my $fh, '<', 'file.txt' or die "Cannot open file.txt: $!";
while (<$fh>) {
chomp if eof $fh;
print;
}
close $fh;

Print Line By Line

I've been trying to work on a lyrical bot for my server, but before I started to work on it, I wanted to give it a test so I came up with this script using the Lyrics::Fetcher module.
use strict;
use warnings;
use Lyrics::Fetcher;
my ($artist, $song) = ('Coldplay', 'Adventures Of A Lifetime');
my $lyrics = Lyrics::Fetcher->fetch($artist, $song, [qw(LyricWiki AstraWeb)]);
my @lines = split("\n\r", $lyrics);
foreach my $line (@lines) {
sleep(10);
print $line;
}
This script works fine: it grabs the lyrics and prints them out all at once (which is not what I'm looking for).
I was hoping to achieve a line by line print of the lyrics every 10 seconds. Help please?
Your call to split looks suspicious. In particular the regex "\n\r". Note, the first argument to split is always interpreted as a regex regardless of whether you supply a quoted string.
On Unix systems the line ending is typically "\n". On DOS/Windows it's "\r\n" (the reverse of what you have). On ancient Macs it was "\r". To match all three you could do:
my @lines = split(/\r\n|\n|\r/, $lyrics);
You will need to enable autoflush, otherwise the lines will just be buffered and printed when the buffer is full or when the program terminates:
STDOUT->autoflush;
You can use the generic newline pattern \R to split on any line ending, whether your data contains CR, LF, or CR LF. This feature is available only in Perl v5.10 or later:
my @lines = split /\R/, $lyrics;
And you will need to print a newline after each line of lyrics, because the split will have removed them:
print $line, "\n";
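Putting those pieces together, a rough sketch might look like this. The sample text, the split_lines helper and the $delay variable are placeholders rather than anything from Lyrics::Fetcher; the delay is set to 0 so the sketch finishes at once.

```perl
use strict;
use warnings;

$| = 1;    # autoflush STDOUT so each line appears as soon as it is printed

# Split text on any line ending (CR, LF or CR-LF); \R needs Perl 5.10+.
sub split_lines {
    my ($text) = @_;
    return split /\R/, $text;
}

my $lyrics = "first line\r\nsecond line\nthird line\r";   # made-up sample
my $delay  = 0;    # seconds between lines; the question wants 10

for my $line (split_lines($lyrics)) {
    print $line, "\n";          # put back the newline that split removed
    sleep $delay if $delay;
}
```

Setting $| to a true value has the same effect on STDOUT as the autoflush call above.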

Perl 5.12.3 fails to loop CSV file line by line

I'm sure someone has an explanation as to what is happening with the following script:
Please note, the file I specify is available and is opening. I know this because the last line of the file is output when the program is run, but it is only the last line.
Note about the .csv file: it's generated on windows (I'm using OS X 10.7.4 with Perl 5.12.3) and uses \r line breaks. I attempted to tell perl that the line break character was \r at the top of the script but it does not work. I know they're \r as the grep search finds them in a text editor.
The script runs and only prints the last line of the file. If I plug in a regular expression it will grab the first matching field from the first line and echo it fine, but I cannot iterate over the entire file.
Any clarification is appreciated as I am new to perl.
#!/usr/bin/perl
use warnings;
print "Please enter your filename:";
my ($dataline);
open(INFO,'./expensereport.csv') || die("can't open datafile: $!");
while (my $line = <INFO>) {
chomp $line;
print $line;
}
print $!;
The carriage returns without linefeed are causing print to overwrite each line on the same line, so all you see is the last.
Run dos2unix on your input file before processing.
There are several ways to tell perl that your input file uses Windows-style CRLF line endings (the :crlf layer).
See perldoc -f binmode or perldoc -f open.
open(INFO, '<:crlf', './expensereport.csv') || die("can't open datafile: $!");
...
Ahh, that's clear! :)
Look, you have a file with \r (carriage return, literally) and \n (newline). chomp cuts off \n (new line). So you print over the same line (remember "carriage return") again and again.
Use print "$line\n"; instead
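Since the question says the file uses bare \r line breaks, another option is to change the input record separator $/ while reading. A minimal sketch, assuming CR-only line endings; the read_cr_lines helper and the sample data are made up for illustration, and an in-memory filehandle stands in for the real CSV file:

```perl
use strict;
use warnings;

# Read every record from a filehandle whose lines end in a bare CR.
sub read_cr_lines {
    my ($fh) = @_;
    local $/ = "\r";      # treat CR as the input record separator
    my @lines = <$fh>;
    chomp @lines;         # chomp removes whatever $/ currently holds
    return @lines;
}

my $data = "date,amount\r2012-06-01,12.50\r2012-06-02,8.00\r";   # made-up sample
open my $fh, '<', \$data or die "can't open in-memory file: $!";
print "$_\n" for read_cr_lines($fh);
close $fh;
```

Because $/ is localized inside the sub, the rest of the script keeps the default "\n" separator.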

How to remove all lines except those ending with a .c extension using Perl

I have a string $string which contains a list of lines, some ending with *.c, *.pdf, etc. and a few without any extension (these are directories). I need to remove all lines except the *.c lines. How can I do that using a regular expression? I've written the line below, which skips the *.c files, but how do I negate it?
next if $line =~ /(\.c)/i;
Any ideas.
thanks,
Sharath
Use unless instead of if to reverse the sense of the condition.
next unless $line =~ /\.c$/i;
or simply invert the test:
next if $line !~ /\.c$/i;
Also, you don't need parentheses around the regexp, and you need $ to anchor it to the end of the line.
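If the lines really are all in one string, as the question describes, the same test can be applied with grep after splitting. A small sketch, with a made-up listing and a hypothetical keep_c_lines helper:

```perl
use strict;
use warnings;

# Keep only the lines of a multi-line string that end in ".c".
sub keep_c_lines {
    my ($string) = @_;
    return grep { /\.c$/i } split /\n/, $string;
}

my $string = "src/main.c\ndocs/manual.pdf\nsrc\nlib/util.C\n";   # made-up listing
print "$_\n" for keep_c_lines($string);
```

The /i flag is kept from the answer above, so names ending in an uppercase .C are retained as well.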

In perl pattern matching..how to exclude the \n character from pattern

I am new to perl and writing my first few programs and using its pattern matching abilities. I am reading a file into array like this:
@list=<file>
Then indexing each line of array by $list[0..9] etc, and when I match it against a pattern, the $list[0] includes \n character, hence the match fails. So if ($string =~ $list[0]) fails though without \n character in pattern it would match.
How do I tell pattern matcher to not consider the \n character from pattern?
Thanks
You can shave the line ends from the array after reading:
@lines = …;
chomp @lines;
Now @lines contains the lines without line ends. See perldoc -f chomp for details.
If you want to remove the \n from your lines you can:
chomp $list[0]
see perldoc -f chomp for the details.
This is a good opportunity to get to know how Perl modules work.
You can, for example, use Perl6::Slurp, which will a) read the file, b) put the contents in an array, and c) remove the newline characters for you.
For example:
use Perl6::Slurp;
my @lines = slurp '<:utf8', 'filename', {chomp=>1};
This will match with the \n:
if ( $list[0] =~ "$string\n")
Or if you want the \n to be optional:
if ( $list[0] =~ /$string\n?/ )
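For the direction used in the question, where the line read from the file is the pattern, a rough sketch could chomp the pattern just before matching. The matches_line helper and the sample values are hypothetical:

```perl
use strict;
use warnings;

# Return 1 if $string matches the pattern held in $pattern_line,
# ignoring the newline that came with the line from the file.
sub matches_line {
    my ($string, $pattern_line) = @_;
    chomp $pattern_line;                       # drop the trailing newline, if any
    return $string =~ /$pattern_line/ ? 1 : 0;
}

my @list = ("ba+r\n", "qux\n");                # made-up stand-in for @list=<file>
print matches_line("foo baar", $list[0]), "\n";
print matches_line("foo baar", $list[1]), "\n";
```

The line is still interpreted as a regular expression here; wrap it in \Q...\E if it should be treated as literal text instead.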