Print Line By Line - perl

I've been trying to work on a lyrical bot for my server, but before I started to work on it, I wanted to give it a test so I came up with this script using the Lyrics::Fetcher module.
use strict;
use warnings;
use Lyrics::Fetcher;
my ($artist, $song) = ('Coldplay', 'Adventures Of A Lifetime');
my $lyrics = Lyrics::Fetcher->fetch($artist, $song, [qw(LyricWiki AstraWeb)]);
my @lines = split("\n\r", $lyrics);
foreach my $line (@lines) {
    sleep(10);
    print $line;
}
This script works fine: it grabs the lyrics and prints them all at once (which is not what I'm looking for).
I was hoping to achieve a line by line print of the lyrics every 10 seconds. Help please?

Your call to split looks suspicious, in particular the pattern "\n\r". Note that the first argument to split is interpreted as a regex even when you supply a quoted string (the one exception is a single space, ' ', which splits on runs of whitespace).
On Unix systems the line ending is typically "\n". On DOS/Windows it's "\r\n" (the reverse of what you have). On ancient Macs it was "\r". To match all three you could do:
my @lines = split(/\r\n|\n|\r/, $lyrics);

You will need to enable autoflush, otherwise the lines will just be buffered and printed when the buffer is full or when the program terminates (on Perl versions before 5.14, add use IO::Handle; first).
STDOUT->autoflush;
You can use the regex generic newline pattern \R to split on any line ending, whether your data contains CR, LF, or CR LF. This feature is available only in Perl v5.10 or later.
my @lines = split /\R/, $lyrics;
And you will need to print a newline after each line of lyrics, because the split will have removed them.
print $line, "\n";
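Putting those three points together, a minimal sketch might look like the following. It assumes a hard-coded placeholder string instead of a real Lyrics::Fetcher call, and the helper names split_lines and print_slowly are made up for illustration; the delay is shortened to one second so the example finishes quickly.

```perl
use strict;
use warnings;
use IO::Handle;   # makes STDOUT->autoflush available on older perls

# Split text on any line ending (CR LF, LF or CR); \R needs Perl 5.10 or later.
sub split_lines {
    my ($text) = @_;
    return split /\R/, $text;
}

# Print each line followed by a newline, flushing immediately,
# and pause $delay seconds after each one.
sub print_slowly {
    my ($delay, @lines) = @_;
    STDOUT->autoflush(1);
    for my $line (@lines) {
        print $line, "\n";
        sleep $delay;
    }
}

# Placeholder text standing in for the fetched lyrics (hypothetical data).
my $lyrics = "first line\r\nsecond line\nthird line\rfourth line";
print_slowly(1, split_lines($lyrics));
```

In the original script the first argument would be 10 and $lyrics would come from the fetch call.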

Related

Counting records separated by CR/LF (carriage return and newline) in Perl

I'm trying to create a simple script to read a text file that contains records of book titles. Each record is separated with a plain old double space (\r\n\r\n). I need to count how many records are in the file.
For example here is the input file:
record 1
some text
record 2
some text
...
I'm using a regex to check for carriage return and newline, but it fails to match. What am I doing wrong? I'm at my wits' end.
sub readInputFile {
    my $inputFile = $_[0]; # read first argument from the command line as fileName
    open INPUTFILE, "+<", $inputFile or die $!; # open file
    my $singleLine;
    my @singleRecord;
    my $recordCounter = 0;
    while (<INPUTFILE>) { # loop through the input file line-by-line
        $singleLine = $_;
        push(@singleRecord, $singleLine); # start adding each line to a record array
        if ($singleLine =~ m/\r\n/) { # check for carriage return and new line
            $recordCounter += 1;
            createHashTable(@singleRecord); # send record to make a hash table
            @singleRecord = (); # empty the current record to start a new record
        }
    }
    print "total records : $recordCounter \n";
    close(INPUTFILE);
}
It sounds like you are processing a Windows text file on Linux, in which case you want to open the file with the :crlf layer, which will convert all CRLF line-endings to the standard Perl \n ending.
If you are reading Windows files on a Windows platform then the conversion is already done for you, and you won't find CRLF sequences in the data you have read. If you are reading a Linux file then there are no CR characters in there anyway.
It also sounds like your records are separated by a blank line. Setting the built-in input record separator variable $/ to a null string will cause Perl to read a whole record at a time.
I believe this version of your subroutine is what you need. Note that people familiar with Perl will thank you for using lower-case letters and underscore for variables and subroutine names. Mixed case is conventionally reserved for package names.
You don't show create_hash_table so I can't tell what data it needs. I have chomped and split the record into lines, and passed a list of the lines in the record with the newlines removed. It would probably be better to pass the entire record as a single string and leave create_hash_table to process it as required.
sub read_input_file {
    my ($input_file) = @_;
    open my $fh, '<:crlf', $input_file or die $!;
    local $/ = '';
    my $record_counter = 0;
    while (my $record = <$fh>) {
        chomp $record;
        ++$record_counter;
        create_hash_table(split /\n/, $record);
    }
    close $fh;
    print "Total records : $record_counter\n";
}
You can do this more succinctly by changing Perl's record-separator, which will make the loop return a record at a time instead of a line at a time.
E.g. after opening your file:
local $/ = "\r\n\r\n";
my $recordCounter = 0;
$recordCounter++ while(<INPUTFILE>);
$/ holds Perl's global record-separator, and scoping it with local allows you to override its value temporarily until the end of the enclosing block, when it will automatically revert back to its previous value.
But it sounds like the file you're processing may actually have "\n\n" record-separators, or even "\r\r". You'd need to set the record-separator correctly for whatever file you're processing.
If your files are not huge multi-gigabytes files, the easiest and safest way is to read the whole file, and use the generic newline metacharacter \R.
This way, it also works if some file actually uses LF instead of CRLF (or even the old Mac standard CR).
Use it with split if you also need the actual records:
perl -ln -0777 -e 'my @records = split /\R\R/; print scalar(@records)' $Your_File
Or, if you only want a count, count the separators (this gives one less than the number of records unless the file ends with a blank line):
perl -ln -0777 -e 'my $count=()=/\R\R/g; print $count' $Your_File
For more details, see also my other answer here to a similar question.
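As a rough illustration of the slurp-and-split approach, here is a small sketch that works on a string rather than a file. The function name split_records and the sample data are assumptions made for the example; the idea is to normalise every line ending with \R first and then split on blank lines.

```perl
use strict;
use warnings;

# Split a whole-file string into records separated by one or more blank lines.
# \R requires Perl 5.10 or later.
sub split_records {
    my ($text) = @_;
    $text =~ s/\R/\n/g;    # normalise CR LF, LF and CR to "\n"
    $text =~ s/\n+\z//;    # drop trailing line endings
    return grep { length } split /\n{2,}/, $text;
}

# Sample data with Windows line endings (hypothetical content).
my $sample  = "record 1\r\nsome text\r\n\r\nrecord 2\r\nsome text\r\n";
my @records = split_records($sample);
print "total records : ", scalar(@records), "\n";
```

Because the line endings are normalised before splitting, the same function should handle files that use LF or CR as well.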

Why is Perl's chomp affecting the output of my print?

It's been a couple months since I've been Perling, but I'm totally stuck on why this is happening...
I'm on OSX, if it matters.
I'm trying to transform lines in a file like
08/03/2011 01:00 PDT,1.11
into stdout lines like
XXX, 20120803, 0100, KWH, 0.2809, A, YYY
Since I'm reading a file, I want to chomp after each line is read in. However, when I chomp, I find my printing gets all messed up. When I don't chomp the printing is fine (except for the extra newline...). What's going on here?
while(<SOURCE>) {
chomp;
my @tokens = split(' |,'); # @tokens now [08/03/2011, 01:00, PDT, 1.11]
my $converted_date = convertDate($tokens[0]);
my $converted_time = convertTime($tokens[1]);
print<<EOF;
$XXX, $converted_date, $converted_time, KWH, $tokens[3], A, YYY
EOF
}
With the chomp call in there, the output is all mixed up:
, A, YYY10803, 0100, KWH, 1.11
Without the chomp call in there, it's at least printing in the right order, but with the extra new line:
XXX, 20110803, 0100, KWH, 1.11
, A, YYY
Notice that with the chomp in there, it's like it overwrites the newline "on top of" the first line. I've added the $|=1; autoflush, but don't know what else to do here.
Thoughts? And thanks in advance....
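The lines of your input end with CR LF, and chomp removes only the LF. The CR that is left in $tokens[3] sends the cursor back to the start of the line when printed, so the rest of the output overwrites the beginning. A simple solution is to use the following instead of chomp: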
s/\s+\z//;
You could also use the dos2unix command line tool to convert the files before passing them to Perl.
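A narrower variant of the same idea, sketched under the assumption that only the line ending should go and other trailing whitespace should stay, is to strip an optional CR followed by LF. The helper name strip_eol is made up for this example.

```perl
use strict;
use warnings;

# Remove one trailing line ending, whether it is LF or CR LF.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# Sample line from the question, with a DOS line ending appended.
my $line   = "08/03/2011 01:00 PDT,1.11\r\n";
my @tokens = split /[ ,]/, strip_eol($line);
print join('|', @tokens), "\n";
```

With the CR gone, the last token no longer carries a carriage return into the output.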
The problem is that you have DOS line-endings and are running on a Unix build of Perl.
One solution to this is to use PerlIO::eol. You may have to install it but there is no need for a use line in the program.
Then you can write
binmode $filehandle, ':raw:eol(LF)';
after which, regardless of the format or source of the file, the lines read will be terminated with the standard "\n".

How to combine two lines together using Perl?

How to combine two lines together using Perl? I'm trying to combine these two lines using a Perl regular expression:
__Data__
test1 - results
dkdkdkdkdkd
I would like the output to be like this:
__Data__
test1 - results dkdkdkdkdkd
I thought this would accomplish this but not working:
$_ =~ s/__Data__\n(test1.*)\n(.*)\n/__Data__\n$1 $2/smg;
If you have a multiline string:
s/__Data__\ntest1.*\K\n/ /g;
The /s modifier only makes the wildcard . match \n, so it will cause .* to slurp your newline and cause the match of \n to be displaced to the last place it occurs. Which, depending on your data, might be far off.
The /m modifier makes ^ and $ match inside the string at newlines, so not so useful. The \K escape preserves whatever comes before it, so you do not need to put it back afterwards.
If you have a single line string, for instance in a while loop:
while (<>) {
    if (/^__Data__/) {
        $_ .= <>;        # add next line
        chomp;           # remove its newline
        $_ .= ' ' . <>;  # add a space, then the third line
    }
    print;
}
There seems to be a problem with the setup of $_. When I run this script, I get the output I expect (and the output I think you'd expect). The main difference is that I've added a newline at the end of the replacement pattern in the substitute. The rest is cosmetic or test infrastructure.
Script
#!/usr/bin/env perl
use strict;
use warnings;
my $text = "__Data__\ntest1 - results\ndkdkdkdkdkd\n";
my $copy = $text;
$text =~ s/__Data__\n(test1.*)\n(.*)\n/__Data__\n$1 $2\n/smg;
print "<<$copy>>\n";
print "<<$text>>\n";
Output
<<__Data__
test1 - results
dkdkdkdkdkd
>>
<<__Data__
test1 - results dkdkdkdkdkd
>>
Note the use of << and >> to mark the ends of strings; it often helps when debugging. Use any symbols you like; just enclose your displayed text in such markers to help yourself debug what's going on.
(Tested with Perl 5.12.1 on RHEL 5 for x86/64, but I don't think the code is version or platform dependent.)

Clarification on chomp

I'm on break from classes right now and decided to spend my time learning Perl. I'm working with Beginning Perl (http://www.perl.org/books/beginning-perl/) and I'm finishing up the exercises at the end of chapter three.
One of the exercises asked that I "Store your important phone numbers in a hash. Write a program to look up numbers by the person's name."
Anyway, I had come up with this:
#!/usr/bin/perl
use warnings;
use strict;
my %name_number=
(
Me => "XXX XXX XXXX",
Home => "YYY YYY YYYY",
Emergency => "ZZZ ZZZ ZZZZ",
Lookup => "411"
);
print "Enter the name of who you want to call (Me, Home, Emergency, Lookup)", "\n";
my $input = <STDIN>;
print "$input can be reached at $name_number{$input}\n";
And it just wouldn't work. I kept getting this error message:
Use of uninitialized value in concatenation (.) or string at hello.plx line 17, <STDIN> line 1.
I tried playing around with the code some more but each "solution" looked more complex than the "solution" that came before it. Finally, I decided to check the answers.
The only difference between my code and the answer was the presence of chomp ($input); after <STDIN>;.
Now, the author has used chomp in previous example but he didn't really cover what chomp was doing. So, I found this answer on www.perlmeme.org:
The chomp() function will remove (usually) any newline character from
the end of a string. The reason we say usually is that it actually
removes any character that matches the current value of $/ (the input
record separator), and $/ defaults to a newline..
Anyway, my questions are:
1) What newlines are getting removed? Does Perl automatically append a "\n" to the input from <STDIN>? I'm just a little unclear because when I read "it actually removes any character that matches the current value of $/", I can't help but think "I don't remember putting a $/ anywhere in my code."
2) I'd like to develop best practices as soon as possible - is it best to always include chomp after <STDIN> or are there scenarios where it's unnecessary?
<STDIN> reads to the end of the input string, which contains a newline if you press return to enter it, which you probably do.
chomp removes the newline at the end of a string. $/ is a variable (as you found, defaulting to newline) that you probably don't have to worry about; it just tells perl what the 'input record separator' is, which I'm assuming means it defines how far <FILEHANDLE> reads. You can pretty much forget about it for now, it seems like an advanced topic. Just pretend chomp chomps off a trailing newline. Honestly, I've never even heard of $/ before.
As for your other question, it is generally cleaner to always chomp variables and add newlines as needed later, because you don't always know if a variable has a newline or not; by always chomping variables you always get the same behavior. There are scenarios where it is unnecessary, but if you're not sure it can't hurt to chomp it.
Hope this helps!
OK, as for 1), perl doesn't add any \n to the input. It is you who hits Enter when you finish typing. If you don't set $/ yourself, it defaults to \n.
As for 2), chomp will be needed whenever input comes from the user, or whenever you want to remove the line-ending character (when reading from a file, for example).
Finally, the error you're getting is unlikely to come from perl misparsing the variable inside the double quotes of the last print: _ is a legal character in identifiers. Still, if you want to rule that out, you can delimit the variable explicitly:
print "$input can be reached at ${name_number{$input}}\n";
(note the {} around the last variable).
<STDIN> is a short-cut notation for readline( *STDIN );. What readline() does is read from the file handle until it encounters the contents of $/ (aka $INPUT_RECORD_SEPARATOR) and return everything it has read, including the contents of $/. What chomp() does is remove one trailing occurrence of the contents of $/, if present.
The contents are often called a newline character. In Perl, $/ defaults to "\n" on every platform; on Windows the :crlf I/O layer translates CR-LF in a text file to "\n" as it is read.
See:
perldoc -f readline
perldoc -f chomp
perldoc perlvar and search for /\$INPUT_RECORD_SEPARATOR/
I think best practice here is to write:
chomp(my $input = <STDIN>);
Here is a quick example of how the chomp function works (the meaning of $/ is explained there), removing just one trailing newline, if any:
chomp (my $input = "Me\n"); # OK
chomp ($input = "Me"); # OK (nothing done)
chomp ($input = "Me\n\n"); # $input now is "Me\n";
chomp ($input); # finally "Me"
print "$input can be reached at $name_number{$input}\n";
BTW, the funny thing is that I am learning Perl too and I reached hashes five minutes ago.
Though it may be obvious, it's still worth mentioning why the chomp is needed here.
The hash created contains 4 lookup keys: "Me", "Home", "Emergency" and "Lookup"
When $input is specified from <STDIN>, it'll contain "Me\n", "Me\r\n" or some other line-ending variant depending on what operating system is being used.
The uninitialized value error comes about because the "Me\n" key does not exist in the hash. And this is why the chomp is needed:
my $input = <STDIN>; # "Me\n" --> Key DNE, $name_number{$input} not defined
chomp $input; # "Me" --> Key exists, $name_number{$input} defined
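To make the difference concrete, here is a small sketch that uses the hash from the question. The function name lookup_number is hypothetical, and the input strings are hard-coded rather than read from STDIN so the example runs unattended.

```perl
use strict;
use warnings;

my %name_number = (
    Me        => "XXX XXX XXXX",
    Home      => "YYY YYY YYYY",
    Emergency => "ZZZ ZZZ ZZZZ",
    Lookup    => "411",
);

# Look up a name as it would arrive from <STDIN>, i.e. with a trailing newline.
# Returns undef when the name is not a key in the hash.
sub lookup_number {
    my ($input) = @_;
    chomp $input;    # "Me\n" becomes "Me"
    return exists $name_number{$input} ? $name_number{$input} : undef;
}

for my $typed ("Me\n", "Nobody\n") {
    my $number = lookup_number($typed);
    chomp(my $name = $typed);
    print defined $number
        ? "$name can be reached at $number\n"
        : "No number stored for $name\n";
}
```

Checking with exists before interpolating also avoids the uninitialized-value warning when the name is simply not in the hash.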

Why does STDIN cause my Perl program to freeze?

I am learning Perl and wrote this script to practice using STDIN. When I run the script, it only shows the first print statement on the console. No matter what I type in, including new lines, the console doesn't show the next print statement. (I'm using ActivePerl on a Windows machine.) It looks like this:
$perl script.pl
What is the exchange rate? 90.45
[Cursor stays here]
This is my script:
#!/usr/bin/perl
use warnings; use strict;
print "What is the exchange rate? ";
my @exchangeRate = <STDIN>;
chomp(@exchangeRate);
print "What is the value you would like to convert? ";
chomp(my @otherCurrency = <STDIN>);
my @result = @otherCurrency / @exchangeRate;
print "The result is @{result}.\n";
One potential solution I noticed while researching my problem is that I could include use IO::Handle; and flush STDIN; flush STDOUT; in my script. These lines did not solve my problem, though.
What should I do to have STDIN behave normally? If this is normal behavior, what am I missing?
When you do
my @answer = <STDIN>;
...Perl waits for end of file (signalled with Ctrl-D on Unix and Unix-like systems). Then each line you entered (separated by linefeeds) goes into the list.
If you instead do:
my $answer = <STDIN>;
...Perl waits for a linefeed, then puts the string you entered into $answer.
I found my problem. I was using the wrong type of variable. Instead of writing:
my @exchangeRate = <STDIN>;
I should have used:
my $exchangeRate = <STDIN>;
with a $ instead of an @.
To end multiline input, you can use Control-D on Unix or Control-Z on Windows.
However, you probably just wanted a single line of input, so you should have used a scalar like other people mentioned. Learning Perl walks you through this sort of stuff.
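A minimal sketch of the scalar version follows. To keep it runnable without a terminal it reads from an in-memory filehandle that stands in for STDIN; the helper name read_value and the sample numbers are assumptions made for the example.

```perl
use strict;
use warnings;

# Read one line from the given filehandle in scalar context and strip its line ending.
sub read_value {
    my ($fh) = @_;
    my $line = <$fh>;    # scalar context: returns as soon as one line is available
    die "unexpected end of input\n" unless defined $line;
    $line =~ s/\r?\n\z//;
    return $line;
}

# An in-memory filehandle stands in for STDIN (hypothetical sample input).
my $typed = "90.45\n180.9\n";
open my $fh, '<', \$typed or die "cannot open in-memory handle: $!";

my $exchange_rate  = read_value($fh);
my $other_currency = read_value($fh);
my $result         = $other_currency / $exchange_rate;
print "The result is $result.\n";
```

In the real script you would pass \*STDIN to read_value and print each prompt before the corresponding call.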
You could try and enable autoflush.
Either
use IO::Handle;
STDOUT->autoflush(1);
or
$| = 1;
That's why you are not seeing the output printed.
Also, you need to change from arrays '@' to scalar variables '$'
my $val = <STDIN>;
chomp($val);