Programmatically read from STDIN or input file in Perl - perl

What is the slickest way to programmatically read from stdin or an input file (if provided) in Perl?

while (<>) {
print;
}
This reads from the file(s) named on the command line, or from stdin if no file is given.
If you need this loop construct on the command line, you can use the -n option:
$ perl -ne 'print;'
Here the code between the {} in the first example simply goes between the '' in the second.

This provides a named variable to work with:
foreach my $line ( <STDIN> ) {
chomp( $line );
print "$line\n";
}
To read a file, redirect it in like this:
program.pl < inputfile

The "slickest" way in certain situations is to take advantage of the -n switch. It implicitly wraps your code with a while(<>) loop and handles the input flexibly.
In slickestWay.pl:
#!/usr/bin/perl -n
BEGIN {
# do something once here
}
# implement logic for a single line of input
print $result;
At the command line:
chmod +x slickestWay.pl
Now, depending on your input do one of the following:
Wait for user input
./slickestWay.pl
Read from file(s) named in arguments (no redirection required)
./slickestWay.pl input.txt
./slickestWay.pl input.txt moreInput.txt
Use a pipe
someOtherScript | ./slickestWay.pl
The BEGIN block is needed when something must be initialized once before the loop runs, for example an object-oriented interface such as Text::CSV; the module itself can be loaded from the shebang line with -M.
The -l and -p switches are also your friends.
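As a rough picture of what those switches do: perlrun describes -n as wrapping your code in a while (<>) loop, -p as doing the same but printing $_ after each iteration, and -l (when combined with -n or -p) as chomping each input line and making print append a newline. The sketch below spells out approximately what `perl -lpe '$_ = uc'` amounts to. The helper name upcase_lines is hypothetical, and it reads from a handle passed in rather than the magic <> handle so that it can be run without any input files.

```perl
use strict;
use warnings;

# Hypothetical helper: roughly what `perl -lpe '$_ = uc'` does,
# but reading from a handle passed in instead of the magic <> handle.
sub upcase_lines {
    my ($in, $out) = @_;
    local $\ = "\n";              # -l: print appends a newline
    while (my $line = <$in>) {    # -n / -p: implicit loop over the input
        chomp $line;              # -l: strip the line ending on input
        $line = uc $line;         # the code that was given to -e
        print {$out} $line;       # -p: print after each iteration
    }
    return;
}

# Demonstration with in-memory filehandles (no real files needed).
my $text = "alpha\nbeta\n";
open my $in, '<', \$text or die "in-memory open failed: $!";
my $buffer = '';
open my $out, '>', \$buffer or die "in-memory open failed: $!";
upcase_lines($in, $out);
close $out or die "close failed: $!";
print $buffer;
```

Writing the loop out by hand like this is mostly useful for understanding; in a one-liner the switches are shorter.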

You need to use the <> operator:
while (<>) {
print $_; # or simply "print;"
}
Which can be compacted to:
print while (<>);
Arbitrary file:
open my $F, '<', 'file.txt' or die $!;
while (<$F>) {
print $_;
}
close $F;

If there is a reason you can't use the simple solution provided by ennuikiller above, then you will have to use Typeglobs to manipulate file handles. This is way more work. This example copies from the file in $ARGV[0] to that in $ARGV[1]. It defaults to STDIN and STDOUT respectively if files are not specified.
use English;
my $in;
my $out;
if ($#ARGV >= 0){
unless (open($in, "<", $ARGV[0])){
die "could not open $ARGV[0] for reading.";
}
}
else {
$in = *STDIN;
}
if ($#ARGV >= 1){
unless (open($out, ">", $ARGV[1])){
die "could not open $ARGV[1] for writing.";
}
}
else {
$out = *STDOUT;
}
while ($_ = <$in>){
$out->print($_);
}

Do
$userinput = <STDIN>; #read stdin and put it in $userinput
chomp ($userinput); #cut the return / line feed character
if you want to read just one line

Here is how I made a script that could take either command line inputs or have a text file redirected.
if ($#ARGV < 1) {
@ARGV = ();
@ARGV = <>;
chomp(@ARGV);
}
This assigns the contents of the redirected file to @ARGV; from there you just process @ARGV as if someone had supplied command line options.
WARNING
If no file is redirected, the program will sit there idle because it is waiting for input from STDIN.
I have not yet figured out a way to detect whether a file is being redirected in, which would eliminate the STDIN issue.
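One possible way to detect this, offered as a sketch rather than as the answer's own method, is Perl's -t file test, which reports whether a handle is attached to a terminal. If STDIN is a terminal, nothing has been redirected or piped in. The helper name input_source is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: describe where the given handle is connected.
# -t is true when the handle is attached to a terminal (tty).
sub input_source {
    my ($fh) = @_;
    return (-t $fh) ? 'terminal' : 'redirected';
}

if (input_source(\*STDIN) eq 'terminal') {
    # Nothing was redirected or piped in, so reading STDIN would wait
    # for the keyboard; report that instead of blocking.
    print STDERR "No input redirected; pass arguments or redirect a file.\n";
}
else {
    print "STDIN is redirected or piped; reading it will not wait on the keyboard.\n";
}
```

With this check in place, the script can print a usage message instead of sitting idle when no file is redirected.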

if(my $file = shift) { # if file is specified, read from that
open(my $fh, '<', $file) or die($!);
while(my $line = <$fh>) {
print $line;
}
}
else { # otherwise, read from STDIN
print while(<>);
}

Related

How to re-direct the contents to a file instead of printing it to terminal

How can I avoid printing the contents to the terminal (which is slow for the 20k lines in my case) and redirect them to a file in Perl instead?
This is just a sample and not the entire code:
if ($count eq $length)
{
push(@List,$line);
print "$line\n"; #Prints line to terminal which is time consuming
}
I tried the following, but it did not work:
if ($cnt eq $redLen)
{
push(@List,$line);
print $line > "/home/vibes/text";
}
Please let me know if my question is not clear?
Simply use the 3 argument open method.
use strict;
use warnings;
my $line = "Hello Again!";
open (my $fh, ">", "/home/vibes/text") || die "Failed to open /home/vibes/text $!";
print $fh "$line\n";
close($fh); # Always close opened files.
The default filehandle in perl is STDOUT. You can change it with a call to select:
print "Hello\n"; # goes to stdout
open my $fh, '>', '/home/vibes/text' or die "Cannot open /home/vibes/text: $!";
select($fh);
print "World\n"; # goes to file '/home/vibes/text'
From your shell, output redirection is usually a matter of appending > file to your command. This is true in both Unix-y systems and on Windows.
$ perl my_script.pl > /home/vibes/text
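When select is used, it is often convenient to restore the previous default handle afterwards. select returns the handle that was selected before the call, so it can be saved and put back. The following is a minimal sketch; the temporary file created with File::Temp is an assumption standing in for /home/vibes/text so that the example runs anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for the real output path (assumption).
my ($fh, $path) = tempfile(UNLINK => 1);

my $previous = select($fh);   # select returns the previously selected handle
print "World\n";              # goes to the temporary file
select($previous);            # restore the old default (normally STDOUT)
close $fh or die "Cannot close $path: $!";

print "Hello\n";              # goes to STDOUT again
```

Restoring the handle keeps later print statements in the same program from silently ending up in the file.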

Giving both input & output file at command line

So I have this line in the Perl script, which prints the output to STDOUT (the console):
printf "Line no. $i"
What code should I add to the program to direct this output to an output file that the user gives on the command line (as shown below)?
Right now, the following portion asks the user for the input file:
print "enter file name";
chomp(my $file=<STDIN>);
open(DATA,$file) or die "error reading";
But I don't want to prompt the user for either the input or the output file.
What I want is a way for the user to give both the input and the output file on the command line when running the program:
perl input_file output_file program.pl
What code do I need to add for this?
You can use shift to read the command line arguments to your script. shift reads and removes the first element of an array. If no array is specified (and not inside a subroutine), it will implicitly read from @ARGV, which contains the list of arguments passed to your script. For example:
use strict;
use warnings;
use autodie;
# check that two arguments have been passed
die "usage: $0 input output\n" unless @ARGV == 2;
my $infile = shift;
my $outfile = shift;
# good idea to sanitise the arguments here
open my $in, "<", $infile;
open my $out, ">", $outfile;
while (<$in>) {
print $out $_;
}
close $in;
close $out;
You could call this script like perl script.pl input_file output_file and it would copy the contents of input_file to output_file.
The easiest approach here is to ignore input and output files within your program. Just read from STDIN and write to STDOUT. Let the user redirect those filehandles when calling your program.
Your program looks something like this:
#!/usr/bin/perl
use strict;
use warnings;
while (<STDIN>) {
# do something useful to the data in $_
print;
}
And you call it like this:
$ ./your_program.pl < inputfile.txt > outputfile.txt
This is known as the "Unix Filter Model" and it's the most flexible way to write programs that read input and produce output.
You can use the @ARGV variable:
use strict ;
use warnings ;
if ( @ARGV != 2 )
{
print "Usage : <program.pl> <input> <output>\n" ;
exit ;
}
open my $Input,$ARGV[0] or die "error:$!\n" ;
open my $Output,">>" .$ARGV[1] or die "error:$!\n";
print $Output $_ while (<$Input> ) ;
close ($Input) ;
close ($Output) ;
Note:
Run the program in this form: perl program.pl input_file output_file

How to read from a file and direct output to a file if a file name is given in the command line, and printing to console if no argument given

I made a file, "rootfile", that contains paths to certain files and the perl program mymd5.perl gets the md5sum for each file and prints it in a certain order. How do I redirect the output to a file if a name is given in the command line? For instance if I do
perl mymd5.perl md5file
then it will feed output to md5file. And if I just do
perl mymd5.perl
it will just print to the console.
This is my rootfile:
/usr/local/courses/cs3423/assign8/cmdscan.c
/usr/local/courses/cs3423/assign8/driver.c
/usr/local/courses/cs3423/assign1/xpostitplus-2.3-3.diff.gz
This is my program right now:
open($in, "rootfile") or die "Can't open rootfile: $!";
$flag = 0;
if ($ARGV[0]){
open($out,$ARGV[0]) or die "Can't open $ARGV[0]: $!";
$flag = 1;
}
if ($flag == 1) {
select $out;
}
while ($line = <$in>) {
$md5line = `md5sum $line`;
@md5arr = split(" ",$md5line);
if ($flag == 0) {
printf("%s\t%s\n",$md5arr[1],$md5arr[0]);
}
}
close($out);
If you don't give a FILEHANDLE to print or printf, the output will go to STDOUT (the console).
There are several ways you can redirect the output of your print statements.
select $out; #everything you print after this line will go to the file specified by the filehandle $out.
... #your print statements come here.
close $out; #close the connection when done to avoid confusing the rest of the program.
#or you can use the filehandle right after the print statement as in:
print $out "Hello World!\n";
You can also derive the output file name from the value in @ARGV, as follows.
This takes the name of the file in $ARGV[0] and uses it to name a new file, edit.$ARGV[0]:
#!/usr/bin/perl
use warnings;
use strict;
my $file = $ARGV[0];
open my $input, '<', $file or die $!;
my $editedfile = "edit.$file";
open my $name_change, '>', $editedfile or die $!;
if ($file eq "md5file"){
while (my $line = <$input>){
# Do something...
print $name_change $line;
}
}
Perhaps the following will be helpful:
use strict;
use warnings;
while (<>) {
my $md5line = `md5sum $_`;
my @md5arr = split( " ", $md5line );
printf( "%s\t%s\n", $md5arr[1], $md5arr[0] );
}
Usage: perl mymd5.perl rootfile [>md5file]
The last, optional parameter will direct output to the file md5file; if absent, the results are printed to the console.
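If the output file should be chosen inside the program, as the question asks, one option is to pick the handle once and print to it everywhere. The sketch below makes two substitutions that are not in the answers above: it uses the core Digest::MD5 module (addfile and hexdigest) in place of shelling out to md5sum, and a temporary file stands in for one of the paths listed in rootfile. The helper names md5_of_file and output_handle are hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5;
use File::Temp qw(tempfile);

# Hypothetical helper: return the hex MD5 digest of one file.
sub md5_of_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Hypothetical helper: a handle on the named file, or STDOUT if no name is given.
sub output_handle {
    my ($name) = @_;
    return \*STDOUT unless defined $name && length $name;
    open my $fh, '>', $name or die "Can't open $name for writing: $!";
    return $fh;
}

# Demonstration: a temporary file stands in for one path from rootfile.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close $tmp or die "Cannot close $path: $!";

my $out = output_handle($ARGV[0]);   # file if one was named, else the console
printf {$out} "%s\t%s\n", $path, md5_of_file($path);
```

In the real script, the printf line would sit inside the loop over the lines of rootfile, with each line chomped before it is used as a path.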

foreach and special variable $_ not behaving as expected

I'm learning Perl and wrote a small script to open perl files and remove the comments
# Will remove this comment
my $name = ""; # Will not remove this comment
#!/usr/bin/perl -w <- wont remove this special comment
The name of files to be edited are passed as arguments via terminal
die "You need to a give atleast one file-name as an arguement\n" unless (@ARGV);
foreach (@ARGV) {
$^I = "";
(-w && open FILE, $_) || die "Oops: $!";
/^\s*#[^!]/ || print while(<>);
close FILE;
print "Done! Please see file: $_\n";
}
Now when I ran it via Terminal:
perl removeComments file1.pl file2.pl file3.pl
I got the output:
Done! Please see file:
This script is working EXACTLY as I'm expecting but
Issue 1 : Why $_ didn't print the name of the file?
Issue 2 : Since the loop runs for 3 times, why Done! Please see file: was printed only once?
How would you write this script in as few lines as possible?
Please comment on my code as well, if you have time.
Thank you.
The while stores the lines read by the diamond operator <> into $_, so you're writing over the variable that stores the file name.
On the other hand, you open the file with open but don't actually use the handle to read; it uses the empty diamond operator instead. The empty diamond operator makes an implicit loop over files in @ARGV, removing file names as it goes, so the foreach runs only once.
To fix the second issue you could use while(<FILE>), or rewrite the loop to take advantage of the implicit loop in <> and write the entire program as:
$^I = "";
/^\s*#[^!]/ || print while(<>);
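Both side effects of the empty diamond operator can be observed directly. The following is a small sketch, not the asker's script: two temporary files (an assumption, standing in for file1.pl and file2.pl) are placed in @ARGV, $_ is given a value as if foreach had set it, and then everything is read with <>.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two small temporary files stand in for the files named on the command line.
my @paths;
for my $text ("a1\na2\n", "b1\n") {
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!";
    push @paths, $path;
}

$_    = 'file1.pl';   # pretend an outer loop had stored a file name here
@ARGV = @paths;

my $lines = 0;
$lines++ while (<>);  # reads every file in @ARGV, removing each name as it goes

printf "lines read: %d\n", $lines;
printf "names left in \@ARGV: %d\n", scalar @ARGV;
printf "\$_ is now %s\n", defined $_ ? "'$_'" : 'undef';
```

After the loop, @ARGV is empty and $_ no longer holds the file name, which matches the two issues described in the question.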
Here's a more readable approach.
#!/usr/bin/perl
# always!!
use warnings;
use strict;
use autodie;
use File::Copy;
# die with some usage message
die "usage: $0 [ files ]\n" if @ARGV < 1;
for my $filename (@ARGV) {
# create tmp file name that we are going to write to
my $new_filename = "$filename\.new";
# open $filename for reading and $new_filename for writing
open my $fh, "<", $filename;
open my $new_fh, ">", $new_filename;
# Iterate over each line in the original file: $filename,
# if our regex matches, we bail out. Otherwise we print the line to
# our temporary file.
while(my $line = <$fh>) {
next if $line =~ /^\s*#[^!]/;
print $new_fh $line;
}
close $fh;
close $new_fh;
# use File::Copy's move function to rename our files.
move($filename, "$filename\.bak");
move($new_filename, $filename);
print "Done! Please see file: $filename\n";
}
Sample output:
$ ./test.pl a.pl b.pl
Done! Please see file: a.pl
Done! Please see file: b.pl
$ cat a.pl
#!/usr/bin/perl
print "I don't do much\n"; # comments dont' belong here anyways
exit;
print "errrrrr";
$ cat a.pl.bak
#!/usr/bin/perl
# this doesn't do much
print "I don't do much\n"; # comments dont' belong here anyways
exit;
print "errrrrr";
It's not safe to nest loops and expect $_ to hold the right value: the while loop overwrites your $_. Give the file name its own variable inside the loop instead. You can do it like this:
foreach my $filename (@ARGV) {
$^I = "";
(-w $filename && open my $FILE,'<', $filename) || die "Oops: $!";
/^\s*#[^!]/ || print while(<$FILE>);
close $FILE;
print "Done! Please see file: $filename\n";
}
or like this:
foreach (@ARGV) {
my $filename = $_;
$^I = "";
(-w && open my $FILE,'<', $filename) || die "Oops: $!";
/^\s*#[^!]/ || print while(<$FILE>);
close $FILE;
print "Done! Please see file: $filename\n";
}
Please never use barewords for filehandles, and do use the 3-argument open.
Good: open my $FILE, '<', $filename
Bad: open FILE, $filename
Simpler solution: Don't use $_.
When Perl was first written, it was conceived as a replacement for Awk and shell, and it borrowed heavily from their syntax. For readability, Perl also created the special variable $_, which lets you use various commands without having to create variables:
while ( <INPUT> ) {
next if /foo/;
print OUTPUT;
}
The problem is that if everything uses $_, then everything will affect $_, with many unpleasant side effects.
Now, Perl is a much more sophisticated language, and has things like locally scoped variables (hint: you don't use local to create these variables; that merely gives package variables (aka global variables) a local value).
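The difference between local and my can be seen in a few lines. In this sketch (all names are hypothetical), show always reads the package variable $level: local temporarily changes that variable's value for the duration of the call, while my creates a separate lexical variable that show never sees.

```perl
use strict;
use warnings;

our $level = 'global';            # a package (global) variable

sub show { return $level }        # always reads the package variable

sub with_local {
    local $level = 'localized';   # temporary value, visible to called subs
    return show();
}

sub with_my {
    my $level = 'lexical';        # a new variable, visible only in this block
    return show();
}

print with_local(), "\n";   # prints "localized"
print with_my(),    "\n";   # prints "global"
print show(),       "\n";   # prints "global": the local value is gone again
```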
Since you're learning Perl, you might as well learn Perl correctly. The problem is that there are too many books that are still based on Perl 3.x. Find a book or web page that incorporates modern practice.
In your program, $_ switches from the file name to the line in the file and back to the next file. That's what's confusing you. If you used named variables, you could distinguish between files and lines.
I've rewritten your program using more modern syntax, but your same logic:
use strict;
use warnings;
use autodie;
use feature qw(say);
if ( not $ARGV[0] ) {
die "You need to give at least one file name as an argument\n";
}
for my $file ( @ARGV ) {
# Remove suffix and copy file over
if ( $file !~ /\..+?$/ ) {
die qq(File "$file" doesn't have a suffix);
}
( my $output_file = $file ) =~ s/\..+?$/./; # Strip suffix (keeping the dot) for the output name
open my $input_fh, "<", $file;
open my $output_fh, ">", $output_file;
while ( my $line = <$input_fh> ) {
print {$output_fh} $line unless $line =~ /^\s*#[^!]/;
}
close $input_fh;
close $output_fh;
}
This is a bit more typing than your version of the program, but it's easier to see what's going on and maintain.

Filehandle open() and the split variable

I am a beginner in Perl.
What I do not understand is the following:
To write a script that can:
Print the lines of the file $source with a comma delimiter.
Print the formatted lines to an output file.
Allow this output file to be specified in command-line.
Code:
my ( $source, $outputSource ) = @ARGV;
open( INPUT, $source ) or die "Unable to open file $source :$!";
Question: I do not understand how, when writing the code, one can let the output file be specified on the command line.
I would rely on redirection operator in the shell instead, such as:
script.pl input.txt > output.txt
Then it is a simple case of doing this:
use strict;
use warnings;
while (<ARGV>) {
s/\n/,/;
print;
}
Then you can even merge several files with script.pl input1.txt input2.txt ... > output_all.txt. Or just do one file at a time, with one argument.
If I understood your question right I hope this example can help.
Program:
use warnings;
use strict;
## Check input and output file as arguments in command line.
die "Usage: perl $0 input-file output-file\n" unless @ARGV == 2;
my ( $source, $output_source ) = @ARGV;
## Open both files, one for reading and other for writing.
open my $input, "<", $source or
die "Unable to open file $source : $!\n";
open my $output, ">", $output_source or
die "Unable to open file $output_source : $!\n";
## Read all file line by line, substitute the end of line with a ',' and print
## to output file.
while ( my $line = <$input> ) {
$line =~ tr/\n/,/;
printf $output "%s", $line;
}
close $input;
close $output;
Execution:
$ perl script.pl infile outfile
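Note that turning every newline into a comma, as both programs above do, leaves a trailing comma after the last line. If that is not wanted, one alternative is to chomp the lines and join them. This is a sketch only: the helper name join_lines is hypothetical, and an in-memory filehandle stands in for the input file.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from $in and return them joined by commas.
sub join_lines {
    my ($in) = @_;
    chomp(my @lines = <$in>);   # read all lines, strip the line endings
    return join ',', @lines;    # commas only between items, none at the end
}

# Demonstration with an in-memory filehandle instead of a real input file.
my $text = "one\ntwo\nthree\n";
open my $in, '<', \$text or die "in-memory open failed: $!";
print join_lines($in), "\n";    # prints "one,two,three"
close $in;
```

This reads the whole file into memory, which is fine for small inputs; the line-by-line loops above scale better for very large files.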