So what exactly does <FILE> do? - perl

So, I've used <FILE> a large number of times. A simple example would be:
open (FILE, '<', "someFile.txt");
while (my $line = <FILE>){
print $line;
}
So, I had thought that using <FILE> would take a part of a file at a time (a line, specifically) and use it, and that when it was called again, it would go to the next line. And indeed, whenever I assigned <FILE> to a scalar, that's exactly what it did. But when I gave it a line like this one:
print <FILE>;
it printed the entire file, newlines and all. So my question is, what does the computer think when it's passed <FILE>, exactly?

The diamond operator <>, used to read from a file, is actually the built-in readline function.
From perldoc -f readline
Reads from the filehandle whose typeglob is contained in EXPR (or from *ARGV if EXPR is not provided). In scalar context, each call reads and returns the next line until end-of-file is reached, whereupon the subsequent call returns undef. In list context, reads until end-of-file is reached and returns a list of lines.
If you would like to check which context an expression is evaluated in, you can use wantarray:
sub context { return wantarray ? "LIST" : "SCALAR" }
print my $line = context(), "\n";
print my @array = context(), "\n";
print context(), "\n";
Output:
SCALAR
LIST
LIST

It depends on whether it's used in scalar context or list context.
In scalar context: my $line = <FILE> reads one line at a time.
In list context: my @lines = <FILE> reads the whole file.
When you say print <FILE>; it's list context.
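A minimal sketch of the two contexts side by side. To keep it self-contained it reads from an in-memory filehandle instead of a real file; the sample data and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample data stands in for a real file (an assumption for this sketch).
my $data = "first\nsecond\nthird\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $line = <$fh>;   # scalar context: reads one line
my @rest = <$fh>;   # list context: reads all remaining lines
close $fh;

print "scalar read: $line";
print "list read returned ", scalar(@rest), " lines\n";
```

The scalar read consumes only "first", so the list read that follows returns the two lines that are left.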

The behaviour is different depending on what context it is being evaluated in:
my $scalar = <FILE>; # Read one line from FILE into $scalar
my @array = <FILE>; # Read all lines from FILE into @array
As print takes a list argument, <FILE> is evaluated in list context and behaves in the latter way.
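If you want print to emit just one line, you can force scalar context with the built-in scalar. The sketch below uses an in-memory filehandle and invented sample data so that it runs on its own.

```perl
use strict;
use warnings;

my $data = "alpha\nbeta\ngamma\n";   # stand-in for a file's contents
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

# scalar() forces scalar context, so print receives a single line.
print scalar(<$fh>);

# Whatever was not consumed above is still available to a list-context read.
my @rest = <$fh>;
close $fh;

print "lines left after the print: ", scalar(@rest), "\n";
```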

Related

open multiline text file in perl

I have the following text file
Hello my
name is
Jeff
and I want to open it using perl. However, when using this code
use strict;
use warnings;
open(my $fh, "<", "input.txt") or die "cannot open input text!";
my $text= <$fh>;
print $text;
The output is only Hello my. How can I make it print all of the lines in my input file, not just the first one?
If you read the section on I/O operators in perldoc perlop, you will see the following:
In scalar context, evaluating a filehandle in angle brackets yields the next line from that file
And, a bit further down:
If a <FILEHANDLE> is used in a context that is looking for a list, a list comprising all input lines is returned, one line per list element.
If you assign <FILEHANDLE> to a scalar variable, you will get the next line (actually record) from the file. This is what you are doing:
my $text = <$fh>;
If you assign <FILEHANDLE> to an array, you will get all of the remaining data from the file, with each line (actually record) in a separate element of the array.
my @text = <$fh>;
If you want to get all data from a file into a scalar, there are a few approaches you can take. The naive approach is to read the data in list context and then join it using an empty string:
my $text = join '', <$fh>;
You can use $/ to change Perl's idea of a "record" (often done in a do block):
my $text = do { local $/; <$fh> };
I like the slurp method from Path::Tiny.
use Path::Tiny;
my $text = path($filename)->slurp;
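As a quick check that the join approach and the local $/ approach give the same result, here is a small sketch. It reads the question's sample text from an in-memory filehandle (an assumption made so the example needs no file on disk).

```perl
use strict;
use warnings;

my $data = "Hello my\nname is\nJeff\n";   # the question's sample text

# Approach 1: read in list context and join the lines.
open my $fh1, '<', \$data or die "Cannot open in-memory file: $!";
my $joined = join '', <$fh1>;
close $fh1;

# Approach 2: undefine $/ locally so a single read returns everything.
open my $fh2, '<', \$data or die "Cannot open in-memory file: $!";
my $slurped = do { local $/; <$fh2> };
close $fh2;

print "Both approaches agree\n" if $joined eq $slurped;
print $slurped;
```

Because local restores $/ when the do block exits, later reads elsewhere in the program go back to line-at-a-time behaviour.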
In scalar context, evaluating a filehandle in angle brackets yields the next line from that file
so you can use a while loop to read the lines one at a time (the preferred method):
while (<$fh>) # each line is stored in the default variable $_
{
print;
}
Or you can use an array, but then you are storing the entire content of the file in memory at once:
my @array = <$fh>;
Or use slurp mode to read the whole file into a scalar variable:
my $multiline = do { local $/; <$fh> }; # $/ is the input record separator

reading and printing a text file in Perl

I have simple question:
Why does the first code not print the first line of the file, while the second one does?
#! /usr/bin/perl
use warnings;
use strict;
my $protfile = "file.txt";
open (FH, $protfile);
while (<FH>) {
print (<FH>);
}
#! /usr/bin/perl
use warnings;
use strict;
my $protfile = "file.txt";
open (FH, $protfile);
while (my $file = <FH>) {
print ("$file");
}
Context.
Your first program tests for end-of-file on FH by reading the first line, then reads FH in list context as an argument to print. That translates to the whole file, as a list with one line per item. It then tests for EOF again, most likely detects it, and stops.
Your second program iterates by line, each one read in scalar context to variable $file, and prints them individually. It detects EOF by a special case in the while syntax. (see the code samples in the documentation)
So the specific reason why your program doesn't print the first line in one case is that it's lost in the argument to while. Do note that the two programs' structure is pretty different: the first only runs a single while iteration, while the second iterates once per line.
PS: nowadays, the recommended way to manage files tends towards lexical filehandles (open my $file, 'name'; print <$file>;).
Because you are consuming the first line with the <> operator in the while condition and then using <> again in the print, so the first line has already been read but is never printed. <> is the readline operator. You need to print the $_ variable, or assign the line to a named variable as you are doing in the second code. You could rewrite the body of the first loop as:
print;
And it would work, because print uses $_ if you don't give it anything.
When used in scalar context, <FH> returns the next single line from the file.
When used in list context, <FH> returns a list of all remaining lines in the file.
while (my $file = <FH>) is a scalar context, since you're assigning to a scalar. while (<FH>) is short for while(defined($_ = <FH>)), so it is also a scalar context. print (<FH>); makes it a list context, since you're using it as argument to a function that can take multiple arguments.
while (<FH>) {
print (<FH>);
}
The while part reads the first line into $_ (which is never used again). Then the print part reads the rest of the lines all at once, then prints them all out again. Then the while condition is checked again, but since there are now no lines left, <FH> returns undef and the loop quits after just one iteration.
while (my $file = <FH>) {
print ("$file");
}
does more what you probably expect: reads and then prints one line during each iteration of the loop.
By the way, print $file; does the same as print ("$file");
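The behaviour of the first program can be reproduced in a small sketch. It collects the lines into an array instead of printing them straight away, and reads from an in-memory filehandle; both choices, and the sample data, are assumptions made for illustration.

```perl
use strict;
use warnings;

my $data = "line 1\nline 2\nline 3\n";   # made-up file contents
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @collected;
my $iterations = 0;
while (<$fh>) {               # scalar context: reads "line 1" into $_
    $iterations++;
    push @collected, <$fh>;   # list context: reads every remaining line
}
close $fh;

print "iterations: $iterations\n";
print "collected: ", scalar(@collected), " lines\n";
```

The loop body runs once, and "line 1" never reaches @collected because it was consumed by the while condition.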
while (<FH>) {
print (<FH>);
}
use this instead:
while (<FH>) {
print $_;
}

Why " print readdir(DIR_HANDLE); " will pop out many files?

I use readdir(DIR) to read a file , but when I use
$file = readdir(DIR);
print $file;
print "\n";
sleep(2);
it will print a file one time;
but when I use
print readdir(DIR);
print "\n";
sleep(2);
it prints out many files.
What's wrong with it?
Thanks
readdir does not read a file. It reads the next entry from a directory.
You can check out the perldoc for it here: readdir
It printed only one file name when you assigned to $file because assigning to a scalar puts readdir in scalar context, so it reads a single entry from the directory handle and returns it.
In your second example, print evaluates readdir in list context, so readdir returns all the remaining entries and every one of them is printed. More commonly, when you want to read an entire directory, you assign the result to an array.
readdir returns the next file when evaluated in scalar context (or undef after the last one has been read).
my $file = readdir($fh);
The scalar assign operator evaluates its RHS operand in scalar context.
readdir returns the remaining files when evaluated in list context.
my @files = readdir($fh);
print evaluates its argument list in list context.
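A small sketch of both behaviours on a throwaway directory. It uses File::Temp from the core distribution, and the file names are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a temporary directory with three files (hypothetical names).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt c.txt)) {
    open my $out, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $out;
}

opendir my $dh, $dir or die "Cannot open $dir: $!";
my $one  = readdir $dh;   # scalar context: a single entry
my @rest = readdir $dh;   # list context: all remaining entries
closedir $dh;

print "scalar context returned: $one\n";
print "list context returned ", scalar(@rest), " more entries\n";
```

The entries come back in no particular order, and on most systems they include "." and ".." in addition to the three files.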

How to print variables in Perl

I have some code that looks like
my ($ids,$nIds);
while (<myFile>){
chomp;
$ids.= $_ . " ";
$nIds++;
}
This should concatenate every line in my myFile, and nIds should be my number of lines. How do I print out my $ids and $nIds?
I tried simply print $ids, but Perl complains.
my ($ids, $nIds)
is a list, right? With two elements?
print "Number of lines: $nIds\n";
print "Content: $ids\n";
How did Perl complain? print $ids should work, though you probably want a newline at the end, either explicitly with print as above or implicitly by using say or -l/$\.
If you want to interpolate a variable in a string and have something immediately after it that would look like part of the variable name but isn't, enclose the variable name in {}:
print "foo${ids}bar";
You should always include all relevant code when asking a question. In this case, the print statement that is the center of your question. The print statement is probably the most crucial piece of information. The second most crucial piece of information is the error, which you also did not include. Next time, include both of those.
print $ids should be a fairly hard statement to mess up, but it is possible. Possible reasons:
1. $ids is undefined. With use warnings, gives the warning Use of uninitialized value $ids in print.
2. $ids is out of scope. With use strict, gives the fatal error Global symbol "$ids" requires explicit package name, and otherwise the uninitialized warning from above.
3. You forgot a semi-colon at the end of the line.
4. You tried to do print $ids $nIds, in which case perl thinks that $ids is supposed to be a filehandle, and you get an error such as print to unopened filehandle.
Explanations
1: Should not happen. It might happen if you do something like this (assuming you are not using strict):
my $var;
while (<>) {
$Var .= $_;
}
print $var;
Gives the warning for undefined value, because $Var and $var are two different variables.
2: Might happen, if you do something like this:
if ($something) {
my $var = "something happened!";
}
print $var;
my declares the variable inside the current block. Outside the block, it is out of scope.
3: Simple enough, common mistake, easily fixed. Easier to spot with use warnings.
4: Also a common mistake. There are a number of ways to correctly print two variables in the same print statement:
print "$var1 $var2"; # interpolation inside a double-quoted string
print $var1 . $var2; # concatenation
print $var1, $var2; # supplying print with a list of args
Lastly, some perl magic tips for you:
use strict;
use warnings;
# open with explicit direction '<', check the return value
# to make sure open succeeded. Using a lexical filehandle.
open my $fh, '<', 'file.txt' or die $!;
# read the whole file into an array and
# chomp all the lines at once
chomp(my @file = <$fh>);
close $fh;
my $ids = join(' ', @file);
my $nIds = scalar @file;
print "Number of lines: $nIds\n";
print "Text:\n$ids\n";
Reading the whole file into an array is suitable for small files only, otherwise it uses a lot of memory. Usually, line-by-line is preferred.
Variations:
print "@file" is equivalent to $ids = join(' ', @file); print $ids;
$#file will return the last index in @file. Since arrays usually start at 0, $#file + 1 is equivalent to scalar @file.
You can also do:
my $ids;
do {
local $/;
$ids = <$fh>;
};
By temporarily "turning off" $/, the input record separator, i.e. newline, you will make <$fh> return the entire file. What <$fh> really does is read until it finds $/, then return that string. Note that this will preserve the newlines in $ids.
Line-by-line solution:
open my $fh, '<', 'file.txt' or die $!; # btw, $! contains the most recent error
my $ids;
while (<$fh>) {
chomp;
$ids .= "$_ "; # concatenate with string
}
my $nIds = $.; # $. is the current line number of the last filehandle read
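Here is the line-by-line approach as a runnable sketch that also prints the results. The sample IDs and the in-memory filehandle are assumptions made so the example does not need file.txt to exist.

```perl
use strict;
use warnings;

my $data = "id1\nid2\nid3\n";   # made-up contents standing in for file.txt
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $ids = '';
while (<$fh>) {
    chomp;
    $ids .= "$_ ";
}
my $nIds = $.;   # read $. before closing: close resets the counter
close $fh;

print "Number of lines: $nIds\n";
print "Content: $ids\n";
```

Initialising $ids to an empty string means print never sees an undefined value, even when the file is empty.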
How do I print out my $ids and $nIds?
print "$ids\n";
print "$nIds\n";
I tried simply print $ids, but Perl complains.
Complains about what? Uninitialised value? Perhaps your loop was never entered due to an error opening the file. Be sure to check if open returned an error, and make sure you are using use strict; use warnings;.
my ($ids, $nIds) is a list, right? With two elements?
It's a (very special) function call. $ids,$nIds is a list with two elements.

Why doesn't Perl file glob() work outside of a loop in scalar context?

According to the Perl documentation on file globbing, the <*> operator or glob() function, when used in a scalar context, should iterate through the list of files matching the specified pattern, returning the next file name each time it is called or undef when there are no more files.
But, the iterating process only seems to work from within a loop. If it isn't in a loop, then it seems to start over immediately before all values have been read.
From the Perl docs:
In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted.
http://perldoc.perl.org/functions/glob.html
However, in scalar context the operator returns the next value each time it's called, or undef when the list has run out.
http://perldoc.perl.org/perlop.html#I/O-Operators
Example code:
use warnings;
use strict;
my $filename;
# in scalar context, <*> should return the next file name
# each time it is called or undef when the list has run out
$filename = <*>;
print "$filename\n";
$filename = <*>; # doesn't work as documented, starts over and
print "$filename\n"; # always returns the same file name
$filename = <*>;
print "$filename\n";
print "\n";
print "$filename\n" while $filename = <*>; # works in a loop, returns next file
# each time it is called
In a directory with 3 files...file1.txt, file2.txt, and file3.txt, the above code will output:
file1.txt
file1.txt
file1.txt
file1.txt
file2.txt
file3.txt
Note: The actual perl script should be outside the test directory, or you will see the file name of the script in the output as well.
Am I doing something wrong here, or is this how it is supposed to work?
Here's a way to capture the magic of the <> glob operator's state into an object that you can manipulate in a normal sort of way: anonymous subs (and/or closures)!
sub all_files {
return sub { scalar <*> };
}
my $iter = all_files();
print $iter->(), "\n";
print $iter->(), "\n";
print $iter->(), "\n";
or perhaps:
sub dir_iterator {
my $dir = shift;
return sub { scalar glob("$dir/*") };
}
my $iter = dir_iterator("/etc");
print $iter->(), "\n";
print $iter->(), "\n";
print $iter->(), "\n";
Then again my inclination is to file this under "curiosity". Ignore this particular oddity of glob() / <> and use opendir/readdir, IO::All/readdir, or File::Glob instead :)
The following code also seems to create 2 separate instances of the iterator...
for ( 1..3 )
{
$filename = <*>;
print "$filename\n" if defined $filename;
$filename = <*>;
print "$filename\n" if defined $filename;
}
I guess I see the logic there, but it is rather counterintuitive and seems to contradict the documentation. The docs don't mention anything about having to be in a loop for the iteration to work.
Also from perlop:
A (file)glob evaluates its (embedded) argument only when it is starting a new list.
Calling glob creates a list, which is either returned whole (in list context) or retrieved one element at a time (in scalar context). But each call to glob creates a separate list.
(Scratching away at my rusty memory of Perl...) I think that multiple lexical instances of <*> are treated as independent invocations of glob, whereas in the while loop you are invoking the same "instance" (whatever that means).
Imagine, for instance, if you did this:
while (<*>) { ... }
...
while (<*>) { ... }
You certainly wouldn't expect those two invocations to interfere with each other.
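The per-call-site behaviour described in these answers can be seen in a short sketch. The helper next_file and the file names are hypothetical, and the example assumes the temporary directory path contains no whitespace or glob metacharacters.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a temporary directory with three files (hypothetical names).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(file1.txt file2.txt file3.txt)) {
    open my $out, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $out;
}

# One call site: every call to next_file() runs the same glob,
# so its iterator advances from one call to the next.
sub next_file { return scalar glob("$dir/*") }
my @same_site = map { next_file() } 1 .. 3;

# Two separate call sites: each keeps its own iterator and
# therefore starts from the first match.
my $first_a = glob("$dir/*");
my $first_b = glob("$dir/*");

print "same call site:      $_\n" for @same_site;
print "separate call sites: $first_a\n";
print "                     $first_b\n";
```

Wrapping the glob in a sub gives you a single call site that can be invoked from anywhere, which is the same idea as the closure-based iterators shown earlier.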