I'm currently writing a program in Perl. I want it to take the user's input (a string in my case), read a file, find the entry in the file that matches that input, and then print it.
This is what I have so far in this particular subroutine where this bit of code is:
print "What is your Sun Sign?\n";
open ( my $sun, '<', 'sunsigns.txt' ) or die "I couldn't reach the file. Please try again.";
while (my $sun_out = <STDIN>) {
    if ($sun_out =~ /[\w$sun]/) {
        print $sun;
    }
}
When I do that, it gives me this:
I'm at my limit. I feel like I've literally tried everything that I've managed to Google. If anyone could at least point me in the right direction, please do.
Here is a common way to do this in Perl.
First, get the input string from STDIN.
Then, open the input file and loop over all lines looking for the input string. When you find the string, print the full line containing the string.
use warnings;
use strict;

print "What is your Sun Sign?\n";
my $sun = <STDIN>;
chomp $sun;
open (my $fh, '<', 'sunsigns.txt') or die "I couldn't reach the file. Please try again.";
while (<$fh>) {
    if (/$sun/) {
        print;    # prints the whole matching line, which is held in $_
    }
}
See also perlintro
People have already answered the question about the task, but I'd like to point out why you got the output that you did.
First, you used the wrong variable in your print. Since $sun is a filehandle and that's the only argument you give to print, it outputs a text representation of that object. That's why you get the GLOB(...). I typically name all my handles with an h at the end. That's usually $fh for a file, $dh for a directory handle, and so on. That way these names stand out when I use them in the wrong place.
The various ways to call print are slightly annoying, though, because print without any arguments uses the topic variable $_:
print; # outputs $_
Adding a filehandle as the first argument doesn't output $_ to that filehandle. I'm only slightly annoyed by this, but I get over it.
print $fh; # GLOB(...)
You have to add the argument yourself, but without a comma between them (for reasons I leave out):
print $fh $_;
If you have that comma, your problem returns:
print $fh, $_; # GLOB(...)...
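The difference between the two calls can be seen in a small sketch. It is not from the original answer: the helper name write_via_handle and the in-memory filehandle are assumptions made for illustration, and the block form print {$fh} LIST is just a less ambiguous way of writing print $fh LIST.

```perl
use strict;
use warnings;

# Hypothetical helper: send $text through a lexical filehandle and return
# what actually arrived. The handle is opened on an in-memory string so the
# sketch needs no real file.
sub write_via_handle {
    my ($text) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
    print {$fh} $text;    # handle first, no comma, then the list to print
    close $fh or die "Cannot close in-memory handle: $!";
    return $buffer;
}

print write_via_handle("Virgo\n");    # the text arrives, not a GLOB(...) string
```

With the braces around the handle there is nowhere to put a stray comma, which is why some people prefer this form.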
Given this file:
$ cat file
♈ Aries (Ram): March 21–April 19.
♉ Taurus (Bull): April 20–May 20.
♊ Gemini (Twins): May 21–June 21.
♋ Cancer (Crab): June 22–July 22.
♌ Leo (Lion): July 23–August 22.
♍ Virgo (Virgin): August 23–September 22.
♎ Libra (Balance): September 23–October 23.
♏ Scorpius (Scorpion): October 24–November 21.
You can do something like this:
use strict; use warnings;

print "Enter Sun Sign: ";
my $sign = <STDIN>;
chomp $sign;

open ( my $sun, '<', '/tmp/file' ) or die "I couldn't reach the file. Please try again.";
while (my $line = <$sun>) {
    chomp $line;
    print "$line\n" if ($line =~ /\b$sign\b/);    # chomp removed the newline, so add it back
}
Test it:
Enter Sun Sign: Virgo
♍ Virgo (Virgin): August 23–September 22.
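One thing the examples above leave open is what happens when the user types a regex metacharacter. A minimal sketch of the same search, assuming you want the input treated as literal text: the sub name matching_lines and the in-memory sample data are made up for illustration, while \Q...\E is the standard way to quote interpolated text inside a pattern.

```perl
use strict;
use warnings;

# Return every line read from $fh that contains $sign as a whole word.
sub matching_lines {
    my ($fh, $sign) = @_;
    my @found;
    while (my $line = <$fh>) {
        # \Q...\E quotes any regex metacharacters the user typed
        push @found, $line if $line =~ /\b\Q$sign\E\b/;
    }
    return @found;
}

# Sample data held in memory instead of sunsigns.txt (an assumption).
my $data = "Leo (Lion): July 23 to August 22.\nVirgo (Virgin): August 23 to September 22.\n";
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";
print matching_lines($fh, 'Virgo');
```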
I have two files. File1 contains list of email addresses. File2 contains list of domains.
I want to filter out all the email addresses whose domain exactly matches one of the listed domains, using a Perl script.
I am using the code below, but I don't get the correct result.
#use strict;
#use warnings;
use feature 'say';
my $file1 = "/home/user/domain_file" or die " FIle not found\n";
my $file2 = "/home/user/email_address_file" or die " FIle not found\n";
my $match = open(MATCH, ">matching_domain") || die;
open(my $data1, '<', $file1) or die "Could not open '$file1' $!\n";
my @wrd = <$data1>;
chomp @wrd;
# loop on the file to be searched
open(my $data2, '<', $file2) or die "Could not open '$file2' $!\n";
while(my $line = <$data2>) {
    chomp $line;
    foreach (@wrd) {
        if($line =~ /\@$_$/) {
            print MATCH "$line\n";
        }
    }
}
Expected output
First off, since you seem to be on *nix, you might want to check out grep -f, which can take search patterns from a given file. I'm no expert in grep, but I would try the pattern file together with "match whole words" (-w), and this should be fairly easy.
Second: your Perl code can be improved, but it works as expected, provided you put the emails and domains in the files as indicated by your code. It may be that you have mixed the files up.
If I run your code, fixing only the paths, and keeping the domains in file1, it does create the file matching_domain and it contains your expected output:
So I don't know what you think your problem is (because you did not say). Maybe you were expecting it to print output to the terminal. Either way, it does work, but there are things to fix.
#use strict;
#use warnings;
It is a huge mistake to remove these two, probably the biggest mistake you can make while coding Perl. It will not remove your errors, just hide them, and you will spend 10 times as long fixing bugs. Uncommenting them should be the first thing you do to fix this.
use feature 'say';
You never use this. You could for example replace print MATCH "$line\n" with say MATCH $line, which is slightly more concise.
my $file1 = "/home/user/domain_file" or die " FIle not found\n";
my $file2 = "/home/user/email_address_file" or die " FIle not found\n";
This is very incorrect. You are placing a condition on the creation of a variable. If the condition fails, does the variable exist? Don't do this. I assume this is to check if the file exists, but that is not what this does. To check if a file exists, you can use -e, documented under perldoc -f -X (various file tests).
Furthermore, a non-empty string such as "/home/user..." is TRUE ("truthy") as far as Perl conditions are concerned. A string is only false if it is "0" (zero) or "" (empty), and undef (undefined) is false as well. So your or clause will never be executed. E.g. "foo" or die will never die.
Lastly, this test is quite meaningless, as you will be testing this in your open statement later on anyway. If the file does not exist, the open will fail and your program will die.
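To make the last three paragraphs concrete, here is a small sketch. The helper names is_true and file_exists are made up for illustration. The first shows which strings Perl treats as true, the second uses the -e file test.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if Perl treats the value as true, otherwise 0.
sub is_true { return $_[0] ? 1 : 0 }

# Hypothetical helper: 1 if the path exists on disk, otherwise 0.
sub file_exists { return (-e $_[0]) ? 1 : 0 }

# A path string is true whether or not the file exists, so
# `my $file = "/some/path" or die ...` can never reach the die.
for my $value ('/home/user/domain_file', 'foo', '0', '') {
    printf "%-26s is %s\n", "'$value'", is_true($value) ? 'true' : 'false';
}

print "-e says the path is missing\n"
    unless file_exists('/no/such/path/expected/here');
```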
my $match = open(MATCH, ">matching_domain") || die;
This is also very incorrect. First off, you never use the $match variable. Secondly, I bet it does not contain what you think it does. (it contains a boolean which states whether open was successful or not, see perldoc -f open) Thirdly, again, don't put conditions on my declarations of variables, it is a bad idea.
What this statement really means is that $match will contain either the return value of the open, or the return value of die. This should probably be simply:
open my $match, ">", "matching_domain" or die "Cannot open 'matching_domain': $!";
Also, use the three argument open with explicit open MODE, and use lexical file handles, like you have done elsewhere.
And one more thing on top of all the stuff I've already badgered you with: I don't recommend hard coding output files for small programs like this. If you want to redirect the output, use shell redirection: perl foo.pl > output.txt. I think this is what has prompted you to think something is wrong with your code: You don't see the output.
Other than that, your code is fine, as near as I can tell. You may want to chomp the lines from the domain file, but it should not matter. Also remember that indentation is a good thing, and it helps you read your code. I mentioned this in a comment, but it was removed for some reason. It is important though.
Good luck!
This assumes that the lines labeled File1 are in the file pointed to by $file1 and the lines labeled File2 are in the file pointed to by $file2.
You have your variables swapped. You want to match what is in $line against $_, not the other way around:
# loop on the file to be searched
open( my $data2, '<', $file2 ) or die "Could not open '$file2' $!\n";
while ( my $line = <$data2> ) {
    chomp $line;
    foreach (@wrd) {
        if (/\@$line$/) {
            print MATCH "$_\n";
        }
    }
}
You should un-comment the warnings and strict lines:
use strict;
use warnings;
warnings shows you that the or die checks in the file name assignment statements are not really working the way you intended. Just use:
my $file1 = "/home/user/domain_file";
my $file2 = "/home/user/email_address_file";
You are already doing the checks where they belong (on open).
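For comparison, here is a sketch of the same filtering with the domains held in a hash, so that each address needs one lookup instead of one regex per domain. It is an alternative, not the code from the question: the sub name filter_by_domain and the sample addresses are assumptions, and it compares domains case-insensitively, which the original did not.

```perl
use strict;
use warnings;

# Return the addresses whose domain part exactly equals one of @$domains.
sub filter_by_domain {
    my ($emails, $domains) = @_;
    my %wanted = map { ( lc($_) => 1 ) } @$domains;
    return grep {
        my ($domain) = /\@([^\@\s]+)\z/;    # everything after the last @
        defined $domain && $wanted{ lc $domain };
    } @$emails;
}

# Made-up sample data standing in for the two input files.
my @domains = ('example.com', 'example.org');
my @emails  = ('a@example.com', 'b@notexample.com', 'c@example.org', 'd@example.net');

print "$_\n" for filter_by_domain(\@emails, \@domains);
```

Because the whole domain is used as a hash key, notexample.com cannot match example.com by accident, and dots in the domain names need no escaping.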
I am reading a file using the following code:
open ($myfile, "<file.txt") or die "Could not open the file";
@lines = <$myfile>;
foreach $line (@lines){
    print $line;
}
close myfile;
The contents of the file are:
Crossroads Blues
Terraplane Blues
Come on in My Kitchen
Walking Blues
Mister Jelly Roll Maker
Last Fair Deal Gone Down
32-20 Blues
Kindhearted Woman Blues
If I Had Possession Over Judgement Day
Preaching Blues
Blind Willie's Blues
When You Got a Good Friend
Rambling on My Mind
Stones in My Passway
Wild Jelly Roll Blues
Traveling Riverside Blues
Roll My Jellyroll
Milkcow's Calf Blues
Me and the Devil Blues
Hellhound on My Trail
But the output of the program is:
Hellhound on My Trailsuesdudgement Day
It looks like the code reads only one line, and replaces the first characters with the new line that is read. I have tried different files. Only one line is printed, which is basically aggregated over all the lines.
Your original file has just a carriage return (CR) at the end of each line, when it should have a linefeed (LF), or possibly both CR and LF if it originated on a Windows system and you are reading it on Linux.
Without any newlines to split up the data, @lines has only a single element, which contains the entire file contents.
Printing that text to the terminal results in all of the lines being displayed on top of one another, as you have seen.
You need to fix the creation of your file, but in the meantime you can read it correctly by changing Perl's record separator $/ like this:
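use strict;
use warnings 'all';

open my $fh, '<', 'file.txt' or die "Could not open the file: $!";

my @lines = do {
    local $/ = "\r";        # treat a bare CR as the record separator
    my @records = <$fh>;
    chomp @records;         # chomp strips the current $/, so do it while $/ is "\r"
    @records;
};

print "$_\n" for @lines;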
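To see why @lines ended up with a single element, it can help to count the records you get under each separator. A minimal sketch, assuming a few CR-terminated titles held in a string; the sub name read_records is made up for illustration.

```perl
use strict;
use warnings;

# Read $text through a filehandle using $separator as the record separator,
# and return the records with that separator removed.
sub read_records {
    my ($text, $separator) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory data: $!";
    local $/ = $separator;    # restored automatically when the sub returns
    my @records = <$fh>;
    chomp @records;           # removes the current value of $/
    return @records;
}

my $cr_only = "Crossroads Blues\rTerraplane Blues\rWalking Blues\r";

my @by_lf = read_records($cr_only, "\n");
my @by_cr = read_records($cr_only, "\r");

print "split on LF: ", scalar(@by_lf), " record(s)\n";
print "split on CR: ", scalar(@by_cr), " record(s)\n";
```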
Please check that your original script and the script you posted are the same.
You said that your example program prints only the last line. It won't. As posted, it prints every line.
Always put use warnings; and use strict; at the top of the program.
Also, storing the whole file in an array and then reading from that array is a very poor method. Use a while loop instead.
open (my $myfile, "<", "file.txt") or die "Could not open the file";
while (<$myfile>) {
    print;    # Each line is stored in the default variable $_, so print needs no argument.
}
The script below would produce the output you mentioned.
foreach (@lines)
{
    $line = $_; # this or
    @new = $_; # this
}
print $line; # last line
print @new; # last line
If you want to accumulate the data in another variable, look at concatenation for a string ($) and push or unshift for an array (@).
This is my tab delimited input file
This is how I want my output file to look like
(yes duplicate the next two columns) My output file looks like this instead
What is going on with perl? This is my code.
open (IN, $ARGV[0]);
open (OUT, ">output.txt");
while ($line = <IN>){
    chomp $line;
    @line = split /\t/, $line;
    print OUT $line[1]."\t".$line[2]."\t".$line[2]."\n";
}
close( OUT);
First of all, you should always:
use strict and use warnings, even for the most trivial programs. You will also need to declare each of your variables using my, as close as possible to their first use.
use lexical file handles and the three-parameter form of open.
check the success of every open call, and die with a string that includes $! to show the reason for the failure.
Note also that there is no need to explicitly open files named on the command line that appear in @ARGV: you can just read from them using <>.
As others have said, it looks like you are reading a file of DOS or Windows origin on a Linux system. Instead of using chomp, you can remove all trailing whitespace characters from each line using s/\s+\z//. Since CR and LF both count as "whitespace", this will remove all line terminators from each record. Beware, however, that, if trailing space is significant or if the last field may be blank, then this will also remove spaces and tabs. In that case, s/[\r\n]+\z// is more appropriate.
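The difference between the two substitutions shows up when the last field of a record is empty. A small sketch with a made-up record; the sub names are assumptions for illustration.

```perl
use strict;
use warnings;

# Remove every trailing whitespace character, including tabs and spaces.
sub strip_trailing_whitespace {
    my ($record) = @_;
    $record =~ s/\s+\z//;
    return $record;
}

# Remove only trailing CR and LF characters.
sub strip_line_terminators {
    my ($record) = @_;
    $record =~ s/[\r\n]+\z//;
    return $record;
}

my $record = "a\tb\t\r\n";    # made-up record: empty last field, DOS line ending

# A limit of -1 makes split keep trailing empty fields.
my @loose  = split /\t/, strip_trailing_whitespace($record), -1;
my @strict = split /\t/, strip_line_terminators($record),    -1;

print scalar(@loose), " field(s) after s/\\s+\\z//\n";
print scalar(@strict), " field(s) after s/[\\r\\n]+\\z//\n";
```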
This version of your program works fine.
use strict;
use warnings;

@ARGV = 'addr.txt';

open my $out, '>', 'output.txt' or die $!;
while (<>) {
    s/\s+\z//;    # remove CR, LF and any other trailing whitespace
    my @fields = split /\t/;
    print $out join("\t", @fields[1, 2, 2]), "\n";
}
close $out or die $!;
If you know beforehand the origin of your data file, and know it to be a DOS-like file that terminates records with CR LF, you can use the PerlIO crlf layer when you open the file. Like this
open my $in, '<:crlf', $ARGV[0] or die $!;
then all records will appear to end in just "\n" when they are read on a Linux system.
A general solution to this problem is to install PerlIO::eol. Then you can write
open my $in, '<:raw:eol(LF)', $ARGV[0] or die $!;
and the line ending will always be "\n" regardless of the origin of the file, and regardless of the platform where Perl is running.
Did you try to eliminate not only the "\n" but also the "\r"?
$file[2] =~ s/\r\n//g;
$file[3] =~ s/\r\n//g; # Is it the "good" one?
It could work. DOS line endings are "\r\n", not just "\n".
Another way to avoid end of line problems is to only capture the characters you're interested in:
open (IN, $ARGV[0]);
open (OUT, ">output.txt");
while (<IN>) {
    print OUT "$1\t$2\t$2\n" if /^(\w+)\t\w+\t(\w+)\s*/;
}
close( OUT);
I have the following script:
open IN, "/tmp/file";
s/(.*)=/$k{$1}++;"$1$k{$1}="/e and print while <IN>;
How can I print the output of the script to file_out instead of to standard output?
open IN, "/tmp/file";
open OUT, ">file_out.txt";
s/(.*)=/$k{$1}++;"$1$k{$1}="/e and print OUT while <IN>;
open IN, "/tmp/file"
    open              command to open a file
    IN                filehandle name
    "/tmp/file"       name of the file and specifier that it is for reading
                      if there is no modifier, it means reading
                      if there is a <, i.e. "</tmp/file", it also means reading
open OUT, ">file_out.txt"
    open              command to open a file
    OUT               filehandle name
    ">file_out.txt"   name of the file and specifier that it is for writing
                      there must be a >, i.e. ">file_out.txt", to write
s/.../.../e           your substitution (I assume you know what it does)
and                   a boolean operator that short-circuits, meaning it only does the thing afterwards if the thing beforehand is true. In this case, it will only print if the substitution actually matched something.
print OUT             print to the filehandle OUT
while <IN>            for each line from the file behind filehandle IN
Used this way, it makes extensive use of the magical default variable $_. Do a search for $_ on the perlintro site. In short:
If you don't tell a s/// substitution what string to work on, it uses $_
If you don't tell a print what to print, it prints $_
If you don't tell a while loop going through a filehandle's data where to put each line, it gets put into $_
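Here is a small sketch of those defaults at work on the same substitution. It is an illustration under assumptions: the sub name number_keys, the %seen hash (standing in for %k) and the sample lines are made up, and it loops over a list rather than a filehandle so that it runs without any files.

```perl
use strict;
use warnings;

# Append a running count to the key of each "key=value" line.
# Lines without an = are dropped, as in the one-liner.
sub number_keys {
    my @lines = @_;
    my (%seen, @out);
    for (@lines) {    # no loop variable, so each line lands in $_
        # s/// without =~ works on $_; /e runs the replacement as Perl code
        s/(.*)=/$seen{$1}++; "$1$seen{$1}="/e and push @out, $_;
    }
    return @out;
}

print number_keys("name=alice\n", "name=bob\n", "no key here\n", "port=80\n");
```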
Your program could have been rewritten:
open IN, "/tmp/file";
open OUT, ">file_out.txt";
while( defined( $line = <IN> ) )
{
    $line =~ s/(.*)=/$k{$1}++;"$1$k{$1}="/e or next;
    print OUT $line;
}
Simply add the filehandle you are printing to right after print; opening for writing is a small change from opening for reading:
#!/usr/bin/perl -w
open IN, "/tmp/file";
open OUT, '>', "/tmp/file_out";
s/(.*)=/Sk_$1_++;"$1Sk_$1_="/ and print OUT while <IN>;
(I munged the replacement a bit, so it was easier for me to test.)
From a related question asked by Bi, I've learnt how to print a matching line together with the line immediately below it. The code looks really simple:
while ($line = <FH>) {
    if ($line =~ /Pattern/) {
        print "$line";
        print scalar <FH>;
    }
}
I then searched Google for a different code that can print matching lines with the lines immediately above them. The code that would partially suit my purpose is something like this:
@array;
open(FH, "FILE");
while ( <FH> ) {
    chomp;
    $my_line = "$_";
    if ("$my_line" =~ /Pattern/) {
        foreach( @array ){
            print "$_\n";
        }
        print "$my_line\n"
    }
    push(@array, $my_line);
    if ( "$#array" > "0" ) {
        shift(@array);
    }
}
Problem is I still can't figure out how to do them together. Seems my brain is shutting down. Does anyone have any ideas?
Thanks for any help.
I think I'm sort of touched. You guys are so helpful! Perhaps a little Off-topic, but I really feel the impulse to say more.
I needed a Windows program capable of searching the contents of multiple files and of displaying the related information without having to open each file separately. I tried googling, and two apps, Agent Ransack and Devas, have proved to be useful, but they display only the lines containing the matched query and I also want to peek at the adjacent lines. Then the idea of improvising a program popped into my head. Years ago I was impressed by a Perl script that could generate a Tomeraider format of Wikipedia so that I could handily search Wiki on my Lifedrive, and I've also read somewhere on the net that Perl is easy to learn, especially for some guy like me who has no experience in any programming language.
Then I sort of started teaching myself Perl a couple of days ago. My first step was to learn how to do the same job as "Agent Ransack" does, and it proved to be not so difficult using Perl. I first learnt how to search the contents of a single file and display the matching lines by modifying an example used in the book titled "Perl by Example", but I was stuck there. I became totally clueless as to how to deal with multiple files. No similar examples were found in the book, or probably I was too impatient. And then I tried googling again and was led here, and I asked my first question, "How can I search multiple files for a string pattern in Perl?", here, and I must say this forum is bloody AWESOME ;). Then I looked at more example scripts and came up with the following code yesterday, and it serves my original purpose quite well.
The code goes like this:
chop ($query = <STDIN>);
$dir = 'f:/corpus/';
@files = <$dir/*>;
foreach $file (@files) {
    open (txt, "$file");
    while($line = <txt>) {
        if ($line =~ /$query/i) {
            print "$file \n $line";
            print scalar <txt>;
        }
    }
}
In the folder "corpus", I have a lot of text files including srt pdf doc files that contain such contents as follows:
Then I dumped the body.
J'ai mis le corps dans une décharge.
I know you have a wire.
Je sais que tu as un micro.
Now I'll tell you the truth.
Alors je vais te dire la vérité.
Basically I just need to search an English phrase and look at the French equivalent, so the script I finished yesterday is quite satisfying, except that it would be better if my script could also display the line above in case I want to search a French phrase and check the English. So I'm trying to improve the code. Actually I knew the "print scalar" line is buggy, but it is neat and does the job of printing the subsequent line (at least most of the time). I was even expecting ANOTHER SINGLE magic line that prints the previous line instead of the subsequent one :) Perl seems to be fun. I think I will spend more time trying to get a better understanding of it. And as suggested by daotoad, I'll study the code generously offered by you guys. Again, thank you guys!
It will probably be easier just to use grep for this as it allows printing of lines before and after a match. Use -B and -A to print context before and after the match respectively. See http://ss64.com/bash/grep.html
Here's a modernized version of Pax's excellent answer:
use strict;
use warnings;

open( my $fh, '<', 'qq.in')
    or die "Error opening file - $!\n";

my $this_line = "";
my $do_next = 0;

while(<$fh>) {
    my $last_line = $this_line;
    $this_line = $_;

    if ($this_line =~ /XXX/) {
        print $last_line unless $do_next;
        print $this_line;
        $do_next = 1;
    } else {
        print $this_line if $do_next;
        $last_line = "";
        $do_next = 0;
    }
}
close ($fh);
See Why is three-argument open calls with lexical filehandles a Perl best practice? for a discussion of the reasons for the most important changes.
Important changes:
3 argument open.
lexical filehandle
added strict and warnings pragmas.
variables declared with lexical scope.
Minor changes (issues of style and personal taste):
removed unneeded parens from post-fix if
converted an if-not construct into unless.
If you find this answer useful, be sure to up-vote Pax's original.
Given the following input file:
(1:first) Yes, this one.
(2) This one as well (XXX).
(3) And this one.
Not this one.
Not this one.
Not this one.
(4) Yes, this one.
(5) This one as well (XXX).
(6) AND this one as well (XXX).
(7:last) And this one.
Not this one.
this little snippet:
open(FH, "<qq.in");
$this_line = "";
$do_next = 0;
while(<FH>) {
    $last_line = $this_line;
    $this_line = $_;
    if ($this_line =~ /XXX/) {
        print $last_line if (!$do_next);
        print $this_line;
        $do_next = 1;
    } else {
        print $this_line if ($do_next);
        $last_line = "";
        $do_next = 0;
    }
}
close (FH);
produces the following, which is what I think you were after:
(1:first) Yes, this one.
(2) This one as well (XXX).
(3) And this one.
(4) Yes, this one.
(5) This one as well (XXX).
(6) AND this one as well (XXX).
(7:last) And this one.
It basically works by remembering the last line read and, when it finds the pattern, it outputs it and the pattern line. Then it continues to output pattern lines plus one more (with the $do_next variable).
There's also a little bit of trickery in there to ensure no line is printed twice.
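One case worth checking for yourself is a match that comes exactly two lines after the previous match: the line in between is both the "next" line of the first match and the "previous" line of the second. The sketch below is a variant of the same idea that forgets a line as soon as it has been printed, so it cannot be printed again. It is not Pax's code: the sub name print_with_context and the in-memory sample are assumptions made for illustration.

```perl
use strict;
use warnings;

# Copy from $in to $out every line matching $pattern, plus one line of
# context before and after, printing each line at most once.
sub print_with_context {
    my ($in, $out, $pattern) = @_;
    my $previous;          # most recent line that has not been printed yet
    my $print_next = 0;    # true while the line after a match is still owed
    while (my $line = <$in>) {
        if ($line =~ $pattern) {
            print {$out} $previous if defined $previous;
            print {$out} $line;
            $previous   = undef;    # nothing unprinted is buffered now
            $print_next = 1;
        }
        elsif ($print_next) {
            print {$out} $line;
            $previous   = undef;    # already printed, must not be repeated
            $print_next = 0;
        }
        else {
            $previous = $line;
        }
    }
    return;
}

# Made-up sample: two matches with a single line between them.
my $sample = join '', map { "$_\n" } 'before', 'one XXX', 'between', 'two XXX', 'after', 'unrelated';
open my $in, '<', \$sample or die "Cannot open in-memory data: $!";
print_with_context($in, \*STDOUT, qr/XXX/);
```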
You always want to store the last line that you saw in case the next line has your pattern and you need to print it. Using an array like you did in the second code snippet is probably overkill.
my $last = "";
while (my $line = <FH>) {
    if ($line =~ /Pattern/) {
        print $last;
        print $line;
        print scalar <FH>; # next line
    }
    $last = $line;
}
grep -A 1 -B 1 "search line"
I am going to ignore the title of your question and focus on some of the code you posted because it is positively harmful to let this code stand without explaining what is wrong with it. You say:
code that can print matching lines with the lines immediately above them. The code that would partially suit my purpose is something like this
I am going to go through that code. First, you should always include
use strict;
use warnings;
in your scripts, especially since you are just learning Perl.
@array;
This is a pointless statement. With strict, you can declare @array using:
my @array;
Prefer the three-argument form of open unless there is a specific benefit in a particular situation to not using it. Use lexical filehandles because bareword filehandles are package global and can be the source of mysterious bugs. Finally, always check if open succeeded before proceeding. So, instead of:
open(FH, "FILE");
write:
my $filename = 'something';
open my $fh, '<', $filename
or die "Cannot open '$filename': $!";
If you use autodie, you can get away with:
open my $fh, '<', 'something';
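A minimal sketch of what autodie buys you: a failed open throws an exception without any explicit or die. The sub name first_line and the deliberately missing path are assumptions made for illustration.

```perl
use strict;
use warnings;
use autodie;    # open, close and friends now die on failure by themselves

# Hypothetical helper: return the first line of a file.
sub first_line {
    my ($filename) = @_;
    open my $fh, '<', $filename;    # no "or die" needed under autodie
    my $line = <$fh>;
    close $fh;
    return $line;
}

# The path below is assumed not to exist, so open fails and autodie throws.
my $ok = eval { first_line('/no/such/directory/file.txt'); 1 };
print "open failed as expected: $@\n" unless $ok;
```

The exception message names the file and the reason, which is the same information you would otherwise have to put into the die string by hand.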
Moving on:
while ( <FH> ) {
$my_line = "$_";
First, read the FAQ (you should have done so before starting to write programs). See What's wrong with always quoting "$vars"?. Second, if you are going to assign the line that you just read to $my_line, you should do it in the while statement so you do not needlessly touch $_. Finally, you can be strict compliant without typing any more characters:
while ( my $line = <$fh> ) {
chomp $line;
Refer to the previous FAQ again.
if ("$my_line" =~ /Pattern/) {
Why interpolate $my_line once more?
foreach( @array ){
    print "$_\n";
}
Either use an explicit loop variable or turn this into:
print "$_\n" for @array;
So, you interpolate $my_line again and add the newline that was removed by chomp earlier. There is no reason to do so:
print "$my_line\n"
And now we come to the line that motivated me to dissect the code you posted in the first place:
if ( "$#array" > "0" ) {
$#array is a number. 0 is a number. > is used to check if the number on the LHS is greater than the number on the RHS. Therefore, there is no need to convert both operands to strings.
Further, $#array is the last index of @array and its meaning depends on the value of $[. I cannot figure out what this statement is supposed to be checking.
Now, your original problem statement was
print matching lines with the lines immediately above them
The natural question, of course, is how many lines "immediately above" the match you want to print.
use strict;
use warnings;

use Readonly;
Readonly::Scalar my $KEEP_BEFORE => 4;

my $filename = $ARGV[0];
my $pattern  = qr/$ARGV[1]/;

open my $input_fh, '<', $filename
    or die "Cannot open '$filename': $!";

my @before;

while ( my $line = <$input_fh> ) {
    $line = sprintf '%6d: %s', $., $line;
    print @before, $line, "\n" if $line =~ $pattern;
    push @before, $line;
    shift @before if @before > $KEEP_BEFORE;
}

close $input_fh;
Command line grep is the quickest way to accomplish this, but if your goal is to learn some Perl then you'll need to produce some code.
Rather than providing code, as others have already done, I'll talk a bit about how to write your own. I hope this helps with the brain-lock.
Read my previous answer on how to write a program, it gives some tips about how to start working on your problem.
Go through each of the sample programs you have, as well as those offered here, and add comments describing exactly what they do. Refer to the perldoc for each function and operator you don't understand. Your first example code has an error: if 2 lines in a row match, the line after the second match won't print. By error, I mean that either the code or the spec is wrong; the desired behavior in this case needs to be determined.
Write out what you want your program to do.
Start filling in the blanks with code.
Here's a sketch of a phase one write-up:
# This program reads a file and looks for lines that match a pattern.
# Open the file
# Iterate over the file
# For each line
# Check for a match
# If match print line before, line and next line.
But how do you get the next line and the previous line?
Here's where creative thinking comes in, there are many ways, all you need is one that works.
You could read in lines one at a time, but read ahead by one line.
You could read the whole file into memory and select previous and follow-on lines by indexing an array.
You could read the file and store the offset and length of each line, keeping track of which ones match as you go. Then use your offset data to extract the required lines.
You could read in lines one at a time. Cache your previous line as you go. Use readline to read the next line for printing, but use seek and tell to rewind the handle so that the 'next' line can be checked for a match.
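As one example, the first idea (read ahead by one line) could be sketched like this. It is only one possible shape: the sub name context_by_lookahead and the sample text are made up, and it follows the phase-one spec literally, so overlapping context from neighbouring matches is returned more than once.

```perl
use strict;
use warnings;

# Return, for every line matching $pattern, the line before it, the line
# itself and the line after it, using a one-line read-ahead.
sub context_by_lookahead {
    my ($fh, $pattern) = @_;
    my @out;
    my $prev;
    my $curr = <$fh>;
    while (defined $curr) {
        my $next = <$fh>;    # read ahead by one line (undef at end of file)
        if ($curr =~ $pattern) {
            push @out, grep { defined } $prev, $curr, $next;
        }
        ($prev, $curr) = ($curr, $next);
    }
    return @out;
}

# Made-up sample data held in memory instead of a real file.
my $text = "I know you have a wire.\nJe sais que tu as un micro.\nNow I'll tell you the truth.\n";
open my $fh, '<', \$text or die "Cannot open in-memory data: $!";
print context_by_lookahead($fh, qr/micro/);
```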
Any of these methods, and many more could be fleshed out into a functioning program. Depending on your goals, and constraints any one may be the best choice for that problem domain. Knowing how to select which one to use will come with experience. If you have time, try two or three different ways and see how they work out.
Good luck.
If you don't mind losing the ability to iterate over a filehandle, you could just slurp the file and iterate over the array:
use strict; # always do these
use warnings;

my $range = 1; # change this to print X lines before and after each match

open my $fh, '<', 'FILE' or die "Error: $!";
my @file = <$fh>;
close $fh;

for (0 .. $#file) {
    if($file[$_] =~ /Pattern/) {
        # keep only the indices that actually exist in @file
        my @lines = grep { $_ >= 0 && $_ <= $#file } $_ - $range .. $_ + $range;
        print @file[@lines];
    }
}
This might get horribly slow for large files, but is pretty easy to understand (in my opinion). Only when you know how it works can you set about trying to optimize it. If you have any questions about any of the functions or operations I used, just ask.