perl - string compare failing while fetching a line from a file. - perl

my code,
#!/usr/bin/perl -w
use strict;
use warnings;
my $codes=" ";
my $count=0;
my $str1="code1";
open (FILE, '/home/vpnuser/testFile.txt') or die("Could not open the file.");
while($codes=<FILE>)
{
print($codes);
if($codes eq $str1)
{
$count++;
}
}
print "$count";
the comparison always fails. my testFile.txt contains one simple line - code1
when i have written a separate perl script where i have two strings declared in the script it self rather than getting it from a file, the eq operator works fine. but when i am getting it from a file, there is a problem. Pease help,
Thanks in advance!

Don't forget to chomp your file input if you don't want it to end in a return character.
while(my $codes = <FILE>)
{
chomp $codes;
That is likely the reason why your string comparison is failing.
As on additional aside, kudus for including use strict; and use warnings; at the the top of your script, like one should always do.
I'd like to recommend that you also include use autodie; at the top as well when doing file processing. It will automatically give you a detailed error message for doing many kinds of operations, such as opening a file, so you won't have to remember to include the error code $! or the filename in your die statement.

Related

How to match and find common based on substring from two files?

I have two files. File1 contains list of email addresses. File2 contains list of domains.
I want to filter out all the email addresses after matching exact domain using Perl script.
I am using below code, but I don't get correct result.
#!/usr/bin/perl
#use strict;
#use warnings;
use feature 'say';
my $file1 = "/home/user/domain_file" or die " FIle not found\n";
my $file2 = "/home/user/email_address_file" or die " FIle not found\n";
my $match = open(MATCH, ">matching_domain") || die;
open(my $data1, '<', $file1) or die "Could not open '$file1' $!\n";
my #wrd = <$data1>;
chomp #wrd;
# loop on the fiile to be searched
open(my $data2, '<', $file2) or die "Could not open '$file2' $!\n";
while(my $line = <$data2>) {
chomp $line;
foreach (#wrd) {
if($line =~ /\#$_$/) {
print MATCH "$line\n";
}
}
}
File1
abc#1gmail.com.au
abc#gmail.com
abc#gmail.com1
abc#2outlook.com2
abc#outlook.com1
abc#yahoo.com
abc#yahooo1.com
abc#yahooo.com
File2
yahoo.com
gmail.com
Expected output
abc#gmail.com
abc#yahoo.com
First off, since you seem to be on *nix, you might want to check out grep -f, which can take search patterns from a given file. I'm no expert in grep, but I would try the file and "match whole words" and this should be fairly easy.
Second: Your Perl code can be improved, but it works as expected. If you put the emails and domains in the files as indicated by your code. It may be that you have mixed the files up.
If I run your code, fixing only the paths, and keeping the domains in file1, it does create the file matching_domain and it contains your expected output:
abc#gmail.com
abc#yahoo.com
So I don't know what you think your problem is (because you did not say). Maybe you were expecting it to print output to the terminal. Either way, it does work, but there are things to fix.
#use strict;
#use warnings;
It is a huge mistake to remove these two. Biggest mistake you will ever do while coding Perl. It will not remove your errors, just hide them. You will spend 10 times as much time bug fixing. Uncomment this as your first thing you do to fix this.
use feature 'say';
You never use this. You could for example replace print MATCH "$line\n" with say MATCH $line, which is slightly more concise.
my $file1 = "/home/user/domain_file" or die " FIle not found\n";
my $file2 = "/home/user/email_address_file" or die " FIle not found\n";
This is very incorrect. You are placing a condition on the creation of a variable. If the condition fails, does the variable exist? Don't do this. I assume this is to check if the file exists, but that is not what this does. To check if a file exists, you can use -e, documented as perldoc "-X" (various file tests).
Furthermore, a statement in the form of a string, "/home/user..." is TRUE ("truthy"), as far as Perl conditions are concerned. It is only false if it is "0" (zero), "" (empty) or undef (undefined). So your or clause will never be executed. E.g. "foo" or die will never die.
Lastly, this test is quite meaningless, as you will be testing this in your open statement later on anyway. If the file does not exist, the open will fail and your program will die.
my $match = open(MATCH, ">matching_domain") || die;
This is also very incorrect. First off, you never use the $match variable. Secondly, I bet it does not contain what you think it does. (it contains a boolean which states whether open was successful or not, see perldoc -f open) Thirdly, again, don't put conditions on my declarations of variables, it is a bad idea.
What this statement really means is that $match will contain either the return value of the open, or the return value of die. This should probably be simply:
open my $match, ">", "matching_domain" or die "Cannot open '$match': $!;
Also, use the three argument open with explicit open MODE, and use lexical file handles, like you have done elsewhere.
And one more thing on top of all the stuff I've already badgered you with: I don't recommend hard coding output files for small programs like this. If you want to redirect the output, use shell redirection: perl foo.pl > output.txt. I think this is what has prompted you to think something is wrong with your code: You don't see the output.
Other than that, your code is fine, as near as I can tell. You may want to chomp the lines from the domain file, but it should not matter. Also remember that indentation is a good thing, and it helps you read your code. I mentioned this in a comment, but it was removed for some reason. It is important though.
Good luck!
This assumes that the lines labeled File1 are in the file pointed to by $file1 and the lines labeled File2 are in the file pointed to by $file2.
You have your variables swapped. You want to match what is in $line against $_, not the other way around:
# loop on the file to be searched
open( my $data2, '<', $file2 ) or die "Could not open '$file2' $!\n";
while ( my $line = <$data2> ) {
chomp $line;
foreach (#wrd) {
if (/\#$line$/) {
print MATCH "$_\n";
}
}
}
You should un-comment the warnings and strict lines:
use strict;
use warnings;
warnings shows you that the or die checks are not really working the way you intended in the file name assignment statements. Just use :
my $file1 = "/home/user/domain_file";
my $file2 = "/home/user/email_address_file";
You are already doing the checks where they belong (on open).

perl: canot open file within a loop

I am trying to read in a bunch of similar files and process them one by one. Here is the code I have. But somehow the perl script doesn't read in the files correctly. I'm not sure how to fix it. The files are definitely readable and writable by me.
#!/usr/bin/perl
use strict;
use warnings;
my #olap_f = `ls /full_dir_to_file/*txt`;
foreach my $file (#olap_f){
my %traits_h;
open(IN,'<',$file) || die "cannot open $file";
while(<IN>){
chomp;
my #array = split /\t/;
my $trait = $array[4];
$traits_h{$trait} ++;
}
close IN;
}
When I run it, the error message (something like below) showed up:
cannot open /full_dir_to_file/a.txt
You have newlines at the end of each filename:
my #olap_f = `ls ~dir_to_file/*txt`;
chomp #olap_f; # Remove newlines
Better yet, use glob to avoid launching a new process (and having to trim newlines):
my #olap_f = glob "~dir_to_file/*txt";
Also, use $! to find out why a file couldn't be opened:
open(IN,'<',$file) || die "cannot open $file: $!";
This would have told you
cannot open /full_dir_to_file/a.txt
: No such file or directory
which might have made you recognize the unwanted newline.
I'll add a quick plug for IO::All here. It's important to know what's going on under the hood but it's convenient sometimes to be able to do:
use IO::All;
my #olap_f = io->dir('/full_dir_to_file/')->glob('*txt');
In this case it's not shorter than #cjm's use of glob but IO::All does have a few other convenient methods for working with files as well.

Perl script to parse a text file and match a string

I'm editing my question to add more details
The script executes the command and redirects the output to a text file.
The script then parses the text file to match the following string " Standard 1.1.1.1"
The output in the text file is :
Host Configuration
------------------
Profile Hostname
-------- ---------
standard 1.1.1.1
standard 1.1.1.2
The code works if i search for either 1.1.1.1 or standard . When i search for standard 1.1.1.1 together the below script fails.
this is the error that i get "Unable to find string: standard 172.25.44.241 at testtest.pl
#!/usr/bin/perl
use Net::SSH::Expect;
use strict;
use warnings;
use autodie;
open (HOSTRULES, ">hostrules.txt") || die "could not open output file";
my $hos = $ssh->exec(" I typed the command here ");
print HOSTRULES ($hos);
close(HOSTRULES);
sub find_string
{
my ($file, $string) = #_;
open my $fh, '<', $file;
while (<$fh>) {
return 1 if /\Q$string/;
}
die "Unable to find string: $string";
}
find_string('hostrules.txt', 'standard 1.1.1.1');
Perhaps write a function:
use strict;
use warnings;
use autodie;
sub find_string {
my ($file, $string) = #_;
open my $fh, '<', $file;
while (<$fh>) {
return 1 if /\Q$string/;
}
die "Unable to find string: $string";
}
find_string('output.txt', 'object-cache enabled');
Or just slurp the entire file:
use strict;
use warnings;
use autodie;
my $data = do {
open my $fh, '<', 'output.txt';
local $/;
<$fh>;
};
die "Unable to find string" if $data !~ /object-cache enabled/;
You're scanning a file for a particular string. If that string is not found in that file, you want an error thrown. Sounds like a job for grep.
use strict;
use warnings;
use features qw(say);
use autodie;
use constant {
OUTPUT_FILE => 'output.txt',
NEEDED_STRING => "object-cache enabled",
};
open my $out_fh, "<", OUTPUT_FILE;
my #output_lines = <$out_fh>;
close $out_fh;
chomp #output_lines;
grep { /#{[NEEDED_STRING]}/ } #output_lines or
die qq(ERROR! ERROR! ERROR!); #Or whatever you want
The die command will end the program and exit with a non-zero exit code. The error will be printed on STDERR.
I don't know why, but using qr(object-cache enabled), and then grep { NEEDED_STRING } didn't seem to work. Using #{[...]} allows you to interpolate constants.
Instead of constants, you might want to be able to pass in the error string and the name of the file using GetOptions.
I used the old fashion <...> file handling instead of IO::File, but that's because I'm an old fogy who learned Perl back in the 20th century before it was cool. You can use IO::File which is probably better and more modern.
ADDENDUM
Any reason for slurping the entire file in memory? - Leonardo Herrera
As long as the file is reasonably sized (say 100,000 lines or so), reading the entire file into memory shouldn't be that bad. However, you could use a loop:
use strict;
use warnings;
use features qw(say);
use autodie;
use constant {
OUTPUT_FILE => 'output.txt',
NEEDED_STRING => qr(object-cache enabled),
};
open my $out_fh, "<", OUTPUT_FILE;
my $output_string_found; # Flag to see if output string is found
while ( my $line = <$out_fh> ) {
if ( $line =~ NEEDED_STRING ){
$output_string_found = "Yup!"
last; # We found the string. No more looping.
}
}
die qq(ERROR, ERROR, ERROR) unless $output_string_found;
This will work with the constant NEEDED_STRING defined as a quoted regexp.
perl -ne '/object-cache enabled/ and $found++; END{ print "Object cache disabled\n" unless $found}' < input_file
This just reads the file a line at a time; if we find the key phrase, we increment $found. At the end, after we've read the whole file, we print the message unless we found the phrase.
If the message is insufficient, you can exit 1 unless $found instead.
I suggest this because there are two things to learn from this:
Perl provides good tools for doing basic filtering and data munging right at the command line.
Sometimes a simpler approach gets a solution out better and faster.
This absolutely isn't the perfect solution for every possible data extraction problem, but for this particular one, it's just what you need.
The -ne option flags tell Perl to set up a while loop to read all of standard input a line at a time, and to take any code following it and run it into the middle of that loop, resulting in a 'run this pattern match on each line in the file' program in a single command line.
END blocks can occur anywhere and are always run at the end of the program only, so defining it inside the while loop generated by -n is perfectly fine. When the program runs out of lines, we fall out the bottom of the while loop and run out of program, so Perl ends the program, triggering the execution of the END block to print (or not) the warning.
If the file you are searching contained a string that indicated the cache was disabled (the condition you want to catch), you could go even shorter:
perl -ne '/object-cache disabled/ and die "Object cache disabled\n"' < input_file
The program would scan the file only until it saw the indication that the cache was disabled, and would exit abnormally at that point.
First, why are you using Net::SSH::Expect? Are you executing a remote command? If not, all you need to execute a program and wait for its completion is system.
system("cmd > file.txt") or die "Couldn't execute: $!";
Second, it appears that what fails is your regular expression. You are searching for the literal expression standard 1.1.1.1 but in your sample text it appears that the wanted string contains either tabs or several spaces instead of a single space. Try changing your call to your find_string function:
find_string('hostrules.txt', 'standard\s+1.1.1.1'); # note '\s+' here

Saving Data that's Been Run Through ActivePerl

This must be a basic question, but I can't find a satisfactory answer to it. I have a script here that is meant to convert CSV formatted data to TSV. I've never used Perl before now and I need to know how to save the data that is printed after the Perl script runs it though.
Script below:
#!/usr/bin/perl
use warnings;
use strict;
my $filename = data.csv;
open FILE, $filename or die "can't open $filename: $!";
while (<FILE>) {
s/"//g;
s/,/\t/g;
s/Begin\.Time\.\.s\./Begin Time (s)/;
s/End\.Time\.\.s\./End Time (s)/;
s/Low\.Freq\.\.Hz\./Low Freq (Hz)/;
s/High\.Freq\.\.Hz\./High Freq (Hz)/;
s/Begin\.File/Begin File/;
s/File\.Offset\.\.s\./File Offset (s)/;
s/Random.Number/Random Number/;
s/Random.Percent/Random Percent/;
print;
}
All the data that's been analyzed is in the cmd prompt. How do I save this data?
edit:
thank you everyone! It worked perfectly!
From your cmd prompt:
perl yourscript.pl > C:\result.txt
Here you run the perl script and redirect the output to a file called result.txt
It's always potentially dangerous to treat all commas in a CSV file as field separators. CSV files can also include commas embedded within the data. Here's an example.
1,"Some data","Some more data"
2,"Another record","A field with, an embedded comma"
In your code, the line s/,/\t/g treats all tabs the same and the embedded comma in the final field will also be expanded to a tab. That's probably not what you want.
Here's some code that uses Text::ParseWords to do this correctly.
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords;
while (<>) {
my #line = parse_line(',', 0, $_);
$_ = join "\t", #line;
# All your s/.../.../ lines here
print;
}
If you run this, you'll see that the comma in the final field doesn't get updated.

How to open/join more than one file (depending on user input) and then use 2 files simultaneously

EDIT: Sorry for the misunderstanding, I have edited a few things, to hopefully actually request what I want.
I was wondering if there was a way to open/join two or more files to run the rest of the program on.
For example, my directory has these files:
taggedchpt1_1.txt, parsedchpt1_1.txt, taggedchpt1_2.txt, parsedchpt1_2.txt etc...
The program must call a tagged and parsed simultaneously. I want to run the program on both of chpt1_1 and chpt1_2, preferably joined together in one .txt file, unless it would be very slow to do so. For instance run what would be accomplished having two files:
taggedchpt1_1_and_chpt1_2 and parsedchpt1_1_and_chpt1_2
Can this be done through Perl? Or should I just combine the text files myself(or automate that process, making chpt1.txt which would include chpt1_1, chpt1_2, chpt1_3 etc...)
#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
print "Please type in the chapter and section NUMBERS in the form chp#_sec#:\n"; ##So the user inputs 31_3, for example
chomp (my $chapter_and_section = "chpt".<>);
print "Please type in the search word:\n";
chomp (my $search_key = <>);
open(my $tag_corpus, '<', "tagged${chapter_and_section}.txt") or die $!;
open(my $parse_corpus, '<', "parsed${chapter_and_section}.txt") or die $!;
For the rest of the program to work, I need to be able to have:
my #sentences = <$tag_corpus>; ##right now this is one file, I want to make it more
my #typeddependencies = <$parse_corpus>; ##same as above
EDIT2: Really sorry about the misunderstanding. In the program, after the steps shown, I do 2 for loops. Reading through the lines of the tagged and parsed.
What I want is to accomplish this with more files from the same directory, without having to re-input the next files. (ie. I can run taggedchpt31_1.txt and parsedchpt31_1.txt...... I want to run taggedchpt31 and parsedchpt31 - which includes ~chpt31_1, ~chpt31_2, etc...)
Ultimately, it would be best if I joined all the tagged files and all the parsed files that have a common chapter (in the end still requiring only two files I want to run) but not have to save the joined file to the directory... Now that I put it into words, I think I should just save files that include all the sections.
Sorry and Thanks for all your time! Look at FMc's breakdown of my question for more help.
You could iterate over the file names, opening and reading each one in turn. Or you could produce an iterator that knows how to read lines from sequence of files.
sub files_reader {
# Takes a list of file names and returns a closure that
# will yield lines from those files.
my #handles = map { open(my $h, '<', $_) or die $!; $h } #_;
return sub {
shift #handles while #handles and eof $handles[0];
return unless #handles;
return readline $handles[0];
}
}
my $reader = files_reader('foo.txt', 'bar.txt', 'quux.txt');
while (my $line = $reader->()) {
print $line;
}
Or you could use Perl's built-in iterator that can do the same thing:
local #ARGV = ('foo.txt', 'bar.txt', 'quux.txt');
while (my $line = <>) {
print $line;
}
Edit in response to follow-up questions:
Perhaps it would help to break your problem down into smaller sub-tasks. As I understand it, you have three steps.
Step 1 is to get some input from the user -- perhaps a directory name, or maybe a couple of file name patterns (taggedchpt and parsedchpt).
Step 2 is for the program to find all of the relevant file names. For this task, glob() or readdir()might be useful. There are many questions on StackOverflow related to such issues. You'll end up with two lists of file names, one for the tagged files and one for the parsed files.
Step 3 is to process the lines across all of the files in each of the two sets. Most of the answers you have received, including mine, will help you with this step.
No one has mentioned the #ARGV hack yet? Ok, here it is.
{
local #ARGV = ('taggedchpt1_1.txt', 'parsedchpt1_1.txt', 'taggedchpt1_2.txt',
'parsedchpt1_2.txt');
while (<ARGV>) {
s/THIS/THAT/;
print FH $_;
}
}
ARGV is a special filehandle that iterates through all the filenames in #ARGV, closing a file and opening the next one as necessary. Normally #ARGV contains the command-line arguments that you passed to perl, but you can set it to anything you want.
You're almost there... this is a bit more efficient than discrete opens on each file...
#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
print "Please type in the chapter and section NUMBERS in the for chp#_sec#:\n";
chomp (my $chapter_and_section = "chpt".<>);
print "Please type in the search word:\n";
chomp (my $search_key = <>);
open(FH, '>output.txt') or die $!; # Open an output file for writing
foreach ("tagged${chapter_and_section}.txt", "parsed${chapter_and_section}.txt") {
open FILE, "<$_" or die $!; # Read a filename (from the array)
foreach (<FILE>) {
$_ =~ s/THIS/THAT/g; # Regex replace each line in the open file (use
# whatever you like instead of "THIS" &
# "THAT"
print FH $_; # Write to the output file
}
}